Affiliations: National Centre for Biological Sciences, UAS-GKVK
Campus, Bellary Road, Bangalore 560 065, India | Department of Computer Science, Indian Institute of
Technology, Kanpur, India | Present address: University of California, San Diego,
USA
Abstract: We present an algorithm to detect remote homology that arises
through circular permutation and discontinuous domains. It is also helpful in
detecting small domain proteins that are characterized by few conserved
residues. The input to the algorithm is a set of multiply aligned protein
sequence profiles. The method, implemented as FASSM, examines the sequence
conservation and positions of protein family signatures or motifs to
annotate protein sequences and to facilitate the analysis of their
domains. The overall coverage of FASSM is 93% when compared with other validation
tools such as HMM and IMPALA. The method is especially useful for difficult
relationships, such as discontinuous domains, during whole-genome surveys and is
shown to make accurate family associations at sequence identities as
low as 15%.
Availability: Available upon request from the authors.
Keywords: Function annotation, genome databases, protein subfamily, superfamily, function prediction