Affiliations: Institut für Humangenetik, Universitätsklinikum Bonn, Bonn, Germany, E-mail: [email protected] |
Institut für Informatik, LMU, München, Germany, E-mail: [email protected] |
FhG-SCAI, Schloss Birlinghoven, St. Augustin, Germany, E-mail: [email protected] |
MPI für Informatik, Saarbrücken, Germany, E-mail: [email protected]
Abstract: Classification of proteins is a major challenge in bioinformatics.
Here, an approach is presented that unifies different existing classifications
of protein structures and sequences. Protein structural domains are
represented as nodes in a hypergraph. Shared memberships in sequence families
result in hyperedges in the graph. The presented method partitions the
hypergraph into clusters of structural domains. Each computed cluster is based
on a set of shared sequence family memberships. Thus, the clusters put existing
protein sequence families into the context of structural family hierarchies.
Conversely, structural domains are related to their sequence family
memberships, which can be used to gain further knowledge about the respective
structural families.
Keywords: sequence analysis, structure analysis, domain boundary delineation, protein databases, protein homology, protein structure prediction, threading, template selection, optimization, protein clustering
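The hypergraph representation described in the abstract can be illustrated with a minimal Python sketch. It is not the partitioning method of the paper: as a simplifying assumption, domains are grouped only when their sets of sequence family memberships are identical, which is the simplest partition in which every cluster is defined by a set of shared memberships. The domain and family identifiers (`dom_a`, `fam_1`, ...) and both function names are hypothetical.

```python
from collections import defaultdict


def build_hyperedges(memberships):
    """Invert domain -> families into family -> domains.

    Each sequence family becomes one hyperedge connecting all
    structural domains (nodes) that are members of that family.
    """
    hyperedges = defaultdict(set)
    for domain, families in memberships.items():
        for family in families:
            hyperedges[family].add(domain)
    return dict(hyperedges)


def partition_by_shared_families(memberships):
    """Group domains whose family membership sets are identical.

    Assumption for illustration only: the actual method may merge
    domains with overlapping, not just identical, memberships.
    """
    clusters = defaultdict(set)
    for domain, families in memberships.items():
        clusters[frozenset(families)].add(domain)
    return dict(clusters)


# Hypothetical toy data: structural domain -> sequence families.
memberships = {
    "dom_a": {"fam_1", "fam_2"},
    "dom_b": {"fam_1", "fam_2"},
    "dom_c": {"fam_2"},
    "dom_d": {"fam_3"},
}

hyperedges = build_hyperedges(memberships)
clusters = partition_by_shared_families(memberships)

for shared_families, domains in clusters.items():
    print(sorted(shared_families), "->", sorted(domains))
```

Keying each cluster by its shared family set makes both directions of the mapping explicit: a family set points to the structural domains it covers, and each domain points back to the families that define its cluster.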