Affiliations: Fraunhofer Institute for Algorithms and Scientific
Computing (SCAI), Schloss Birlinghoven, D-53754 Sankt Augustin, Germany | Institut für Informatik, LMU München,
Theresienstraße 39, D-80333 München, Germany | Max-Planck-Institut für Informatik,
Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany
Abstract: We propose ProML, a specification language for protein sequences,
structures, and families that is based on the open XML standard. The language
provides a portable, system-independent, machine-parsable, and human-readable
representation of essential features of proteins. The language is of immediate
use for several bioinformatics applications: we discuss clustering of proteins
into families and the representation of the specific shared features of the
respective clusters. Moreover, we use ProML to specify the data used in
fold recognition benchmarks that exploit experimentally derived distance
constraints.
Keywords: Protein Markup Language, ProML, XML, protein properties, protein families, protein structures, distance constraints, protein clusters