A Naive Bayes Classifier for Protein Function Prediction

Kohonen, Jukka; Talikota, Sarish; Corander, Jukka; Auvinen, Petri; Arjas, Elja

doi:10.3233/ISB-2009-0382

Article type: Research Article

Affiliations: Department of Mathematics and Statistics, University of Helsinki, Helsinki, FI-00014, Finland | Institute of Biotechnology, University of Helsinki, Helsinki, FI-00014, Finland

Note: Corresponding author. E-mail: [email protected]

Abstract: We present a Naive Bayes classifier tool for annotating proteins on the basis of amino acid motifs, cellular localization and protein-protein interactions. Annotations take the form of posterior probabilities within the Molecular Function hierarchy of the Gene Ontology (GO). Experiments with the data available for yeast, Saccharomyces cerevisiae, show that our prediction method can yield a relatively high level of accuracy. Several apparent challenges and possibilities for future development are also discussed. A common approach to functional characterization is to use sequence similarities at varying levels, drawing on several existing databases and local alignment/identification algorithms. Such an approach is typically quite labor-intensive when performed manually by an expert. In this context, integrating several sources of information is generally considered the only way to obtain valuable predictions with practical implications. However, some improvement in the prediction accuracy of molecular functions, and thereby also savings in computational effort, can be achieved by restricting attention to those data sources that involve a higher degree of specificity. Here we employ a Naive Bayes model to provide probabilistic predictions and to enable a computationally efficient approach to data integration.
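
To illustrate how a Naive Bayes model can combine several evidence types into posterior probabilities, the following is a minimal sketch and not the authors' tool. It assumes binary presence/absence features, Laplace smoothing and a flat set of function labels; the GO hierarchy is ignored. All feature names, labels, training examples and function names (`train_naive_bayes`, `posterior`) are invented for illustration.

```python
import math
from collections import defaultdict


def train_naive_bayes(examples, alpha=1.0):
    """Count classes and per-class feature occurrences.

    examples: list of (set_of_present_features, label) pairs.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    vocabulary = set()
    for features, label in examples:
        class_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocabulary.add(f)
    return dict(class_counts), feature_counts, sorted(vocabulary), alpha


def posterior(model, features):
    """Return P(label | features) under the conditional independence assumption."""
    class_counts, feature_counts, vocabulary, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for f in vocabulary:
            # Smoothed estimate of P(feature present | label)
            p = (feature_counts[label].get(f, 0) + alpha) / (n + 2 * alpha)
            score += math.log(p if f in features else 1.0 - p)
        log_scores[label] = score
    # Normalize in log space for numerical stability
    m = max(log_scores.values())
    unnorm = {c: math.exp(s - m) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Invented toy data: motif, localization and interaction evidence per protein
training = [
    ({"motif:kinase", "loc:cytoplasm"}, "kinase activity"),
    ({"motif:kinase", "ppi:kinase_partner"}, "kinase activity"),
    ({"motif:zinc_finger", "loc:nucleus"}, "DNA binding"),
    ({"loc:nucleus", "ppi:tf_partner"}, "DNA binding"),
]
model = train_naive_bayes(training)
post = posterior(model, {"motif:kinase", "loc:cytoplasm"})
for label, prob in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {prob:.3f}")
```

Because each feature contributes an independent factor, adding a further data source only adds terms to the sum of log-probabilities, which is one reason this kind of model keeps data integration computationally cheap.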

Keywords: Protein function prediction, Naive Bayes, data integration, Gene Ontology

Journal: In Silico Biology, vol. 9, no. 1-2, pp. 23-34, 2009

Received 9 November 2007

Accepted 24 November 2008

Published: 2009
