In Silico Biology - Volume 2, issue 3 - Journals

Computer System Gene Discovery for Promoter Structure Analysis

Authors: Vityaev, Eugenii E. | Orlov, Yury L. | Vishnevsky, Oleg V. | Pozdnyakov, Mikhail A. | Kolchanov, Nikolay A.

Article Type: Short Communication

Abstract: This paper presents an implementation of Data Mining and Knowledge Discovery techniques for searching for regularities in tables of context features of DNA sequences involved in the regulation of transcription. The goal is to discover regularities that relate nucleotide sequences to the functional classes of these sequences. The search patterns for regularities have been constructed in first-order logic augmented by probabilistic estimates. To this end, the PC software system Gene Discovery has been designed. This system accepts molecular genetic data retrieved from a database by using SQL queries. Nucleotide sequences of promoters of several functional systems were extracted from the TRRD database (http://wwwmgs.bionet.nsc.ru/mgs/gnw/trrd/) and analysed. The data include nucleotide sequences of erythroid-specific gene promoters, endocrine system gene promoters, promoter regions of the genes controlling the cell cycle, promoters of genes regulating lipid metabolism, and muscle-specific gene promoters. Several regularities that relate the nucleotide sequences in the regulatory DNA, and their location relative to the transcription start, to each functional class have been found.

Keywords: Machine learning, knowledge discovery, data mining, bioinformatics, eukaryotic promoter recognition, transcription factors binding sites, oligonucleotide patterns

Citation: In Silico Biology, vol. 2, no. 3, pp. 257-262, 2002

Mining Putative Regulatory Elements in Promoter Regions of Saccharomyces Cerevisiae

Authors: Horng, Jorng-Tzong | Huang, Hsien-Da | Huang, Shir-Ly | Yang, Ueng-Cheng | Chang, Yu-Chang

Article Type: Research Article

Abstract: The availability of genome-wide gene expression data provides a unique set of genes from which we can decipher the mechanisms underlying the common transcriptional response. Transcription factors, which can bind to specific DNA sites, cooperatively regulate the transcription of genes. This study attempts to mine putative binding sites to investigate how combinations of the sites predicted from known sites and over-represented repetitive elements are distributed in the promoter regions of groups of functionally related genes. The over-represented repetitive elements appearing in the associations are possible transcription factor binding sites. The deduced association rules would facilitate the prediction of putative regulatory elements and the identification of genes that are potentially co-regulated by those elements. Our proposed approach is applied to Saccharomyces cerevisiae and the promoter regions of yeast ORFs.

Keywords: regulatory elements, repetitive oligonucleotide, data mining, promoter

Citation: In Silico Biology, vol. 2, no. 3, pp. 263-273, 2002
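
The abstract above refers to association rules over binding sites in promoter regions. As a minimal sketch of the general idea only (not the authors' method), the Python snippet below computes the standard support and confidence of one rule over a toy mapping of genes to the sites found in their promoters. The function name `rule_stats`, the gene names, and the site labels are all hypothetical.

```python
def rule_stats(promoters, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    promoters: dict mapping a gene name to the set of site labels
    found in its promoter region (toy data, assumed for illustration).
    """
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_total = len(promoters)
    # Promoters containing every site of the antecedent.
    n_ante = sum(1 for sites in promoters.values() if antecedent <= sites)
    # Promoters containing every site of antecedent and consequent.
    n_both = sum(1 for sites in promoters.values() if both <= sites)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical example data: four genes and the sites in their promoters.
promoters = {
    "gene1": {"siteA", "siteB"},
    "gene2": {"siteA", "siteB", "siteC"},
    "gene3": {"siteA"},
    "gene4": {"siteC"},
}
support, confidence = rule_stats(promoters, {"siteA"}, {"siteB"})
```

Here support is the fraction of all promoters that contain both sites, and confidence is the fraction of promoters with `siteA` that also contain `siteB`.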

Protein Similarity Search under mRNA Structural Constraints: Application to Targeted Selenocysteine Insertion

Authors: Backofen, Rolf | Narayanaswamy, N.S. | Swidan, Firas

Article Type: Research Article

Abstract: Selenocysteine is the 21st amino acid, which occurs in all kingdoms of life. Selenocysteine is encoded by the STOP codon UGA. For its insertion, it requires a specific mRNA sequence downstream of the UGA codon that forms a hairpin-like structure (called the Sec insertion sequence, SECIS). We consider the computational problem of generating new amino acid sequences containing selenocysteine. This requires finding an mRNA sequence that is similar to the SECIS consensus, is able to form the secondary structure required for selenocysteine insertion, and whose translation is maximally similar to the original amino acid sequence. We show that the problem can be solved in linear time when considering the hairpin-like SECIS structure (and, more generally, when considering a structure that does not contain pseudoknots).

Keywords: selenocysteine, SECIS, protein engineering

Citation: In Silico Biology, vol. 2, no. 3, pp. 275-290, 2002

An Overview on Predicting the Subcellular Location of a Protein

Authors: Feng, Zhi-Peng

Article Type: Research Article

Abstract: The present paper gives an overview of the problem of predicting the subcellular location of a protein. Five measures for extracting information from the global sequence, based on the Bayes discriminant algorithm, are reviewed: 1) the auto-correlation functions of amino acid indices along the sequence; 2) the quasi-sequence-order approach; 3) the pseudo-amino acid composition; 4) the unified attribute vector in Hilbert space; 5) Zp parameters extracted from the Zp curve. The actual predictive accuracy is closely related to the degree of similarity between the training and testing sets, or to the average degree of pairwise similarity in the dataset in a cross-validated study. Many researchers hold that the currently high predictive accuracy still cannot guarantee that the available algorithms are effective in practical prediction, because of the high pairwise sequence identity within the datasets; others argue that datasets used for developing software should reflect the natural reality that some subcellular locations contain only a small number of proteins, some of which share a high percentage of sequence similarity. Owing to the complexity of the problem itself, some very sophisticated and specialized programs are needed both for constructing datasets and for improving the prediction. In any case, finding the targeting information in the mature protein sequence and properly combining it with sorting signals in prediction may further improve the overall predictive accuracy and make the prediction practical.

Keywords: subcellular location, N-terminal targeting sequences, sorting signals, targeting information, amino acid composition, quasi-sequence-order-effect, pseudo-amino acid composition, auto-correlation functions, unified attribute vector, Zp curve, Zp parameters, Bayes discriminant algorithm, component-coupled algorithm, k-nearest neighbor method, hidden Markov model, neural networks, Support Vector Machine (SVM), jackknife test, hydrophobicity, pairwise sequence similarity

Citation: In Silico Biology, vol. 2, no. 3, pp. 291-303, 2002

Molecular Dynamics Simulations on the Free and Complexed N-Terminal SH2 Domain of SHP-2

Authors: Wieligmann, Karin | De Castro, Luis Felipe Pineda | Zacharias, Martin

Article Type: Other

Abstract: Signal transduction events are often mediated by small protein domains such as SH2 (Src homology 2) domains that recognize phosphotyrosines (pY) and flanking sequences. In the case of the SHP-2 receptor tyrosine phosphatase, an N-terminal SH2 domain binds and inactivates the phosphatase (PTP) domain. The pY-peptide-binding site on the N-terminal SH2 domain does not overlap with the PTP binding region. Nevertheless, pY-peptide binding causes domain dissociation and phosphatase activation. Comparative multi-nanosecond molecular dynamics simulations on the N-SH2 domain in ligand-bound and free states have been performed to study the allosteric mechanism that leads to domain dissociation upon pY-peptide binding. Significant ligand-dependent differences in the conformational flexibility of regions that are involved in SH2-PTP domain association have been observed. The results support a mechanism of signal transduction where SH2-peptide binding modulates the domain flexibility and reduces its capacity to fit into the entrance of the PTP catalytic domain of SHP-2.

Keywords: allosteric conformational change, signal transduction, ligand-receptor binding, molecular dynamics, SH2 domains, SHP-2 phosphatase, conformational flexibility

Citation: In Silico Biology, vol. 2, no. 3, pp. 305-311, 2002

ProML - The Protein Markup Language for Specification of Protein Sequences, Structures and Families

Authors: Hanisch, Daniel | Zimmer, Ralf | Lengauer, Thomas

Article Type: Research Article

Abstract: We propose a specification language ProML for protein sequences, structures, and families based on the open XML standard. The language allows for portable, system-independent, machine-parsable and human-readable representation of essential features of proteins. The language is of immediate use for several bioinformatics applications: we discuss clustering of proteins into families and the representation of the specific shared features of the respective clusters. Moreover, we use ProML for specification of data used in fold recognition benchmarks exploiting experimentally derived distance constraints.

Keywords: Protein Markup Language, ProML, XML, protein properties, protein families, protein structures, distance constraints, protein clusters

Citation: In Silico Biology, vol. 2, no. 3, pp. 313-324, 2002

Improving Fold Recognition of Protein Threading by Experimental Distance Constraints

Authors: Albrecht, Mario | Hanisch, Daniel | Zimmer, Ralf | Lengauer, Thomas

Article Type: Research Article

Abstract: We present a comprehensive analysis of methods for improving the fold recognition rate of the threading approach to protein structure prediction by the utilization of a few additional distance constraints. The distance constraints between protein residues may be obtained by experiments such as mass spectrometry or NMR spectroscopy. We applied a post-filtering step with new scoring functions incorporating measures of constraint satisfaction to ranking lists of 123D threading alignments. The detailed analysis of the results on a small representative benchmark set shows that the fold recognition rate can be improved significantly by up to 30% from about 54%-65% to 77%-84%, approaching the maximal attainable performance of 90% estimated by structural superposition alignments. This gain in performance adds about 10% to the recognition rate already achieved in our previous study with cross-link constraints only. Additional recent results on a larger benchmark set involving a confidence function for threading predictions also indicate notable improvements by our combined approach, which should be particularly valuable for rapid structure determination and validation of protein models.

Keywords: protein threading, fold recognition, structure prediction, experimental data, distance constraints, cross-linking reagents, mass spectrometry, NOE restraints, NMR

Citation: In Silico Biology, vol. 2, no. 3, pp. 325-337, 2002

A Hypergraph-Based Method for Unification of Existing Protein Structure- and Sequence-Families

Authors: Freudenberg, Jan | Zimmer, Ralf | Hanisch, Daniel | Lengauer, Thomas

Article Type: Research Article

Abstract: Classification of proteins is a major challenge in bioinformatics. Here an approach is presented that unifies different existing classifications of protein structures and sequences. Protein structural domains are represented as nodes in a hypergraph. Shared memberships in sequence families result in hyperedges in the graph. The presented method partitions the hypergraph into clusters of structural domains. Each computed cluster is based on a set of shared sequence family memberships. Thus, the clusters put existing protein sequence families into the context of structural family hierarchies. Conversely, structural domains are related to their sequence family memberships, which can be used to gain further knowledge about the respective structural families.

Keywords: sequence analysis, structure analysis, domain boundary delineation, protein databases, protein homology, protein structure prediction, threading, template selection, optimization, protein clustering

Citation: In Silico Biology, vol. 2, no. 3, pp. 339-349, 2002
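
The hypergraph representation described in the abstract above can be illustrated with a minimal Python sketch. It treats each structural domain as a node and each sequence family as a hyperedge holding all member domains. The grouping step is a deliberately simple stand-in (domains with identical family-membership sets form one cluster) and is an assumption for illustration, not the partitioning method of the paper. All names (`build_hyperedges`, `cluster_by_shared_families`, `d1`, `famX`, and so on) are hypothetical.

```python
from collections import defaultdict


def build_hyperedges(memberships):
    """Map each sequence family to the set of domains that belong to it.

    Each family acts as one hyperedge over its member domains.
    """
    hyperedges = defaultdict(set)
    for domain, families in memberships.items():
        for family in families:
            hyperedges[family].add(domain)
    return dict(hyperedges)


def cluster_by_shared_families(memberships):
    """Group domains whose family-membership sets are identical.

    A simplistic stand-in for hypergraph partitioning (assumption).
    """
    clusters = defaultdict(set)
    for domain, families in memberships.items():
        clusters[frozenset(families)].add(domain)
    return dict(clusters)


# Hypothetical toy data: structural domains and their sequence families.
memberships = {
    "d1": {"famX"},
    "d2": {"famX"},
    "d3": {"famX", "famY"},
    "d4": {"famY"},
}
hyperedges = build_hyperedges(memberships)
clusters = cluster_by_shared_families(memberships)
```

In this toy data, the hyperedge for `famX` connects three domains at once, which an ordinary graph edge could not express directly; that is the reason a hypergraph fits shared family memberships.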

Comparing Bound and Unbound Protein Structures Using Energy Calculation and Rotamer Statistics

Authors: Koch, Kerstin | Zöllner, Frank | Neumann, Steffen | Kummert, Franz | Sagerer, Gerhard

Article Type: Research Article

Abstract: Protein data in the PDB covers only a snapshot of a protein structure. For flexible docking, conformational changes need to be considered. Rotamer statistics provide the likelihood for side chain conformations, and further comparison of bound and unbound states yields differences in preferred positions. Furthermore, we do a full sampling of selected angles and apply the AMBER force field. Conformation of energy minima complies with the rotamer statistics. Both types of information target the reduction of search space for enumerative docking algorithms and provide parameters for elastic docking.

Keywords: Rotamer library, flexible protein-protein docking, energy calculations, AMBER force field, side chain flexibility, flexibility measure

Citation: In Silico Biology, vol. 2, no. 3, pp. 351-368, 2002

Prediction and Uncertainty in the Analysis of Gene Expression Profiles

Article Type: Research Article

Abstract: We have developed a complete statistical model for the analysis of tumor specific gene expression profiles. The approach provides investigators with a global overview on large scale gene expression data, indicating aspects of the data that relate to tumor phenotype, but also summarizing the uncertainties inherent in classification of tumor types. We demonstrate the use of this method in the context of a gene expression profiling study of 27 human breast cancers. The study is aimed at defining molecular characteristics of tumors that reflect estrogen receptor status. In addition to good predictive performance with respect to pure classification of the expression profiles, the model also uncovers conflicts in the data with respect to the classification of some of the tumors, highlighting them as critical cases for which additional investigations are appropriate.

Keywords: Computational diagnostics, gene expression analysis, expression profiles, micro array, gene chip, breast cancer, estrogen receptor status, Bayesian statistics, Bayesian regularization, binary regression, probit model, G-prior, singular value decomposition, predictive diagnosis, prognosis, tumor classification, uncertainty, factor regression, ridge regression, machine learning

Citation: In Silico Biology, vol. 2, no. 3, pp. 369-381, 2002
