ISSN 1386-6338 (P)
ISSN 1434-3207 (E)
In Silico Biology is a scientific research journal for the advancement of computational models and simulations applied to complex biological phenomena. We publish peer-reviewed leading-edge biological, biomedical and biotechnological research in which computer-based (i.e., in silico) modeling and analysis tools are developed and utilized to predict and elucidate dynamics of biological systems, their design and control, and their evolution. Experimental results may also be provided to support the computational analyses.
In Silico Biology aims to advance the knowledge of the principles of organization of living systems. We strive to provide computational frameworks for understanding how observable biological properties arise from complex systems. In particular, we seek integrative formalisms to decipher the cross-talk underlying systems-level properties, which is the ultimate aim of multi-scale models.
Studies published in In Silico Biology generally use theoretical models and computational analysis to gain quantitative insights into regulatory processes and networks, cell physiology and morphology, tissue dynamics and organ systems. Special areas of interest include signal transduction and information processing, gene expression and gene regulatory networks, metabolism, proliferation, differentiation and morphogenesis, among others, and the use of multi-scale modeling to connect molecular and cellular systems to the level of organisms and populations.
In Silico Biology also publishes foundational research in which novel algorithms are developed to facilitate modeling and simulations. Such research must demonstrate application to a concrete biological problem.
In Silico Biology frequently publishes special issues on seminal topics and trends. Special issues are handled by Special Issue Editors appointed by the Editor-in-Chief. Proposals for special issues should be sent to the Editor-in-Chief.
About In Silico Biology
In silico is a counterpart to in vivo (in the living system) and in vitro (in the test tube) biological experiments, and implies the gain of insights by computer-based simulations and model analyses.
In Silico Biology (ISB) was founded in 1998 as a purely online journal. IOS Press became the publisher of the printed journal shortly after. Today, ISB is dedicated exclusively to biological systems modeling and multi-scale simulations and is published solely by IOS Press. The previous online publisher, Bioinformation Systems, maintains a website containing studies published between 1998 and 2010 for archival purposes.
We strongly support open communication and encourage researchers to share results and preliminary data with the community. Therefore, results and preliminary data made public through conference presentations, conference proceedings or posting of unrefereed manuscripts on preprint servers will not prohibit publication in ISB. However, upon publication authors are required to update the preprint to include the journal reference (including DOI) and a link to the published article on the ISB website.
Abstract: In previous work we presented and applied a method to predict the parameter profile that optimizes a biochemical system with respect to either a single metabolic response or a set of responses within physiological constraints [Vera et al., 2003a]. This optimization technique requires a prior model definition, its translation to S-system form, and the use of widely available linear programming packages. However, in dealing with these issues the interested researcher has to confront additional difficulties because of a lack of connectivity among available software packages or routines specifically designed to perform different tasks. A further difficulty is the unavailability of any automated package capable of performing such optimizations and the required preliminary analysis. This situation prompted us to develop an integrated software package able to deal with these tasks in a single program environment. In this paper we present a software package for the model definition, analysis and optimization of a biochemical system. It starts with a given model definition that is directly translated to its equivalent S-system form. Once the model quality assessment is performed (stability and sensitivity analysis), the program determines the parameter profile that yields the optimized response compatible with a predefined set of constraints. Moreover, the package finds the set of solutions obtained when more than one system response is to be optimized (multiobjective optimization).
Keywords: Integrated Matlab package, bioprocesses, optimisation, linear programming, power-law formalism
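For orientation, the S-system form of the power-law formalism mentioned in the abstract above is commonly written as follows. The symbols (rate constants \alpha_i, \beta_i, kinetic orders g_{ij}, h_{ij}, n dependent and m independent variables) are standard textbook notation assumed here, not taken from the abstract. Taking logarithms at steady state gives equations that are linear in y_j = \ln X_j, which is what makes linear programming applicable.

```latex
% S-system form: one aggregated production term and one aggregated degradation term per variable
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{n+m} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{n+m} X_j^{h_{ij}},
\qquad i = 1, \dots, n

% Steady state (dX_i/dt = 0), written in logarithmic coordinates y_j = \ln X_j
\sum_{j=1}^{n+m} \left( g_{ij} - h_{ij} \right) y_j = \ln \frac{\beta_i}{\alpha_i},
\qquad i = 1, \dots, n
```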
Abstract: We analyzed extended core promoter regions covering the [-70,+60] segment relative to the transcription start site of human promoters contained in the Eukaryotic Promoter Database. The analysis was made by using the Match program ver. 1.9 with an optimized setting and the TRANSFAC Professional database ver. 7.2. This analysis revealed that the most common transcription factor binding site in the examined collection of core promoters appears to be the initiator (characterized by GEN_INI), which is expected. The other less obvious sites found were Spz1, E2F-1, ZF5, and C/EBP. The 'cap' site was also in this most common group. Over-representation of these sites relative to the non-promoter background data ranged from 0.3167 to 32.1645. These sites were characterized by being present in more than 60% of promoter sequences. Interestingly, the TATA-box was found in only 11.63% of all examined promoters. The study is complemented by separate analyses of promoter groups having different GC content. These additional analyses revealed that the most common promoter elements found also include AP-2, CdxA, Pax-2, SRY, STAT1 and STAT5A. It was also observed that a number of promoter elements show a strong preference for either the GC-rich or the GC-poor core promoters.
Abstract: waveTM is a web tool for the prediction of transmembrane segments in α-helical membrane proteins. Prediction is performed by a dynamic programming algorithm on wavelet-denoised 'hydropathy' signals. Users submit a protein sequence and receive the results interactively. Topology prediction can also be obtained in conjunction with the algorithm OrienTM. A web server that implements the waveTM algorithm is freely available at http://bioinformatics.biol.uoa.gr/waveTM.
Abstract: We present an algorithm to detect protein sub-structural motifs from primary sequence. The input to the algorithm is a set of aligned multiple protein sequences. It uses wavelet transforms to decompose protein sequences represented numerically by different indices (such as polarity, accessible surface area or electron-ion interaction potentials of the amino acids). The numerical representation of a protein sequence has significant correlation with its biological activity, thus common motifs are expected to be observable from the wavelet spectrum. The decomposed signals are then up-sampled and similarity search techniques are used to identify similar regions across all the proteins at multiple scales. Results indicate that wavelet transform techniques are a promising approach for rapid motif detection.
Keywords: protein motif detection, wavelet analysis, conserved motifs
Abstract: Genes having similar expression profiles are considered to have common regulatory mechanisms and to be controlled by the binding of transcription factors to the regulatory elements present in their upstream regions. The detection of cis-regulatory elements can help in further understanding the co-expression of genes. This paper deals with the detection of motifs in the upstream regions of genes involved in diurnal rhythms of Arabidopsis and with the correlation of expression data with sequence information. We detected motifs in the upstream regions of genes involved in diurnal cycles and checked for their presence in circadian-regulated, dark-induced and light-induced genes of Arabidopsis. Ten motifs are reported in this study, of which five were already reported in available transcription factor databases as elements involved in light responsiveness. The significance of the ten motifs was assessed using random sets of the same size. One of the ten motifs, GGCCCA, which was found without any base variations in 62 genes, was further studied by analyzing the expression profiles of its respective genes within the set of diurnally regulated genes using the SOM clustering method. The genes clustered into two major groups, one containing glycine-rich proteins and the other containing genes belonging to the dehydrogenase and oxidoreductase family.
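The exact-match step described in the abstract above (finding a motif such as GGCCCA "without any base variations" in upstream regions) can be sketched as follows. This is only an illustration under stated assumptions: the gene names and sequences are made up, and the optional reverse-strand search is an assumption, since the abstract does not say whether both strands were scanned.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def genes_with_motif(upstream, motif="GGCCCA", both_strands=False):
    """List genes whose upstream sequence contains the motif exactly.

    upstream: dict mapping a gene name to its upstream sequence.
    both_strands: also accept the reverse complement (an assumption,
    not something stated in the abstract).
    """
    motif = motif.upper()
    patterns = {motif}
    if both_strands:
        patterns.add(reverse_complement(motif))
    hits = []
    for gene, seq in upstream.items():
        seq = seq.upper()
        if any(p in seq for p in patterns):
            hits.append(gene)
    return sorted(hits)


# Hypothetical toy data, not taken from the study.
toy_upstream = {
    "geneA": "ttaGGCCCAtt",   # motif on the given strand
    "geneB": "aaTGGGCCaa",    # motif only as reverse complement
    "geneC": "acgtacgt",      # no motif
}
print(genes_with_motif(toy_upstream))
print(genes_with_motif(toy_upstream, both_strands=True))
```

A significance check in the spirit of the abstract would repeat the same count on random gene sets of equal size and compare the results.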
Abstract: The dynamic range of metabolic models can be extended to deal with large perturbations by introducing the related concepts of "generalized" kinetic order and "canonical" sensitivities. Generalized kinetic orders are built as a well-defined nonlinear combination of the canonical sensitivity coefficients, which in turn are obtained by a least-squares regression on central composite factorial design data. In this way, the whole domain of the operating variables is mapped without the need to determine locally either the first- or the second-order model derivatives. The method was validated through numerical simulations, its predictions being compared with those from a Michaelis-Menten formalism taken as reference. In parallel, two variants of the Power-law formalism (S-system, least-squares GMA) were also tested. The canonical sensitivities method produced the widest range over which metabolite concentrations and metabolic fluxes at the steady states could be predicted. In addition, the variation patterns of the logarithmic gains and of the characteristic eigenvalues were accurately determined from a single overall model, both being required for realistic analysis in metabolic engineering. The information obtained can also be expressed in terms of the typical coefficients derived from Metabolic Control Analysis (MCA). Even where current first-order Power-law or MCA formalisms are used, the canonical sensitivities approach provides a significant advantage, since complete sets of homologous, accurate, locally valid metabolic coefficients can be simultaneously recovered from the proposed array, being representative of the whole range of the operating variables instead of a single nominal condition, as is usual.
Abstract: Multifactor Dimensionality Reduction (MDR) is a method for the classification and prediction of discrete clinical endpoints using attributes constructed from multilocus genotype data. Empirical studies with both real and simulated data suggest that MDR has good power for detecting gene-gene interactions in the absence of independent main effects. The purpose of this study is to develop an objective, theory-driven approach to evaluate the strengths and limitations of MDR. To accomplish this goal, we borrow concepts from ideal observer analysis used in visual perception to evaluate the theoretical limits of classifying and predicting discrete clinical endpoints using multilocus genotype data. We conclude that MDR ideally discriminates between low risk and high risk subjects using attributes constructed from multilocus genotype data. We also show that the classification approach used once a multilocus attribute is constructed is similar to that of a naïve Bayes classifier. This study provides a theoretical foundation for the continued development, evaluation, and application of MDR as a data mining tool in the domain of statistical genetics and genetic epidemiology.
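The attribute construction mentioned in the abstract above can be illustrated with a minimal sketch of the commonly described MDR pooling step: each multilocus genotype cell is labelled high or low risk from its case-to-control ratio. The function names, the threshold of 1.0, the handling of cells without controls, and the toy data are all assumptions made for illustration, not details from the study.

```python
from collections import Counter


def build_mdr_attribute(genotypes, labels, threshold=1.0):
    """Map each multilocus genotype to 'high' or 'low' risk.

    genotypes: list of tuples, one genotype per locus for each subject.
    labels: 1 for a case, 0 for a control.
    threshold: case/control ratio above which a cell is called high risk
    (1.0 is an assumed default suited to balanced data).
    """
    cases, controls = Counter(), Counter()
    for genotype, label in zip(genotypes, labels):
        if label == 1:
            cases[genotype] += 1
        else:
            controls[genotype] += 1
    risk = {}
    for genotype in set(cases) | set(controls):
        n_case, n_ctrl = cases[genotype], controls[genotype]
        # Assumption: a cell that contains cases but no controls is high risk.
        ratio = float("inf") if n_ctrl == 0 else n_case / n_ctrl
        risk[genotype] = "high" if ratio > threshold else "low"
    return risk


def classify(risk, genotype, default="low"):
    """Predict the risk group of a subject; unseen cells get the default."""
    return risk.get(genotype, default)


# Hypothetical two-locus toy data with an interaction-like pattern.
toy_genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0),
                 (1, 0), (1, 1), (1, 1), (0, 1), (1, 1)]
toy_labels = [0, 0, 1, 1, 1, 1, 0, 0, 0, 1]
toy_risk = build_mdr_attribute(toy_genotypes, toy_labels)
print(sorted(toy_risk.items()))
```

The resulting one-dimensional high/low attribute is what a downstream classifier would then use in place of the original multilocus genotypes.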
Abstract: Alternative splicing can yield many different mature mRNAs from one precursor. New findings indicate that alternative splicing occurs much more often than previously assumed. A major goal of functional genomics lies in elucidating and characterizing the entire spectrum of alternative splice forms. Existing approaches such as EST alignments focus only on the mRNA sequence to detect alternative splice forms. They do not consider the function and characteristics of the resulting proteins. One important example of such functional characterization is homology to a known protein domain family. Profile hidden Markov models (HMMs), as stored in the Pfam database, provide a powerful description of protein domains. In this paper we address the problem of identifying the splice form with the highest similarity to a protein domain family. Therefore, we take into consideration all possible splice forms. As demonstrated here for a number of genes, this homology-based approach can be used successfully for predicting partial gene structures. Furthermore, we present some novel splice form predictions with high-scoring protein domain homology and point out that the detection of splice-form-specific protein domains helps to answer questions concerning hereditary diseases. Simple approaches based on a BLASTP search cannot be applied here, since the number of possible splice forms increases exponentially with the number of exons. To this end, we have developed an efficient polynomial-time algorithm, called ASFPred (Alternative Splice Form Prediction). This algorithm needs only a set of exons as input.
Keywords: alternative splicing, novel splice forms, pfam, protein domain, viterbi algorithm, profile HMM, gene prediction
Abstract: Discerning significant relationships in small data sets remains challenging. We introduce here the Hamming distance matrix and show that it is a quantitative classifier of similarities among short time-series. Its elements are derived by computing a modified form of the Hamming distance of pairs of symbol sequences obtained from the original data sets. The values from the Hamming distance matrix are then amenable to statistical analysis. Examples from stem cell research are presented to illustrate different aspects of the method. The approach is likely to have applications in many fields.
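The general idea in the abstract above can be sketched as follows. Two points are assumptions, because the abstract does not specify them: the symbolization rule (each step is encoded as up, down or same) and the use of the plain Hamming distance rather than the authors' modified form. The toy series are invented.

```python
from itertools import combinations


def symbolize(series):
    """Encode each step of a series as 'U' (up), 'D' (down) or 'S' (same).

    This encoding is an assumed example; the abstract does not specify
    how the symbol sequences are obtained.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        if curr > prev:
            symbols.append("U")
        elif curr < prev:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def hamming(a, b):
    """Plain Hamming distance: number of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_matrix(series_list):
    """Symmetric matrix of pairwise distances between symbolized series."""
    symbols = [symbolize(s) for s in series_list]
    n = len(symbols)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = hamming(symbols[i], symbols[j])
    return matrix


# Hypothetical short time-series of equal length.
toy_series = [
    [1.0, 2.0, 3.0, 2.5, 2.0],
    [0.5, 0.9, 1.4, 1.2, 1.3],
    [3.0, 2.0, 1.0, 1.5, 2.5],
]
for row in hamming_matrix(toy_series):
    print(row)
```

Small entries indicate series that rise and fall together; the matrix values can then be passed to whatever statistical analysis is appropriate.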
Abstract: With the complete sequencing of multiple genomes, there have been extensions in the methods of sequence analysis from single gene/protein-based to analyzing multiple genes and proteins simultaneously. Therefore, there is a demand for user-friendly software tools that will allow mining of these enormous datasets. PPD is a WWW-based database for comparative analysis of protein lengths in completely sequenced prokaryotic and eukaryotic genomes. PPD's core objective is to create protein classification tables based on the lengths of proteins by specifying a set of organisms and parameters. The interface can also generate information on changes in proteins of specific length distributions. This feature is of importance when the user's interest is focused on some evolutionarily related organisms or on organisms with similar or related tissue specificity or life-style. PPD is available at: PPD Home.
Abstract: Generally, there is a trade-off between methods of gene expression analysis that are precise but labor-intensive, e.g. RT-PCR, and methods that scale up to global coverage but are not quite as quantitative, e.g. microarrays. In the present paper, we show how a known method of gene expression profiling (K. Kato, Nucleic Acids Res. 23, 3685–3690 (1995)), which relies on a fairly small number of steps, can be turned into a global gene expression measurement by advanced data post-processing, with potentially little loss of accuracy. Post-processing here entails solving an ancillary combinatorial optimization problem. Validation is performed on in silico experiments generated from the FANTOM database of full-length mouse cDNA. We present two variants of the method. One uses state-of-the-art commercial software for solving problems of this kind, the other a code developed by us specifically for this purpose, released in the public domain under the GPL license.
Keywords: global gene expression, combinatorial optimization