ISSN 1386-6338 (P)
ISSN 1434-3207 (E)
In Silico Biology is a scientific research journal for the advancement of computational models and simulations applied to complex biological phenomena. We publish peer-reviewed leading-edge biological, biomedical and biotechnological research in which computer-based (i.e., "in silico") modeling and analysis tools are developed and utilized to predict and elucidate dynamics of biological systems, their design and control, and their evolution. Experimental support may also be provided to support the computational analyses.
In Silico Biology aims to advance the knowledge of the principles of organization of living systems. We strive to provide computational frameworks for understanding how observable biological properties arise from complex systems. In particular, we seek integrative formalisms to decipher the cross-talk underlying systems-level properties, which is the ultimate aim of multi-scale models.
Studies published in In Silico Biology generally use theoretical models and computational analysis to gain quantitative insights into regulatory processes and networks, cell physiology and morphology, tissue dynamics and organ systems. Special areas of interest include signal transduction and information processing, gene expression and gene regulatory networks, metabolism, proliferation, differentiation and morphogenesis, among others, and the use of multi-scale modeling to connect molecular and cellular systems to the level of organisms and populations.
In Silico Biology also publishes foundational research in which novel algorithms are developed to facilitate modeling and simulations. Such research must demonstrate application to a concrete biological problem.
In Silico Biology frequently publishes special issues on seminal topics and trends. Special issues are handled by Special Issue Editors appointed by the Editor-in-Chief. Proposals for special issues should be sent to the Editor-in-Chief.
About In Silico Biology
The term "in silico" is a pendant to "in vivo" (in the living system) and "in vitro" (in the test tube) biological experiments, and implies the gain of insights by computer-based simulations and model analyses.
In Silico Biology (ISB) was founded in 1998 as a purely online journal. IOS Press became the publisher of the printed journal shortly after. Today, ISB is dedicated exclusively to biological systems modeling and multi-scale simulations and is published solely by IOS Press. The previous online publisher, Bioinformation Systems, maintains a website containing studies published between 1998 and 2010 for archival purposes.
We strongly support open communications and encourage researchers to share results and preliminary data with the community. Therefore, results and preliminary data made public through conference presentations, conference proceedings or posting of unrefereed manuscripts on preprint servers will not prohibit publication in ISB. However, authors are required to modify a preprint to include the journal reference (including DOI) and a link to the published article on the ISB website upon publication.
Abstract: The tissue-specific expression and differential function of the crustacean hyperglycemic hormone (CHH) in Carcinus maenas indicate an interesting evolutionary history. Previous studies have shown that CHH from the sinus gland X-organ (XO-type) has hyperglycemic activity, whereas the CHH from the pericardial organ (PO-type) neither shows hyperglycemic activity nor inhibits Y-organ ecdysteroid synthesis. Here we examined the types of selective pressures operating on the variants of CHH in Carcinus maenas. Maximum likelihood-based codon substitution analyses revealed that the variants of this neuropeptide in C. maenas have been subjected to positive Darwinian selection, indicating adaptive evolution and functional divergence among the CHH variants leading to two unique groups (PO and XO-type). Although the average ratio of nonsynonymous to synonymous substitution (ω) for the entire coding region is 0.5096, a few codon sites showed significantly higher ω (10.95). Comparison of models that incorporate positive selection (ω > 1) with models not incorporating positive selection (ω < 1) at certain codon sites failed to reject (p=0) evidence of positive Darwinian selection.
Abstract: We have developed a method, NTXpred, for predicting neurotoxins and classifying them based on their function and origin. The dataset used in this study consists of 582 non-redundant, experimentally annotated neurotoxins obtained from Swiss-Prot. A number of modules have been developed for predicting neurotoxins using residue composition based on feed-forward neural network (FNN), recurrent neural network (RNN) and support vector machine (SVM), and achieved maximum accuracy of 84.19%, 92.75% and 97.72%, respectively. In addition, SVM modules have been developed for classifying neurotoxins based on their source (e.g., eubacteria, cnidarians, molluscs, arthropods and chordates) using amino acid composition and dipeptide composition and achieved maximum overall accuracy of 78.94% and 88.07%, respectively. The overall accuracy increased to 92.10% when the evolutionary information obtained from PSI-BLAST was combined with the SVM module of source classification. We have also developed SVM modules for classifying neurotoxins based on functions using amino acid and dipeptide composition and achieved overall accuracy of 83.11% and 91.10%, respectively. The overall accuracy of function classification improved to 95.11% when PSI-BLAST output was combined with the SVM module. All the modules developed in this study were evaluated using a five-fold cross-validation technique. NTXpred is available at www.imtech.res.in/raghava/ntxpred/ and at a mirror site, http://bioinformatics.uams.edu/mirror/ntxpred.
Keywords: NTXpred, prediction of neurotoxins, Webserver, blockers of ion channels
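The amino acid and dipeptide composition features mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the NTXpred code: the function names are hypothetical, and expressing each composition as a percentage is an assumption, since the abstract does not state the normalization. It only shows how a protein sequence of any length becomes a fixed-length vector (20 or 400 values) that a classifier such as an SVM could take as input.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return a 20-value vector: percentage of each standard residue in seq."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [100.0 * counts[aa] / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-value vector: percentage of each adjacent residue pair."""
    seq = seq.upper()
    # Overlapping pairs; pairs containing a non-standard symbol are skipped.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(pairs)
    return [100.0 * counts[a + b] / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    toy = "AAC"  # toy sequence, for illustration only
    print(len(amino_acid_composition(toy)), len(dipeptide_composition(toy)))
```

Dipeptide composition captures some local order information that single-residue composition discards, at the cost of a 20-fold larger feature vector.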
Abstract: Many members of the AraC/XylS family of transcription regulators have been proven to play a critical role in regulating bacterial virulence factors in response to environmental stress. By using the Hidden Markov Model (HMM) profile built from the alignment of a 99 amino acid conserved domain sequence of 273 AraC/XylS family transcription regulators, we detected a total of 45 AraC/XylS family transcription regulators in the genome of the Gram-negative pathogen Burkholderia pseudomallei. Further in silico analysis of each detected AraC/XylS family transcription regulatory protein and its neighboring genes allowed us to make a first-order guess on the role of some of these transcription regulators in regulating important virulence factors such as those involved in three type III secretion systems and biosynthesis of pyochelin, exopolysaccharide (EPS) and phospholipase C. This paper has demonstrated an efficient and systematic genome-wide scale prediction of the AraC/XylS family that can be applied to other protein families.
Abstract: BLAST and Repeat Masker Parser (BRM-Parser) is a service that provides users with a unified platform for easy analysis of relatively large outputs of the BLAST (Basic Local Alignment Search Tool) and RepeatMasker programs. The BLAST Summary feature of BRM-Parser summarizes BLAST outputs, which can be filtered using user-defined thresholds for hit length, percentage identity and E-value and can be sorted by query or subject coordinates and length of the hit. It also provides a tool that merges BLAST hits which satisfy user-defined criteria for hit length and gap between hits. The RepeatMasker Summary feature uses the RepeatMasker alignment as an input file and calculates the frequency and proportion of mutations in copies of repeat elements, as identified by RepeatMasker. Both features can be run through a GUI or can be executed via the command line using the standalone version.
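The filtering and merging steps described in the abstract above can be sketched as follows. This is an illustrative Python sketch, not BRM-Parser itself: the `Hit` record, its field names, and the exact merging rule (merge hits of the same query and subject when the gap between them on the query does not exceed a threshold) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (field names are assumptions)."""
    query: str
    subject: str
    identity: float   # percentage identity
    length: int       # alignment length
    evalue: float
    q_start: int      # 1-based, inclusive query coordinates
    q_end: int


def filter_hits(hits, min_length=0, min_identity=0.0, max_evalue=float("inf")):
    """Keep hits that satisfy all three user-defined thresholds."""
    return [h for h in hits
            if h.length >= min_length
            and h.identity >= min_identity
            and h.evalue <= max_evalue]


def merge_hits(hits, max_gap):
    """Merge hits of the same query/subject pair separated by at most max_gap bases.

    Returns (query, subject, start, end) tuples in query-coordinate order.
    """
    merged = []
    for h in sorted(hits, key=lambda h: (h.query, h.subject, h.q_start)):
        if merged:
            query, subject, start, end = merged[-1]
            gap = h.q_start - end - 1  # negative when the hits overlap
            if query == h.query and subject == h.subject and gap <= max_gap:
                merged[-1] = (query, subject, start, max(end, h.q_end))
                continue
        merged.append((h.query, h.subject, h.q_start, h.q_end))
    return merged


if __name__ == "__main__":
    hits = [
        Hit("q1", "s1", 98.0, 100, 1e-50, 1, 100),
        Hit("q1", "s1", 95.0, 96, 1e-40, 105, 200),
        Hit("q1", "s1", 80.0, 101, 1e-5, 500, 600),
    ]
    print(merge_hits(filter_hits(hits, min_length=50), max_gap=10))
```

Sorting before merging means a single pass over the hits is enough; a real tool would probably also track subject coordinates and strand, which this sketch ignores.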
Abstract: This paper describes a method developed for predicting bacterial toxins from their amino acid sequences. All the modules developed in this study were trained and tested on a non-redundant dataset of 150 bacterial toxins that included 77 exotoxins and 73 endotoxins. Firstly, support vector machine (SVM) based modules were developed for predicting the bacterial toxins using amino acid and dipeptide composition and achieved an accuracy of 96.07% and 92.50%, respectively. Secondly, SVM based modules were developed for discriminating endotoxins and exotoxins, using amino acid and dipeptide composition, and achieved an accuracy of 95.71% and 92.86%, respectively. In addition, modules have been developed for classifying the exotoxins (e.g. activate adenylate cyclase, activate guanylate cyclase, neurotoxins) using hidden Markov models (HMM), PSI-BLAST and a combination of the two and achieved overall accuracy of 95.75%, 97.87% and 100%, respectively. Based on the above study, a web server called 'BTXpred' has been developed, which is available at http://www.imtech.res.in/raghava/btxpred/. Supplementary information is available at http://www.imtech.res.in/raghava/btxpred/supplementary.html.
Keywords: Bacterial toxins, exotoxins, endotoxins, BTXpred, prediction server
Abstract: The voltage-gated sodium channel (VGSC) is the target site for insecticides such as DDT and synthetic pyrethroids. A single base (A-T) change in the knock-down resistance (kdr) allele leads to an amino acid substitution at position 267 that confers the target-mediated resistance to DDT and synthetic pyrethroids in Anopheles gambiae. A theoretical model of the VGSC domain II that contains the site of mutation was constructed using the K⁺ channel protein of Aeropyrum pernix as a template. The validated model with 88.6% residues in the favored region was subjected to the CASTp program that predicted 30 pockets in the modeled domain II for ligand interaction. In the model, at position 267, leucine was manually replaced with phenylalanine. When this altered model was subjected to the CASTp program, the search results showed the same number of pockets. The docking results indicate that DDT interacts with the modeled VGSC domain II at position 275 in the presence of leucine or in the presence of phenylalanine (binding energy = −5.32 kcal/mol, −6.21 kcal/mol). It appears from the results that the mutation at position 267 has no direct influence on the interaction of DDT with the target protein. Therefore, to understand the interaction affinity of DDT with the target and influence of the mutation on the existence of active sites/pockets in relation to ligand binding, a whole VGSC model is necessary.
Abstract: To reveal the relative synonymous codon usage and base composition variation in bacteriophages, six mycobacteriophages were used as a model system here, and both parameters in these phages and their host bacterium, Mycobacterium tuberculosis, have been determined and compared. As expected for GC-rich genomes, there are predominantly G and C ending codons in all 6 phages. Both the Nc plot and correspondence analysis on relative synonymous codon usage indicate that mutation bias and translational selection influence codon usage variation in the 6 phages. Further analysis indicates that among the 6 mycobacteriophages, Che9c, Bxz1 and TM4 may be extremely virulent in nature, as most of their genes have high translation efficiency. Based on our data we suggest that the genes of the above three phages are expressed rapidly by the host's translation machinery. The information might be used to select the extremely virulent Mycobacterium tuberculosis phages suitable for phage therapy.
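Relative synonymous codon usage (RSCU), the central quantity in the abstract above, is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below follows that common definition under the standard genetic code; the function names are hypothetical and the authors' exact procedure is not given in the abstract. A helper for the fraction of G- or C-ending codons is included because the abstract refers to it.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def split_codons(cds):
    """Split an in-frame coding sequence into codons; trailing bases are dropped."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def rscu(cds):
    """Return {codon: RSCU} for every amino acid observed in cds."""
    counts = Counter(split_codons(cds))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            # observed count / (total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


def gc3_fraction(cds):
    """Fraction of sense codons whose third position is G or C."""
    codons = [c for c in split_codons(cds) if CODON_TABLE.get(c, "*") != "*"]
    if not codons:
        raise ValueError("no sense codons found")
    return sum(c[2] in "GC" for c in codons) / len(codons)


if __name__ == "__main__":
    toy = "GCGGCGGCC"  # three alanine codons, for illustration only
    print(round(rscu(toy)["GCG"], 3), gc3_fraction(toy))
```

An RSCU value of 1 means no bias for that codon; values above 1 mean the codon is used more often than expected under equal usage.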
Abstract: Recent work has used graphs to model expression data from microarray experiments, with a view to partitioning the genes into clusters. In this paper, we introduce the use of a decomposition by clique separators. Our aim is to improve the classical clustering methods in two ways: first we want to allow an overlap between clusters, as this seems biologically sound, and second we want to be guided by the structure of the graph to define the number of clusters. We test this approach with a well-known yeast database (Saccharomyces cerevisiae). Our results are good, as the expression profiles of the clusters we find are very coherent. Moreover, we are able to organize the clusters we find into another graph, and order them in a fashion which turns out to respect the chronological order defined by the sporulation process.
Keywords: Clustering method, microarray, graph decomposition, threshold family of graphs, expression profile
Abstract: Complete genome sequences of several pathogenic bacteria have been determined, and many more such projects are currently under way. While these data potentially contain all the determinants of host-pathogen interactions and possible drug targets, computational tools for selecting suitable candidates for further experimental analyses are currently limited. Detection of bacterial genes that are non-homologous to human genes and are essential for the survival of the pathogen represents a promising means of identifying novel drug targets. We used a differential pathway analysis approach (based on KEGG data) to identify essential genes from Pseudomonas aeruginosa. Our approach identified 214 unique enzymes in P. aeruginosa that may be potential drug targets and can be considered for rational drug design. About 40% of these putative targets have been reported as essential by transposon mutagenesis data elsewhere. A homology model for one of the proteins (LpxC) is presented as a case study and can be explored for in silico docking with suitable inhibitors. This approach is a step towards facilitating the search for new antibiotics.
Keywords: Pseudomonas aeruginosa, Homo sapiens, comparative microbial genomics, KEGG, homology, MODELLER, LpxC, potential drug targets
Abstract: The production of high-throughput gene expression data has generated a crucial need for bioinformatics tools to generate biologically interesting hypotheses. Whereas many tools are available for extracting global patterns, less attention has been focused on local pattern discovery. We propose here an original way to discover knowledge from gene expression data by means of the so-called formal concepts which hold in derived Boolean gene expression datasets. We first encoded the over-expression properties of genes in human cells using human SAGE data. This gave rise to a Boolean matrix from which we extracted the complete collection of formal concepts, i.e., all the largest sets of over-expressed genes associated with a largest set of biological situations in which their over-expression is observed. Complete collections of such patterns tend to be huge. Since their interpretation is a time-consuming task, we propose a new method to rapidly visualize clusters of formal concepts. This designates a reasonable number of Quasi-Synexpression-Groups (QSGs) for further analysis. The interest of our approach is illustrated using human SAGE data and interpreting one of the extracted QSGs. The assessment of its biological relevance leads to the formulation of both previously proposed and new biological hypotheses.
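The notion of a formal concept used in the abstract above can be made concrete with a small brute-force Python sketch. It is only an illustration on toy data: the function name, the dictionary representation of the Boolean matrix, and the gene and situation labels are all hypothetical, and exhaustive enumeration is exponential in the number of situations, so it is not the extraction method used in the paper. A formal concept is a pair (set of situations, set of genes) in which the genes are exactly those over-expressed in all of the situations and the situations are exactly those in which all of the genes are over-expressed.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a Boolean context by brute force.

    context maps each biological situation to the set of genes that are
    over-expressed in it. Returns a set of (situations, genes) pairs,
    each stored as a pair of frozensets.
    """
    situations = list(context)
    all_genes = set().union(*context.values())
    concepts = set()
    for size in range(len(situations) + 1):
        for subset in combinations(situations, size):
            # Genes shared by every situation in the subset.
            genes = set(all_genes)
            for s in subset:
                genes &= context[s]
            # Close the subset: every situation that contains all those genes.
            closed = frozenset(s for s in situations if genes <= context[s])
            concepts.add((closed, frozenset(genes)))
    return concepts


if __name__ == "__main__":
    # Toy Boolean context, for illustration only.
    toy = {
        "s1": {"g1", "g2"},
        "s2": {"g1", "g2", "g3"},
        "s3": {"g3"},
    }
    for situations, genes in sorted(formal_concepts(toy), key=lambda c: len(c[0])):
        print(sorted(situations), sorted(genes))
```

Because both sets in a concept are maximal, concepts may overlap, which is why complete collections grow quickly on real expression data.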
Abstract: Accumulating evidence suggests that non-coding RNAs (ncRNAs) play key roles in gene regulation and may form the basis of an inter-gene communication system. Many ncRNAs are synthesized similarly to mRNAs and can be detected through screening of polyA-rich EST or cDNA libraries. We developed a computational pipeline to screen EST and genomic sequence data for those transcribed genes with limited protein coding potential and applied this pipeline to the model legume Medicago truncatula. This process identified a set of 503 mRNA-like transcripts that appear not to encode proteins. Further computational analysis showed that many of these ncRNA candidates share structural similarities with known ncRNAs and that they clearly differ from protein coding genes and non-transcribed regions in their base and oligonucleotide compositions, as well as in aspects of secondary structure. By using a machine learning approach, we show that the distinctive ncRNA features presented in this study can be used to discriminate most ncRNAs and may thus be useful for improving ncRNA prediction. Computational analysis of EST isolation frequencies in various plant tissues showed that the expression levels and expression profiles of the putative ncRNAs and mRNAs differ; most interestingly, the putative ncRNAs are highly expressed relative to mRNAs in the root nodule tissue and conserved only in closely related plants. The work presented here constitutes the first large-scale prediction and characterization of ncRNAs in legumes, and provides a basis for further research on elucidating ncRNA function in legume genomics.
Keywords: ncRNA, mRNA-like ncRNA, EST, Medicago truncatula, model legume, SVM, feature classification
Abstract: The Codon Adaptation Index (CAI) was introduced by Sharp and Li in 1987 to quantify codon usage similarities between a coding sequence and a set of reference sequences. When synonymous codons for a given amino acid exist, highly expressed genes seem to prefer some of them, according to tRNA abundance and thermodynamic issues. Some authors have described CAI-based methods to derive expressivity measures for all genes in a genome, in a computational framework. Here we present the CAIAP (CAI Analyser Package), a platform-independent package of computer programs allowing the calculation of the CAI and a deep study of gene expressivity from raw gene sequences. Our approach implements and optimizes a procedure to derive the reference sequences from whole genomes and use their codon usage for CAI estimation. Moreover, a set of analysis tools is provided to perform statistical analyses and therefore to give robustness to results. Objective: Our efforts were aimed at producing an easy-to-use and fully automatic set of programs specifically designed for the analysis of gene expressivity and inter-species comparisons on a great number of genomes. Moreover, the output integrates information coming from functional annotations of genes. We are maintaining a web server storing our analyses for hundreds of genomes, allowing intergenomic comparison of data thanks to dedicated search engines. The CAIAP server is hosted at www4.unifi.it/scibio/bioinfo/caiap/html. The programs (maintained as Perl scripts) are also available for download at the same location.
Keywords: Gene expressivity, Codon Adaptation Index, intergenomic comparison, program package, web server
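The CAI mentioned in the abstract above is usually computed as the geometric mean of the relative adaptiveness of the codons in a gene, where the relative adaptiveness of a codon is its frequency in the reference set divided by the frequency of the most used synonymous codon. The Python sketch below follows that usual formulation under the standard genetic code. It is not the CAIAP code (which is distributed as Perl scripts): the function names are hypothetical, and replacing zero reference counts with 0.5 and skipping Met, Trp and stop codons are conventions assumed here.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def split_codons(cds):
    """Split an in-frame coding sequence into codons; trailing bases are dropped."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Return {codon: w}, with w = count / count of the most used synonym.

    Met, Trp and stop codons are left out because they have no synonymous
    alternatives (or do not code for an amino acid). Codons never seen in
    the reference set get `pseudocount` so that w is never zero.
    """
    counts = Counter(c for seq in reference_seqs for c in split_codons(seq))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families[aa].append(codon)
    weights = {}
    for codons in families.values():
        adjusted = {c: counts[c] if counts[c] > 0 else pseudocount for c in codons}
        top = max(adjusted.values())
        for c in codons:
            weights[c] = adjusted[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of the weights of the scored codons in cds."""
    logs = [math.log(weights[c]) for c in split_codons(cds) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    reference = ["GCGGCGGCGGCC"]  # toy reference set, for illustration only
    w = relative_adaptiveness(reference)
    print(round(cai("GCGGCC", w), 4))
```

Working in log space avoids numerical underflow for long genes, where a direct product of many weights below 1 would approach zero.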
Abstract: Antisense oligonucleotides inactivate mRNA targets, providing a tool for post-transcriptional gene silencing and a potential novel treatment for many diseases. Reliable design of active antisense depends on better understanding of the mechanism of antisense-target RNA interaction. We have studied the correlation between activity of antisense oligodeoxynucleotides (ASO) and structural features of both antisense and target RNAs. A total of 348 ASOs with known activities and their target RNA sequences are classified into categories according to their predicted secondary structural features. Statistical analysis showed that higher activity is more likely to happen at RNA stem-loops than at other RNA structural categories. The data suggest a weak correlation between the stability of ASO structure and activity. Remarkably, a structural fit between ASO and target seems important for antisense activity. Significantly higher antisense activity is achieved with stem-loop ASOs on stem-loop or linear RNA targets.
Abstract: Most secondary structure prediction programs do not distinguish between parallel and antiparallel β-sheets. However, such knowledge would constrain the available topologies of a protein significantly, and therefore aid existing fold recognition algorithms. For this reason, we propose a technique which, in combination with existing secondary structure programs such as PSIPRED, allows one to distinguish between parallel and antiparallel β-sheets. We propose the use of a support vector machine (SVM) procedure, BETTY, to predict parallel and antiparallel sheets from sequence. We found that there is a strong signal difference in the sequence profiles which SVMs can efficiently extract. With strand type assignment accuracies of 90.7% and 83.3% for antiparallel and parallel strands, respectively, our method adds considerably to existing information on current 3-class secondary structure predictions. BETTY has been implemented as an online service which academic researchers can access from our website http://www.fz-juelich.de/nic/cbb/service/service.php.
Keywords: SVM, support vector machine, structure prediction, secondary structure prediction, tertiary structure prediction, beta-sheets, beta-strands, parallel beta-sheets, antiparallel beta-sheets, long range constraints
Abstract: Recent sequencing of genomes of several microorganisms provides an opportunity to have access to huge volumes of data stored in various databases. This has resulted in the development of various computational and visualization tools to aid in retrieval and analysis of data. Development of user-friendly genome data mapping and visualization tools helps researchers to closely examine various features of genes and make inferences from the displayed data efficiently. PGV (Prokaryotic Genome Viewer) is a Java-based web application tool capable of generating high-quality interactive circular chromosome maps. With simple mouse roll-over actions on the region of interest on the displayed map, the user is provided with features such as feature labeling, multi-fold zooming, image rotation and hyperlinking to different information resources. The tool is capable of instantaneously generating maps using user-supplied sequence data.
Abstract: Members of the genus Xanthomonas are significant phytopathogens, which cause diseases in several economically important crops including rice, canola, tomato and citrus. We have analyzed the genomes of six recently sequenced Xanthomonas strains for their synonymous codon usage patterns for all protein-coding genes and specific genes associated with pathogenesis, and determined the predicted highly expressed (PHX) genes by the use of the codon adaptation index (CAI). Our results show considerable heterogeneity among the genes of these moderately G+C rich genomes. Most of the genes were moderately to highly biased in their codon usage. However, unlike ribosomal protein genes, which were governed by translational selection, those genes associated with pathogenesis (GAP) were affected by mutational pressure and were predicted to have moderate to low expression levels. Only two out of 339 GAP genes were in the PHX category. PHX genes present in clusters of orthologous groups of proteins (COGs) were identified. Genes in the plasmids present in two strains showed moderate to low expression levels and only a couple of genes featured in the PHX list. Common genes present in the top-20 PHX gene list were identified and their possible functions are discussed. Correspondence analysis showed that genes are highly confined to a core in the plot.