Authors: Hughes-Oliver, Jacqueline M. | Brooks, Atina D. | Welch, William J. | Khaledi, Morteza G. | Hawkins, Douglas | Young, S. Stanley | Patil, Kirtesh | Howell, Gary W. | Ng, Raymond T. | Chu, Moody T.
Article Type: Research Article
Abstract:
ChemModLab, written by the ECCR @ NCSU consortium under NIH support, is a toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements are: a cheminformatic front end used to supply molecular descriptors for use in modeling; a set of methods for fitting models; and methods for validating the resulting model. Compounds may be input as structures from which standard descriptors will be calculated using the freely available cheminformatic front end PowerMV; PowerMV also supports compound visualization. In addition, the user can directly input their own choices of descriptors, so the capability for comparing descriptors is effectively unlimited. The statistical methodologies comprise a comprehensive collection of approaches whose validity and utility have been accepted by experts in the fields. As far as possible, these tools are implemented in open-source software linked into the flexible R platform, giving the user the capability of applying many different QSAR modeling methods in a seamless way. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated in the laboratory. The web site also incorporates links to public-domain data sets that can be used as test cases for proposed new modeling methods. The capabilities of ChemModLab are illustrated using a variety of biological responses, with different modeling methodologies being applied to each. These show clear differences in the quality of the fitted QSAR model and in computational requirements. The laboratory is web-based, and use is free. Researchers with new assay data, a new descriptor set, or a new modeling method may readily build QSAR models and benchmark their results against other findings. Users may also examine the diversity of the molecules identified by a QSAR model. Moreover, users may place their data sets in a public area to facilitate communication with other researchers, or keep them hidden to preserve confidentiality.
Keywords: Cheminformatics, data-mining, ensemble methods, model assessment, model validation, nearest neighbors, neural networks, QSAR, recursive partitioning, regression, support vector machine, virtual screening
DOI: 10.3233/CI-2008-0016
Citation: In Silico Biology, vol. 11, no. 1-2, pp. 61-81, 2012