Issue title: Selected papers from IDA2005, Madrid, Spain
Article type: Research Article
Authors: Fisher, Douglas H. [a], [*] | Edgerton, Mary E. [b] | Chen, Zhihua [b] | Tang, Lianhong [c] | Frey, Lewis [c]
Affiliations: [a] Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37235, USA | [b] Department of Interdisciplinary Oncology, H. Lee Moffitt Cancer Center and Research Institute, SRB-3, 12902 Magnolia Drive, Tampa, FL 33612, USA | [c] Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA
Correspondence: [*] Corresponding author: Douglas H. Fisher, Department of Electrical Engineering and Computer Science, Box 1679-B, Vanderbilt University, Nashville, TN 37235, USA. E-mail: [email protected].
Abstract: Exploring the vast number of possible feature interactions in domains such as gene expression microarray data is an onerous task. We describe Backward-Chaining Rule Induction (BCRI) as a semi-supervised mechanism for biasing the search for IF-THEN rules that express plausible feature interactions. BCRI adds to a relatively limited tool chest of hypothesis-generation software and is an alternative to purely unsupervised association-rule learning. We illustrate BCRI by using it to search for gene-to-gene causal mechanisms that underlie lung cancer. Mapping hypothesized gene interactions against prior knowledge offers support and explanations for hypothesized interactions, and suggests gaps in current knowledge that induction might help fill. Our assumption is that “good” hypotheses incrementally extend or revise existing knowledge. BCRI is implemented as a wrapper around a base supervised-rule-learning algorithm. We summarize our prior work with an adaptation of C4.5 as the base algorithm (C45-BCRI), and extend it in the current study to use Brute as the base algorithm (Brute-BCRI). In contrast to C4.5's greedy strategy, Brute extensively searches the rule space. Moreover, Brute returns many more rules (i.e., hypothesized feature interactions) than does C4.5. To remain an effective hypothesis-generation tool, Brute-BCRI must therefore rank and prune hypothesized interactions more carefully than C45-BCRI does. Prior knowledge serves to evaluate final Brute-BCRI rules just as it does with C45-BCRI, but in Brute-BCRI it also serves to evaluate and prune intermediate search states, thus maintaining a manageable number of rules for evaluation by a domain expert.
Keywords: Rule induction, hypothesis generation, rule exploration, interactive induction, iterative exploration, machine learning, data mining, prior knowledge, microarray, data analysis, molecular mechanisms, class discovery, semi-supervised learning, decision trees, Brute, non-small cell lung cancer, systems biology
DOI: 10.3233/IDA-2006-10502
Journal: Intelligent Data Analysis, vol. 10, no. 5, pp. 397-417, 2006
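Illustrative sketch (not from the article): the abstract describes BCRI as a wrapper around a base supervised rule learner. One plausible reading of "backward chaining" is that rules are first learned for a top-level class, and the features appearing in those rules' antecedents then become targets for further rule learning. The Python below sketches that reading only. The base learner is a toy single-condition rule finder standing in for C4.5 or Brute, and the names `learn_rules`, `backward_chain`, `ROWS`, the gene labels, and the accuracy threshold are all hypothetical. Ranking and pruning against prior knowledge, which the abstract emphasizes, is not modeled here.

```python
from collections import deque

# Hypothetical discretized expression data; values are invented for illustration.
ROWS = [
    {"geneA": "hi", "geneB": "hi", "geneC": "lo", "tumor": "yes"},
    {"geneA": "hi", "geneB": "hi", "geneC": "hi", "tumor": "yes"},
    {"geneA": "lo", "geneB": "lo", "geneC": "lo", "tumor": "no"},
    {"geneA": "lo", "geneB": "lo", "geneC": "hi", "tumor": "no"},
]


def learn_rules(rows, target, excluded=frozenset(), min_accuracy=0.9):
    """Toy base learner (a stand-in, NOT C4.5 or Brute).

    Returns rules (feature, value, target, label, accuracy), read as
    IF feature == value THEN target == label, whose accuracy on the
    covered rows is at least min_accuracy.
    """
    rules = []
    features = [f for f in rows[0] if f != target and f not in excluded]
    for feature in features:
        for value in sorted({r[feature] for r in rows}):
            covered = [r for r in rows if r[feature] == value]
            for label in sorted({r[target] for r in covered}):
                hits = sum(r[target] == label for r in covered)
                accuracy = hits / len(covered)
                if accuracy >= min_accuracy:
                    rules.append((feature, value, target, label, accuracy))
    return rules


def backward_chain(rows, top_target, max_depth=2, min_accuracy=0.9):
    """Wrapper: learn rules for top_target, then treat each antecedent
    feature of a learned rule as a new target, up to max_depth levels.

    Assumption: the top-level class is never used as an antecedent when
    explaining lower-level targets.
    """
    seen = {top_target}
    queue = deque([(top_target, 0)])
    chained = []  # list of (depth, rule)
    while queue:
        target, depth = queue.popleft()
        for rule in learn_rules(rows, target, frozenset({top_target}), min_accuracy):
            chained.append((depth, rule))
            antecedent = rule[0]
            if depth + 1 < max_depth and antecedent not in seen:
                seen.add(antecedent)
                queue.append((antecedent, depth + 1))
    return chained


if __name__ == "__main__":
    for depth, (feature, value, target, label, accuracy) in backward_chain(ROWS, "tumor"):
        print(f"depth {depth}: IF {feature}={value} THEN {target}={label} (acc {accuracy:.2f})")
```

In this sketch each rule at depth 1 is a hypothesized feature-to-feature interaction that helps explain a depth-0 rule. A fuller version would need some way to score such rules against prior knowledge before showing them to a domain expert.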