Article type: Research Article
Authors: Lopez de Mantaras, Ramon | Cerquides, Jesus | Garcia, Pere
Affiliations: Artificial Intelligence Research Institute (IIIA), Spanish Council for Scientific Research (CSIC), Campus UAB, 08193 Bellaterra, Spain E‐mail: {mantaras,cerquide,pere}@iiia.csic.es
Note: Corresponding author: Ramon Lopez de Mantaras, IIIA‐CSIC, Campus UAB, 08193 Bellaterra, Spain. Tel.: +343580 95 70; Fax: +343580 96 61; E‐mail: [email protected].
Abstract: In [7], a new information‐theoretic attribute selection method for decision tree induction was introduced. The method consists of computing, for each node, a distance between the partition generated by the values of each candidate attribute in the node and the correct partition of the subset of training examples in that node. The chosen attribute is the one whose corresponding partition is closest to the correct partition (i.e., the partition that perfectly classifies the training data). That paper also formally proved that this distance is not biased towards attributes with a large number of values, in the sense specified by Quinlan in [12], and some initial experimental evidence suggested that the predictive accuracy of the induced trees was not significantly different from that obtained with the most widely used information‐theoretic attribute selection measures, that is, Quinlan’s Gain and Quinlan’s Gain Ratio. However, the distance seemed to induce smaller trees, especially when the attributes had different numbers of values. Because of the small number of experiments performed, that paper could not confirm that the differences were statistically significant. In this paper we report experimental results that allow us to confirm that the distance induces trees that, without losing accuracy, are not significantly different in size from those obtained with Quinlan’s Gain but are smaller than those obtained with Quinlan’s Gain Ratio. These experimental results are supported by a statistical analysis performed using two statistical hypothesis tests: the sign test and the signed rank test.
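The selection rule described in the abstract can be illustrated with a short Python sketch. The abstract does not state the formula, so the sketch assumes the normalized form of the distance, d_N(A, C) = (H(A|C) + H(C|A)) / H(A, C), where A is the partition induced by an attribute and C is the correct (class) partition. The function names `entropy`, `partition_distance` and `select_attribute`, and the toy data, are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the partition induced by `labels`."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition_distance(attr_values, classes):
    """Assumed normalized distance between the attribute partition and the
    class partition: (H(A|C) + H(C|A)) / H(A,C), a value in [0, 1]."""
    h_joint = entropy(list(zip(attr_values, classes)))
    if h_joint == 0.0:
        # Both partitions consist of a single block, so they coincide.
        return 0.0
    h_attr = entropy(attr_values)
    h_class = entropy(classes)
    # H(A|C) + H(C|A) = 2*H(A,C) - H(A) - H(C)
    return (2 * h_joint - h_attr - h_class) / h_joint


def select_attribute(rows, classes, attributes):
    """Return the attribute whose partition is closest to the class partition."""
    return min(
        attributes,
        key=lambda a: partition_distance([row[a] for row in rows], classes),
    )


# Toy node: "perfect" reproduces the class partition, "noise" is independent of it.
rows = [
    {"perfect": "a", "noise": "p"},
    {"perfect": "a", "noise": "q"},
    {"perfect": "b", "noise": "p"},
    {"perfect": "b", "noise": "q"},
]
classes = ["yes", "yes", "no", "no"]
best = select_attribute(rows, classes, ["perfect", "noise"])
```

In this toy node the attribute that reproduces the class partition has distance 0 and is selected, while the independent attribute has distance 1. Dividing by the joint entropy H(A, C) is the normalization assumed here; it keeps the measure in [0, 1] regardless of how many values an attribute takes.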
Keywords: Decision trees, attribute selection, statistical tests, experimental methodology
Journal: AI Communications, vol. 11, no. 2, pp. 91-100, 1998