Article type: Research Article
Authors: Li, Jundong [a], [1], [*] | Zaiane, Osmar R. [b]
Affiliations: [a] Computer Science and Engineering, Arizona State University, Tempe, AZ, USA | [b] Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Correspondence: [*] Corresponding author: Jundong Li, Computer Science and Engineering, Arizona State University, Tempe, AZ, USA. E-mail: [email protected].
Note: [1] The work was done when the author was at the University of Alberta.
Abstract: Established associative classification algorithms have been shown to be very effective in handling categorical data such as text data. The learned model is a set of rules that are easy to understand and can be edited. However, these algorithms still suffer from the following limitations: first, they mostly use the support-confidence framework to mine classification association rules, which requires setting parameters that are confusing to choose; second, the lack of a statistical dependency measure in that framework may lead to the omission of many interesting rules and the detection of meaningless ones; third, the rule generation process usually produces a very large number of rules, which calls into question the interpretability and readability of the learned associative classification model. In this paper, we propose a novel associative classifier, SigDirect, to address these problems. In particular, we use Fisher’s exact test as a significance measure to directly mine classification association rules, aided by effective pruning strategies. Without any threshold settings such as minimum support and minimum confidence, SigDirect is able to find non-redundant classification association rules that express a statistically significant dependency between a set of antecedent items and a consequent class label. To further reduce the number of noisy rules, we present an instance-centric rule pruning strategy to find a subset of high-quality rules. Finally, we propose and investigate various rule classification strategies to achieve a more accurate classification model. Experimental results on real-world datasets show that SigDirect achieves better classification accuracy than state-of-the-art rule-based and associative classifiers. Furthermore, the number of rules generated by SigDirect is orders of magnitude smaller than the number of rules found by other associative classifiers, which is very appealing in practice.
Keywords: Associative classification, rules, statistical significance
DOI: 10.3233/IDA-163141
Journal: Intelligent Data Analysis, vol. 21, no. 5, pp. 1155-1172, 2017
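Illustration: the abstract names Fisher's exact test as the significance measure for a rule of the form X -> c (antecedent itemset X, class label c). The sketch below is not the authors' implementation; it only shows how a one-sided p-value for such a rule could be computed from four counts. The function name `fisher_p_value`, its parameter names, and the significance level `alpha = 0.05` in the usage lines are assumptions introduced here for illustration and do not come from the paper.

```python
from math import comb


def fisher_p_value(n, supp_x, supp_c, supp_xc):
    """One-sided (upper-tail) Fisher's exact test p-value for a rule X -> c.

    n        -- total number of transactions
    supp_x   -- transactions containing the antecedent X
    supp_c   -- transactions labelled with class c
    supp_xc  -- transactions containing X and labelled c

    Returns the hypergeometric probability of seeing at least supp_xc
    co-occurrences if X and c were independent.
    """
    if min(n, supp_x, supp_c, supp_xc) < 0:
        raise ValueError("counts must be non-negative")
    if supp_x > n or supp_c > n or supp_xc > min(supp_x, supp_c):
        raise ValueError("inconsistent counts")
    if supp_x + supp_c - supp_xc > n:
        raise ValueError("inconsistent counts")

    upper = min(supp_x, supp_c)
    # Sum the hypergeometric probabilities of every table at least as
    # extreme as the observed one (same margins, more co-occurrences).
    tail = sum(
        comb(supp_c, i) * comb(n - supp_c, supp_x - i)
        for i in range(supp_xc, upper + 1)
    )
    return tail / comb(n, supp_x)


# Usage on made-up counts: 10 transactions, X occurs in 5, class c in 5,
# and X co-occurs with c in all 5.
p = fisher_p_value(10, 5, 5, 5)
alpha = 0.05  # assumed significance level, not taken from the paper
is_significant = p <= alpha
```

A smaller p-value indicates a stronger positive dependency between X and c. How SigDirect prunes the search space and which significance level it uses are not stated in the abstract, so neither is modelled here.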