Article type: Research Article
Authors: Feng, Lizhou [a, b] | Wang, Youwei [a, b] | Zuo, Wanli [a, b, *]
Affiliations: [a] College of Computer Science and Technology, Jilin University, Changchun, Jilin, China | [b] Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun, Jilin, China
Correspondence: [*] Corresponding author. Wanli Zuo, College of Computer Science and Technology, Jilin University, Changchun, China. Tel.: +86 13604307340; Fax: +86 0431 85166492; E-mail: [email protected].
Abstract: To improve classification speed without seriously sacrificing email classification accuracy, a novel online spam classification method is proposed. First, the concept of term frequency based interest sets is introduced, and emails are classified by combining these interest sets with a Naïve Bayes classifier. Second, based on active learning theory, a novel boundary density based method for evaluating email classification certainty is proposed; combined with the user's interests, it selects and recommends emails to users for labeling. Finally, based on incremental learning theory, the labeled emails and the emails classified with the highest probability are used for retraining. In the experiments, Support Vector Machine (SVM), Naïve Bayesian (NB) and K-Nearest Neighbors (KNN) classifiers are applied to two corpora: Trec2007 and Enron-spam. Compared with six typical active learning based incremental learning methods, the proposed method greatly reduces the time consumed by email classification while maintaining accuracy. Moreover, the proposed method places only a very small sample-labeling burden on users, which demonstrates its value for online applications.
Keywords: E-mail classification, term frequency based interest sets, Support Vector Machine, Naïve Bayesian, K-Nearest Neighbors, active learning, incremental learning
DOI: 10.3233/IFS-151707
Journal: Journal of Intelligent & Fuzzy Systems, vol. 30, no. 1, pp. 17-27, 2016
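Illustrative sketch: the abstract outlines a loop in which emails are classified, uncertain ones are recommended to the user for labeling, and labeled or confidently classified ones are fed back for retraining. The Python below is a minimal sketch of that general loop only, not the authors' method. It assumes a plain multinomial Naïve Bayes with Laplace smoothing, and it substitutes a simple log-score margin for the paper's boundary density based certainty measure; the term frequency based interest sets are not modeled. The names IncrementalNB, process_stream, ask_user and margin_threshold are hypothetical.

```python
import math
from collections import Counter


class IncrementalNB:
    """Multinomial Naive Bayes whose counts are updated one email at a time.

    Assumption: a generic stand-in classifier, not the paper's model.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.doc_counts = Counter()  # number of emails seen per class
        self.term_counts = {"spam": Counter(), "ham": Counter()}
        self.vocab = set()

    def update(self, tokens, label):
        """Incremental learning step: fold one labeled email into the counts."""
        self.doc_counts[label] += 1
        self.term_counts[label].update(tokens)
        self.vocab.update(tokens)

    def log_scores(self, tokens):
        """Return the unnormalized log posterior of each class."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = max(len(self.vocab), 1)  # avoid a zero denominator
        scores = {}
        for label, counts in self.term_counts.items():
            prior = (self.doc_counts[label] + self.alpha) / (total_docs + 2 * self.alpha)
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(prior)
            for term in tokens:
                score += math.log((counts[term] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        """Return (label, margin); a small margin means an uncertain decision."""
        scores = self.log_scores(tokens)
        label = max(scores, key=scores.get)
        margin = abs(scores["spam"] - scores["ham"])
        return label, margin


def process_stream(model, emails, ask_user, margin_threshold=1.0):
    """Classify each email, query the user when uncertain, then retrain.

    Assumption: the margin test replaces the paper's boundary density measure.
    """
    labels = []
    for tokens in emails:
        label, margin = model.predict(tokens)
        if margin < margin_threshold:
            label = ask_user(tokens)  # active learning: request a user label
        model.update(tokens, label)  # retrain on user-labeled or confident emails
        labels.append(label)
    return labels
```

In this sketch a lower margin_threshold sends fewer emails to the user, at the cost of retraining on more self-assigned labels, which can reinforce early mistakes.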