Article type: Research Article
Affiliations: [a] College of Software, Jilin University, Changchun, China | [b] Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Changchun, China | [c] College of Computer Science and Technology, Jilin University, Changchun, China
Correspondence: [*] Corresponding author: Lu Liu. E-mail: [email protected].
Abstract: Obtaining interesting and topic-relevant information is an important task in Web mining. Text classification using a small proportion of labeled data and a large proportion of unlabeled data, also called semi-supervised learning, is a well-known problem. Despite extensive research on text classification, how to effectively and efficiently apply valuable frequent patterns and handle high-dimensional data remains an open issue. With growing data volumes and high dimensionality, noisy data can affect both distance measures and time complexity. This paper targets this problem and presents a novel method for text classification called CTFP (Classification based on TFP-tree), which uses a TFP-tree (Text-Frequent-Pattern-tree) to generate frequent patterns from a very large amount of text and conducts text classification in a relatively low-dimensional data space. It effectively reduces the data dimensionality while constructing the classifier. Extensive experiments on three datasets (RCV1, SRAA and Reuters-21578) show that the proposed method achieves better performance than many existing state-of-the-art methods on precision, efficiency and other evaluation metrics.
Keywords: Text classification, dimensionality reduction, TFP-tree, SVM, frequent patterns
DOI: 10.3233/JIFS-171238
Journal: Journal of Intelligent & Fuzzy Systems, vol. 34, no. 3, pp. 1893-1905, 2018
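The abstract's core idea, representing documents by frequent patterns rather than by the full vocabulary, can be illustrated with a minimal sketch. This is not the CTFP algorithm or the TFP-tree itself: brute-force counting of term sets stands in for tree-based mining, and the function names, the `min_support` and `max_len` parameters, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(docs, min_support, max_len=2):
    """Return term sets (sorted tuples) found in at least min_support documents.

    Brute-force enumeration; a tree-based miner would avoid generating
    every candidate, but the resulting pattern set plays the same role.
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for k in range(1, max_len + 1):
            # Each term set is counted once per document (document frequency).
            counts.update(combinations(terms, k))
    return sorted(p for p, c in counts.items() if c >= min_support)


def to_pattern_vector(doc, patterns):
    """Encode a document as a binary vector over the frequent patterns."""
    terms = set(doc.lower().split())
    return [1 if terms.issuperset(p) else 0 for p in patterns]


# Hypothetical toy corpus, not taken from RCV1, SRAA or Reuters-21578.
docs = [
    "stock market falls",
    "stock market rallies",
    "team wins match",
    "team loses match",
]
patterns = frequent_patterns(docs, min_support=2)
vector = to_pattern_vector("stock market crashes", patterns)
```

In this toy case the eight distinct terms collapse to six frequent patterns, and rare terms such as "falls" or "wins" drop out of the feature space. The resulting binary vectors could then be passed to any standard classifier, such as the SVM named in the keywords.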