Article type: Research Article
Authors: Ahmad, Siti Rohaidah [a], [*] | Bakar, Azuraliza Abu [b] | Yaakub, Mohd Ridzwan [b]
Affiliations: [a] Department of Computer Science, Faculty of Defence Science and Technology, Universiti Pertahanan Nasional Malaysia, Kuala Lumpur 57000, Malaysia | [b] Data Mining and Optimization Research Group, Centre for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi Selangor 46000, Malaysia
Correspondence: [*] Corresponding author: Siti Rohaidah Ahmad, Department of Computer Science, Faculty of Defence Science and Technology, Universiti Pertahanan Nasional Malaysia, Kuala Lumpur 57000, Malaysia. Tel.: +60 390513400; E-mail: [email protected].
Abstract: In sentiment analysis, the high dimensionality of the feature vector is a key problem because it can decrease the accuracy of sentiment classification and make it difficult to obtain the optimum subset of features. To solve this problem, this study proposes a new text feature selection method that uses a wrapper approach integrated with ant colony optimization (ACO) to guide the feature selection process. It uses the k-nearest neighbour (KNN) classifier to evaluate candidate subsets and to generate a candidate subset of optimum features. To test the subset of optimum features, a dependency relations algorithm was used to find the relationship between each feature and the sentiment word in customer reviews. The feature subset produced by the proposed ACO-KNN algorithm was used as input to identify and extract sentiment words from sentences in customer reviews. The resulting relationships between features and sentiment words were evaluated for accuracy in terms of precision, recall, and F-score. The performance of the proposed ACO-KNN algorithm on customer review datasets was compared with that of two hybrid algorithms from the literature, namely, the genetic algorithm with information gain and information gain with rough set attribute reduction. The experimental results showed that the proposed ACO-KNN algorithm was able to obtain the optimum subset of features and to improve the accuracy of sentiment classification.
Keywords: Sentiment analysis, metaheuristic algorithm, ant colony optimization, k-nearest neighbour, text feature selection
DOI: 10.3233/IDA-173740
Journal: Intelligent Data Analysis, vol. 23, no. 1, pp. 133-158, 2019
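The wrapper idea described in the abstract can be illustrated with a minimal sketch, given here under several assumptions rather than as the authors' implementation. The abstract does not specify the pheromone update rule, the heuristic, the subset size, or the KNN settings, so the function names (`knn_loo_accuracy`, `aco_knn_select`), all parameter values, the leave-one-out scoring, and the toy numeric data are hypothetical choices made only for illustration.

```python
import math
import random


def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of k-NN using only the given feature indices."""
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        # Euclidean distance from sample i to every other sample.
        neighbours = sorted(
            (math.dist([X[i][f] for f in features],
                       [X[j][f] for f in features]), y[j])
            for j in range(len(X)) if j != i
        )
        votes = [label for _, label in neighbours[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def aco_knn_select(X, y, n_ants=8, n_iters=15, subset_size=2, rho=0.2, seed=0):
    """Assumed, simplified ACO wrapper: ants sample subsets, KNN scores them."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pheromone = [1.0] * n_features
    best_subset, best_score = [], -1.0
    for _iteration in range(n_iters):
        for _ant in range(n_ants):
            # Each ant picks features without replacement,
            # with probability proportional to the pheromone level.
            candidates = list(range(n_features))
            subset = []
            for _pick in range(subset_size):
                weights = [pheromone[f] for f in candidates]
                chosen = rng.choices(candidates, weights=weights, k=1)[0]
                subset.append(chosen)
                candidates.remove(chosen)
            score = knn_loo_accuracy(X, y, subset)
            if score > best_score:
                best_subset, best_score = sorted(subset), score
        # Evaporate all trails, then reinforce the best subset found so far.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for f in best_subset:
            pheromone[f] += best_score
    return best_subset, best_score


# Toy data (hypothetical): feature 0 separates the classes, features 1-3 are noise.
data_rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[10.0 * label] + [data_rng.random() for _ in range(3)] for label in y]

subset, score = aco_knn_select(X, y)
print("selected features:", subset, "leave-one-out accuracy:", score)
```

In this sketch the classifier acts only as the scoring function for subsets proposed by the ants, which is what makes the approach a wrapper rather than a filter. A real text pipeline would replace the toy matrix with term features extracted from the reviews.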