Article type: Research Article
Authors: Diao, Xiu-Li | Zhang, Hao-Ran | Zeng, Qing-Tian | Song, Zheng-Guo* | Zhao, Hua
Affiliations: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
Correspondence: [*] Corresponding author. Zheng-Guo Song, College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China. E-mail: [email protected].
Abstract: Chinese text processing currently faces low-resource challenges such as data scarcity and annotation difficulty. Moreover, cigarette tasting texts tend to be colloquial, which makes valuable, high-quality tasting texts difficult to obtain. Therefore, in this paper, we construct a cigarette tasting dataset (CT2023) and propose a novel Chinese text classification method based on ERNIE and Contrastive Learning for Low-Resource scenarios (ECLLR). Firstly, to address the limited vocabulary diversity and sparse features of cigarette tasting text, we use Term Frequency-Inverse Document Frequency (TF-IDF) to extract key terms that supplement the discriminative features of the original text. Secondly, ERNIE is employed to obtain sentence-level vector embeddings of the text. Finally, a contrastive learning model further refines the text representation after the keyword features are fused, thereby enhancing the performance of the proposed text classification model. Experiments on the CT2023 dataset show that the proposed method reaches an accuracy of 96.33%, surpassing the baseline model by at least 11 percentage points. These results indicate that the proposed approach can effectively provide recommendations and decision support for cigarette production processes in tobacco companies.
Keywords: Low-resource, Cigarette Tasting, Contrastive Learning, Text classification
DOI: 10.3233/JIFS-237816
Journal: Journal of Intelligent & Fuzzy Systems, vol. Pre-press, no. Pre-press, pp. 1-15, 2024
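
The first step described in the abstract, extracting key terms with TF-IDF and using them to supplement the original text, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the tokenizer, the exact TF-IDF variant, or how the keywords are fused with the text. The function names `tfidf_keywords` and `augment`, the `[SEP]` separator, the `top_k` parameter, and the English toy tokens are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=2):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized document.

    Assumption: docs is a list of token lists. Real Chinese text would first
    need a word segmenter, which the abstract does not specify.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        # Plain TF-IDF: relative term frequency times log inverse document frequency.
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically so output is deterministic.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


def augment(doc, keywords):
    """Append extracted keywords to the token list (hypothetical fusion scheme)."""
    return doc + ["[SEP]"] + keywords


# Hypothetical toy corpus standing in for tokenized tasting notes.
docs = [
    ["the", "smoke", "is", "smooth", "and", "sweet"],
    ["the", "smoke", "is", "harsh", "and", "dry"],
    ["the", "smoke", "is", "smooth", "and", "dry"],
]
keywords = tfidf_keywords(docs, top_k=2)
augmented = [augment(doc, kws) for doc, kws in zip(docs, keywords)]
```

Terms that occur in every document score zero and are never ranked above terms specific to a few documents, which is why such a step can surface discriminative words in short, sparse texts. In a pipeline like the one the abstract outlines, the augmented sequence would then be passed to an encoder such as ERNIE.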