Hybrid unstructured text features for meta-heuristic assisted deep CNN-based hierarchical clustering

Jyothi, Bankapalli; Sumalatha, L.; Eluri, Suneetha

doi:10.3233/IDT-220201

Article type: Research Article

Authors: Jyothi, Bankapalli^{a; *} | Sumalatha, L.^b | Eluri, Suneetha^a

Affiliations: [a] Computer Science and Engineering, JNTUK Kakinada, Kakinada, Andhra Pradesh, India | [b] Computer Science and Engineering, Jawaharlal Nehru Technological University, Hyderabad, Telangana, India

Correspondence: [*] Corresponding author: Bankapalli Jyothi, Research Scholar, Computer Science and Engineering, JNTUK Kakinada, Kakinada, Andhra Pradesh, India. E-mail: [email protected].

Abstract: Text clustering is an essential step in organizing unstructured text data into a usable format. However, clustering alone does not provide a way to extract the information needed for document representation. Retrieving relevant text data has become crucial, yet most data is in an unstructured text format that is difficult to categorize. The main aim of this work is to implement a new text clustering model for unstructured data using classifier approaches. First, unstructured data is collected from standard benchmark datasets covering both English and Telugu. The collected text is then pre-processed. In feature extraction stage 1, the GloVe embedding technique extracts text features from the pre-processed data. In feature extraction stage 2, a Text Convolutional Neural Network (Text CNN) extracts deep text features from the same pre-processed data. The text features from stage 1 and the deep features from stage 2 are then combined and passed to optimal feature selection using Hybrid Sea Lion Grasshopper Optimization (HSLnGO), in which the traditional Sea Lion Optimization (SLnO) is superimposed with the Grasshopper Optimization Algorithm (GOA). Finally, text clustering is performed by Deep CNN-assisted hierarchical clustering, whose parameters are optimized with HSLnGO to improve clustering performance. Simulation results, evaluated with several quantitative measures, show that the framework performs well on text classification of unstructured text data compared with other techniques.
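
The data flow described in the abstract (two feature sets fused, a subset of features selected, then hierarchical clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy vectors stand in for GloVe and Text CNN features, the hand-picked binary mask stands in for the subset that HSLnGO would search for, and plain average-linkage agglomerative clustering stands in for the Deep CNN-assisted hierarchical clustering. All function and variable names are hypothetical.

```python
import math

def concat_features(stage1, stage2):
    """Join the per-document feature vectors from the two extraction stages."""
    return [a + b for a, b in zip(stage1, stage2)]

def select_features(features, mask):
    """Keep only the columns whose mask entry is 1.
    In the paper this subset is chosen by HSLnGO; here the mask is given."""
    return [[v for v, keep in zip(row, mask) if keep] for row in features]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative(points, n_clusters):
    """Plain average-linkage agglomerative clustering (a stand-in, not the
    Deep CNN-assisted variant). Returns one cluster label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(euclidean(points[p], points[q])
                            for p in clusters[i] for q in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i (j > i, so index i stays valid).
        clusters[i] += clusters.pop(j)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for p in members:
            labels[p] = label
    return labels

# Toy data: four "documents", two features per stage (assumed values).
stage1 = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]  # stand-in for GloVe features
stage2 = [[1.0, 0.5], [0.9, 0.5], [0.1, 0.5], [0.0, 0.5]]  # stand-in for Text CNN features
mask = [1, 1, 1, 0]  # hand-picked: drops the constant last column

fused = concat_features(stage1, stage2)
selected = select_features(fused, mask)
labels = agglomerative(selected, n_clusters=2)
print(labels)
```

On this toy input the first two documents end up in one cluster and the last two in the other. A real pipeline would replace each stand-in with the corresponding component named in the abstract.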

Keywords: Unstructured data, text clustering, feature extraction, optimal feature selection, deep CNN-based hierarchical clustering, hybrid sea lion grasshopper optimization

DOI: 10.3233/IDT-220201

Journal: Intelligent Decision Technologies, vol. 17, no. 4, pp. 1323-1350, 2023

Published: 20 November 2023
