Issue title: Special Section: Soft Computing and Intelligent Systems: Techniques and Applications
Guest editors: Sabu M. Thampi, El-Sayed M. El-Alfy, Sushmita Mitra and Ljiljana Trajkovic
Article type: Research Article
Authors: Remmiya Devi, G.* | Anand Kumar, M. | Soman, K.P.
Affiliations: Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Tamil Nadu, India
Correspondence: [*] Corresponding author. G. Remmiya Devi, Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Tamil Nadu, India. E-mail: [email protected].
Abstract: Social media is a vibrant space where millions of individuals interact and share their views. Processing social media text in Indian languages is challenging because these languages are morphologically rich. Once such unstructured text has been converted into a consistent format, features are extracted from it. In large corpora, information units, i.e. entities, carry the basic idea of the content. The main aim of the system is to recognise and extract named entities in Twitter text. The proposed system relies on co-occurrence-based word embedding models to extract features for the words in the dataset. The work uses Tamil-language text data from Twitter. To enhance the performance of the system, tri-gram features are extracted from the word embedding vectors, and the systems are trained using these N-gram embedding features and named entity tags. The system is implemented using a machine learning classifier, the Support Vector Machine (SVM). Comparing the proposed systems, GloVe embedding gives better results, with an accuracy of 96.93%, whereas the accuracy of word2vec embedding is 84.53%. The improvement in accuracy with GloVe embedding may be due to the important role of its co-occurrence information in recognising the entities.
Keywords: Support Vector Machine, Word2vec, GloVe embedding, N-gram embedding, structured skip-gram
DOI: 10.3233/JIFS-169439
Journal: Journal of Intelligent & Fuzzy Systems, vol. 34, no. 3, pp. 1435-1442, 2018
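Illustration: the abstract states that tri-gram features are extracted from the word embedding vectors but does not give the construction. The sketch below shows one common way this might be done, assuming each token's feature vector is the concatenation of the embeddings of the previous, current, and next token, with zero vectors at sentence boundaries and for unknown words. The function name `trigram_features`, the zero padding, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def trigram_features(
    tokens: Sequence[str],
    embeddings: Dict[str, Sequence[float]],
    dim: int,
) -> List[List[float]]:
    """Build one feature vector per token from a three-word window.

    Assumption (not from the paper): the tri-gram feature is the
    concatenation [previous, current, next] of the word embeddings,
    with a zero vector for missing neighbours and unknown words.
    """
    pad = [0.0] * dim
    # Look up every token once; unknown tokens fall back to the zero vector.
    vectors = [list(embeddings.get(tok, pad)) for tok in tokens]

    features = []
    for i, current in enumerate(vectors):
        previous = vectors[i - 1] if i > 0 else pad
        following = vectors[i + 1] if i + 1 < len(vectors) else pad
        features.append(previous + current + following)
    return features


# Toy, made-up embeddings (hypothetical values, dimension 2).
toy_embeddings = {
    "w1": [1.0, 2.0],
    "w2": [3.0, 4.0],
    "w3": [5.0, 6.0],
}
toy_features = trigram_features(["w1", "w2", "w3"], toy_embeddings, dim=2)
```

Each resulting vector has three times the embedding dimension. In a pipeline like the one the abstract describes, such vectors, paired with the named entity tag of the centre token, would form the training input to the SVM classifier.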