Article type: Research Article
Authors: Shekhar, Shashi [a],[*] | Sharma, Dilip Kumar [a] | Sufyan Beg, M.M. [b]
Affiliations: [a] Department of Computer Engineering and Applications, GLA University, Mathura 281406, India | [b] Department of Computer Engineering, Aligarh Muslim University, Aligarh 202002, India
Correspondence: [*] Corresponding author: Shashi Shekhar, Department of Computer Engineering and Applications, GLA University, Mathura 281406, India. E-mail: [email protected].
Abstract: Users of social media increasingly write in code-mixed text, i.e., text that mixes two or more languages. This paper applies the code-mixed index to Indian social media texts and compares the complexity of identifying language at the word level using a Bi-directional Long Short-Term Memory model. Social media platforms are now widely used by people to express their opinions and interests. The major contribution of the work is a technique for identifying the language of Hindi-English code-mixed data from three social media platforms, namely Facebook, Twitter, and WhatsApp. We propose a deep learning framework based on the cBoW and skip-gram models that predicts the language of origin of each word in a sequence from the words that precede it. The context capture module of the system yields better accuracy with the word embedding model than with character embedding.
Keywords: Language identification, transliteration, character embedding, word embedding, Natural Language Processing, cBoW, skip-gram
DOI: 10.3233/KES-190409
Journal: International Journal of Knowledge-based and Intelligent Engineering Systems, vol. 23, no. 3, pp. 167-179, 2019
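
The abstract names cBoW and skip-gram as the embedding models but does not show how they differ. As a minimal sketch, not the authors' implementation, the following Python builds the training pairs each model would see: cBoW predicts a target word from its surrounding context, while skip-gram predicts each context word from the target. The function names, the symmetric window size, and the example sentence are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return up to `window` tokens on each side of position i (assumed symmetric window)."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """cBoW: one (context words -> target word) pair per position."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram: one (target word -> context word) pair per neighbour."""
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]


if __name__ == "__main__":
    # Hypothetical Hindi-English code-mixed sentence, romanized.
    sentence = ["yaar", "this", "movie", "bahut", "achhi", "hai"]
    for context, target in cbow_pairs(sentence, window=1):
        print("cBoW     ", context, "->", target)
    for target, context_word in skipgram_pairs(sentence, window=1):
        print("skip-gram", target, "->", context_word)
```

In practice these pairs would feed an embedding trainer, and the resulting word vectors would serve as input to a sequence tagger such as the Bi-directional LSTM mentioned in the abstract.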