Improving sentence representation for Vietnamese natural language understanding using optimal transport

Nguyen, Phu Xuan-Vinh; Nguyen, Thu Hoang-Thien; Van Nguyen, Kiet; Nguyen, Ngan Luu-Thuy

doi:10.3233/JIFS-231485

Article type: Research Article

Authors: Nguyen, Phu Xuan-Vinh^a | Nguyen, Thu Hoang-Thien^b | Van Nguyen, Kiet^a | Nguyen, Ngan Luu-Thuy^{a,*}

Affiliations: [a] University of Information Technology, Vietnam National University, Ho Chi Minh City, Vietnam | [b] International University, Vietnam National University, Ho Chi Minh City, Vietnam

Correspondence: [*] Corresponding author. Ngan Luu-Thuy Nguyen, University of Information Technology, Vietnam National University, Ho Chi Minh City, Vietnam. E-mail: [email protected].

Abstract: Multilingual pre-trained language models have achieved impressive results on most natural language processing tasks. However, their performance is limited by model capacity and by the under-representation of some languages in the pre-training data, especially languages with limited resources. This has led to tailored pre-trained language models, which are pre-trained on large amounts of monolingual data or on domain-specific corpora. Nevertheless, compared with relying on multiple monolingual models, a multilingual model offers the advantages of multilinguality, such as generalization across cross-lingual resources. To combine the advantages of both multilingual and monolingual models, we propose KDDA, a framework that transfers knowledge from monolingual models into a single multilingual model with the aim of improving sentence representation for Vietnamese. KDDA employs a teacher-student framework and cross-lingual transfer to take knowledge from two monolingual models (the teachers) and transfer it into a unified multilingual model (the student). Since the representations from the teachers and the student lie in disparate semantic spaces, we measure the discrepancy between their distributions using Sinkhorn divergence, an optimal transport distance. We conduct experiments on two Vietnamese natural language understanding tasks: machine reading comprehension and natural language inference. Experimental results show that our model outperforms other state-of-the-art models and yields competitive performance.
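
To make the Sinkhorn divergence mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the cost function, the regularization strength, or the sample weights, so the squared Euclidean cost, uniform weights, `eps=0.1`, and the function names (`entropic_ot`, `sinkhorn_divergence`) are all assumptions made for illustration. The sketch uses the common debiased definition S(a, b) = OT_eps(a, b) - 0.5 OT_eps(a, a) - 0.5 OT_eps(b, b), with OT_eps estimated from log-domain Sinkhorn iterations.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors (assumed cost)."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def entropic_ot(xs, ys, eps=0.1, n_iters=200):
    """Entropy-regularized OT value between two uniformly weighted point sets.

    Runs log-domain Sinkhorn updates on the dual potentials f and g and
    returns the dual value <f, a> + <g, b> after the final update.
    """
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)  # uniform weights (assumption)
    cost = [[sq_dist(x, y) for y in ys] for x in xs]
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(n_iters):
        f = [
            -eps * _logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
            for i in range(n)
        ]
        g = [
            -eps * _logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
            for j in range(m)
        ]
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=0.1, n_iters=200):
    """Debiased Sinkhorn divergence: subtracts the two self-transport terms."""
    return (
        entropic_ot(xs, ys, eps, n_iters)
        - 0.5 * entropic_ot(xs, xs, eps, n_iters)
        - 0.5 * entropic_ot(ys, ys, eps, n_iters)
    )


# Toy stand-ins for teacher and student sentence embeddings (made-up numbers).
teacher_embeddings = [[0.0, 0.0], [1.0, 0.0]]
student_embeddings = [[0.0, 3.0], [1.0, 3.0]]
print(sinkhorn_divergence(teacher_embeddings, student_embeddings))
```

In a distillation setting, a quantity of this kind would typically be computed on batches of teacher and student sentence embeddings and minimized with respect to the student's parameters, which in practice calls for a differentiable tensor implementation rather than plain Python lists.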

Keywords: Natural language understanding, machine reading comprehension, natural language inference, knowledge distillation, optimal transport

Journal: Journal of Intelligent & Fuzzy Systems, vol. 45, no. 6, pp. 9277-9288, 2023

Published: 02 December 2023
