Adversarial training for few-shot text classification

Croce, Danilo; Castellucci, Giuseppe; Basili, Roberto

doi:10.3233/IA-200051

Issue title: Selected and revised papers from the 18th International Conference of the Italian Association for Artificial Intelligence

Guest editors: Mario Alviano, Gianluigi Greco and Francesco Scarcello

Article type: Research Article

Authors: Croce, Danilo* | Castellucci, Giuseppe | Basili, Roberto

Affiliations: University of Roma Tor Vergata, Department of Enterprise Engineering, Roma RM, Italy

Correspondence: [*] Corresponding author: Danilo Croce, University of Roma Tor Vergata, Department of Enterprise Engineering. E-mail: [email protected].

Abstract: In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP), mainly because they reach high performance while relying on very simple input representations, i.e., raw tokens. One drawback of deep architectures is the large amount of annotated data required for effective training. In Machine Learning, this problem is usually mitigated with semi-supervised methods or, more recently and in the context of deep architectures, with Transfer Learning. One promising recent method for enabling semi-supervised learning in deep architectures has been formalized as Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning in NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating on expressive low-dimensional embeddings; these are derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces with the so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach on a semantic classification task, i.e., Question Classification, considering different sizes of training material and different numbers of target classes. When this adversarial schema is applied to a simple Multi-Layer Perceptron, a classifier trained on a subset derived from 1% of the original training material achieves 92% accuracy. Moreover, on a complex classification schema, e.g., one involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.

Keywords: Semi-supervised learning, generative adversarial network, kernel-based embedding spaces, universal sentence encoding

Journal: Intelligenza Artificiale, vol. 14, no. 2, pp. 201-214, 2020

Published: 11 January 2021
