Article type: Research Article
Authors: He, Zhiyang [a], [*] | Wu, Ji [a] | Lv, Ping [b]
Affiliations: [a] Department of Electronic Engineering, Tsinghua University, Beijing, China | [b] Tsinghua-iFlytek Joint Laboratory for Speech Technology, Beijing, China
Correspondence: [*] Corresponding author: Zhiyang He, Department of Electronic Engineering, Tsinghua University, Beijing, China. E-mail: [email protected].
Abstract: In this paper, we propose a probabilistic generative model, the label correlation mixture model (LCMM), to describe multi-labeled document data; the model can be used for multi-label text classification. LCMM assumes two stochastic generative processes, which correspond to two submodels: 1) a label correlation model; and 2) a label mixture model. The first submodel formulates the generative process of labels, in which a label correlation network is created to depict the dependency between labels. For this submodel, we present an efficient inference algorithm for calculating the generative probability of a multi-label class. Furthermore, to optimize the label correlation network, we propose a parameter-learning algorithm based on gradient descent. The second submodel in the LCMM describes the generative process of words in a document given its labels. Different traditional mixture models can be adopted in this generative process, such as a mixture of language models or topic models. In the multi-label classification stage, we propose a two-step strategy, based on the framework of Bayes decision theory, to use the LCMM efficiently. We conduct extensive multi-label classification experiments on three standard text data sets. The experimental results show significant performance improvements compared with existing approaches. For example, the improvements in accuracy and macro F-score on the OHSUMED data set reach 28.3% and 37.0%, respectively. These performance enhancements demonstrate the effectiveness of the proposed models and solutions.
Keywords: Label correlation mixture model, probabilistic generative model, multi-label text classification, label correlation model, label correlation network, Bayes decision theory
DOI: 10.3233/IDA-163055
Journal: Intelligent Data Analysis, vol. 21, no. 6, pp. 1371-1392, 2017
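The abstract does not give the LCMM equations, so the following is only a rough illustration of the general idea it names: choosing a label set by Bayes decision theory from a prior over label sets and a mixture of per-label language models. Everything here is an assumption for illustration, not the authors' method: the toy vocabulary, the probability values, the uniform mixture weights, the explicit table `LABEL_SET_PRIOR` (a stand-in for the label correlation model), and the function names `log_likelihood` and `classify`. It also enumerates all candidate label sets directly rather than using the paper's two-step strategy.

```python
import math

# Hypothetical per-label unigram language models p(word | label).
# Values are made up for illustration; they are not from the paper.
LABEL_LM = {
    "sports": {"goal": 0.5, "team": 0.4, "vote": 0.05, "tax": 0.05},
    "politics": {"goal": 0.05, "team": 0.15, "vote": 0.5, "tax": 0.3},
}

# Hypothetical prior p(Y) over label sets. In the paper this role is played
# by the label correlation model; here it is just a hand-written table.
LABEL_SET_PRIOR = {
    frozenset({"sports"}): 0.45,
    frozenset({"politics"}): 0.45,
    frozenset({"sports", "politics"}): 0.10,
}


def log_likelihood(words, label_set):
    """log p(words | Y), with each word drawn from a uniform mixture
    of the unigram models of the labels in Y (an assumed weighting).
    Words outside the toy vocabulary raise KeyError."""
    total = 0.0
    for word in words:
        p = sum(LABEL_LM[label][word] for label in label_set) / len(label_set)
        total += math.log(p)
    return total


def classify(words):
    """Return the label set Y maximising log p(Y) + log p(words | Y),
    together with the score of every candidate label set."""
    scores = {
        label_set: math.log(prior) + log_likelihood(words, label_set)
        for label_set, prior in LABEL_SET_PRIOR.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, _ = classify(["goal", "team", "goal"])
    print(sorted(best))
    best, _ = classify(["goal", "vote", "goal", "vote"])
    print(sorted(best))
```

In this toy setup a document with only sports vocabulary is assigned the single label, while a document mixing both vocabularies is assigned the two-label set even though that set has a lower prior, because the mixture explains its words better.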