Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Zhao, Jiakun | Jin, Ju* | Zhang, Yibo | Zhang, Ruifeng | Chen, Si
Affiliations: School of Software Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
Correspondence: [*] Corresponding author: Ju Jin, School of Software Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China. E-mail: [email protected].
Abstract: The imbalanced data problem is widespread in the real world. In the process of training machine learning models, ignoring imbalanced data problems will cause the performance of the model to deteriorate. At present, researchers have proposed many methods to deal with the imbalanced data problems, but these methods mainly focus on the imbalanced data problems in two-class classification tasks. Learning from multi-class imbalanced data sets is still an open problem. In this paper, an ensemble method for classifying multi-class imbalanced data sets is put forward, called multi-class WHMBoost. It is an extension of WHMBoost that we proposed earlier. We do not use the algorithm used in WHMBoost to process the data, but use random balance based on average size so as to balance the data distribution. The weak classifiers we use in the boosting algorithm are support vector machine and decision tree classifier. In the process of training the model, they participate in training with given weights in order to complement each other’s advantages. On 18 multi-class imbalanced data sets, we compared the performance of multi-class WHMBoost with state of the art ensemble algorithms using MAUC, MG-mean and MMCC as evaluation criteria. The results demonstrate that it has obvious advantages compared with state of the art ensemble algorithms and can effectively deal with multi-class imbalanced data sets.
Keywords: multi-class, imbalanced data, ensemble method, random balance based on average size
DOI: 10.3233/IDA-215874
Journal: Intelligent Data Analysis, vol. 26, no. 3, pp. 599-614, 2022
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
[email protected]
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office [email protected]
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
如果您在出版方面需要帮助或有任何建, 件至: [email protected]