Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Mukhin, A.V.a | Kilbas, I.A.a | Paringer, R.A.a; b | Ilyasova, N. Yu.a; b | Kupriyanov, A.V.a; b; *
Affiliations: [a] Samara National Research University, Moskovskoye Shosse, 34, Samara, Russia | [b] IPSI RAS – Branch of the FSRC “Crystallography and Photonics” RAS, Samara, Russia
Correspondence: [*] Corresponding author: A.V. Kupriyanov, Samara National Research University, Moskovskoye Shosse, 34, Samara 443086, Russia. E-mail: [email protected].
Abstract: In this paper, we propose a data balancing method for multi-label biomedical data. The method can be applied in the case of semantic segmentation problems for balancing the corresponding image data. The proposed method performs oversampling of instances of minority classes in a way that increases the frequencies of appearance (a ratio of number of samples, containing this class, over the total number of samples in the dataset) of minority classes in the data, thereby reducing the class imbalance. The effectiveness of the proposed method is shown experimentally by applying it to two highly unbalanced biomedical image datasets. A convolutional neural network (CNN) was trained on several versions of those datasets: one balanced with the proposed method, another balanced with manual oversampling and an unbalanced version. The results of the experiments validate the effectiveness of the proposed method, proving that it allows the influence of class imbalance on the learning algorithm to be reduced, thus improving its original classification results for most of the classes. Apart from biomedical image data, the proposed method was applied to several common multi-label datasets. Inherently, the proposed method does not make any assumptions about the underlying structure of the data to be balanced; therefore, it can be applied to all types of data (vectors, images, etc.) that can be described in a multi-label framework. It also can be used in conjunction with any learning algorithm that is suitable for multi-label data. To illustrate its wider applicability, a series of experiments was conducted using seven common multi-label datasets. An experimental comparison to existing multi-label data balancing approaches is provided, as well. The experimental results show that the proposed method presents a competitive alternative to existing approaches.
Keywords: Multi-label data, multi-label balancing, imbalanced data, neural networks, convolutional network, fundus, biomedical data, semantic segmentation
DOI: 10.3233/ICA-220676
Journal: Integrated Computer-Aided Engineering, vol. 29, no. 2, pp. 209-225, 2022
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
[email protected]
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office [email protected]
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
如果您在出版方面需要帮助或有任何建, 件至: [email protected]