Issue title: Soft Computing Applications
Guest editors: Valentina Emilia Balas
Article type: Research Article
Authors: Mundra, Shikha [a] | Vijay, Shounak [a] | Mundra, Ankit [b],* | Gupta, Punit [c] | Goyal, Mayank Kumar [d] | Kaur, Mandeep [d] | Khaitan, Supriya [e] | Rajpoot, Abha Kiran [d]
Affiliations: [a] Department of Computer Science and Engineering, Manipal University Jaipur, Jaipur, India | [b] Department of Information Technology, Manipal University Jaipur, Jaipur, India | [c] Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India | [d] Department of Computer Science & Engineering, School of Engineering & Technology, Sharda University, Greater Noida, India | [e] Department of Computer Engineering, Pillai College of Engineering, Navi Mumbai, India
Correspondence: [*] Corresponding author. Ankit Mundra, Department of Information Technology, Manipal University Jaipur, Jaipur, India. E-mail: [email protected].
Abstract: The health of thousands of patients around the world is affected by factors such as age, body mass index, cholesterol levels, albumin levels and several others. Timely prediction of health outcomes from these factors can serve as an early warning. Recent growth in machine learning algorithms inspired us to build a predictive model for better healthcare facilities. In this work we focus on the problem of noisy and imbalanced datasets, in which the majority class is favored over the minority class, leading to false predictions. We experimented with two publicly available, imbalanced, binary-class medical datasets of different sizes: MIT's GOSSIS death dataset and the PIMA Indians Diabetes Dataset. We investigated three oversampling techniques (Synthetic Minority Oversampler, Random Oversampler and Adaptive Synthetic Sampler) along with two undersampling techniques (Random Undersampler and Near Miss), which were paired with three data reduction and cleaning methods, namely Tomek Links, One Sided Selection and Edited Nearest Neighbors. We found that the combination of Adaptive Synthetic Sampler with One Sided Selection performs better on the large dataset, while the combination of Random Oversampler with Tomek Links performs better on the small dataset. We also observed that oversampling techniques give quite promising results in comparison to undersampling methods, specifically when applied with machine learning classifiers, as these classifiers are data-hungry algorithms.
Keywords: Synthetic Minority Oversampler (SMOTE), Random Oversampler (ROS), Adaptive Synthetic Sampler (ADASYN), Random Undersampler (RUS), Near Miss, Tomek Link (TL), One Sided Selection (OSS), Edited Nearest Neighbors (ENN)
DOI: 10.3233/JIFS-219294
Journal: Journal of Intelligent & Fuzzy Systems, vol. 43, no. 2, pp. 1933-1946, 2022
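Illustration: one pairing named in the abstract, Random Oversampler followed by Tomek Links cleaning, can be sketched in plain Python as below. This is a minimal sketch and not the authors' implementation. The function names, the toy data, the use of Euclidean distance, and the choice to drop only the majority-class member of each link are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def _sq_dist(a, b):
    # Squared Euclidean distance (assumed metric for this sketch).
    return sum((p - q) ** 2 for p, q in zip(a, b))


def _nearest(i, X):
    # Index of the sample closest to X[i], excluding X[i] itself.
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: _sq_dist(X[i], X[j]))


def remove_tomek_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link.

    A Tomek link is a pair of samples from different classes that are
    each other's nearest neighbor."""
    nn = [_nearest(i, X) for i in range(len(X))]
    drop = {
        i for i, j in enumerate(nn)
        if nn[j] == i and y[i] != y[j] and y[i] == majority_label
    }
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: class 0 is the majority, class 1 the minority.
X_toy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (3.2, 0.0), (6.0, 0.0)]
y_toy = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X_toy, y_toy, seed=0)
X_clean, y_clean = remove_tomek_links(X_bal, y_bal, majority_label=0)
print(Counter(y_bal), Counter(y_clean))
```

Because random oversampling creates exact duplicates, a duplicated minority point has its own copy as nearest neighbor, so it no longer forms a Tomek link with a nearby majority point. In practice a maintained library such as imbalanced-learn, which provides `RandomOverSampler` and `TomekLinks`, would likely be preferable to hand-written code like this.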