Article type: Research Article
Authors: Abidi, Mustufa Haider [a],* | Khare, Neelu [b] | D., Preethi [c] | Alkhalefah, Hisham [a] | Umer, Usama [a]
Affiliations: [a] Advanced Manufacturing Institute, King Saud University, Riyadh, Saudi Arabia | [b] School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, Tamil Nadu, India | [c] Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Ramapuram, Chennai, India
Correspondence: [*] Corresponding author: Mustufa Haider Abidi. Advanced Manufacturing Institute, King Saud University, Riyadh, Saudi Arabia. E-mail: [email protected].
Abstract: The emergence of COVID-19 has had a profound impact on global healthcare systems and economies, underscoring the need for precise and rapid diagnostic tools. Machine learning techniques have emerged as a promising way to support medical professionals in disease diagnosis and classification. In this research, the EFS-XGBoost classifier model, a robust approach for classifying patients with COVID-19, is proposed. The key innovation of the proposed model is the Ensemble-based Feature Selection (EFS) strategy, which selects relevant features from the large COVID-19 dataset. The eXtreme Gradient Boosting (XGBoost) classifier is then applied to the selected features to identify COVID-19-infected patients. The EFS methodology combines five feature selection techniques: correlation-based, chi-squared, information gain, symmetric uncertainty-based, and gain ratio. To evaluate the model, experiments were conducted on a COVID-19 dataset obtained from Kaggle, with the implementation written in Python. Performance of the proposed EFS-XGBoost model was measured with well-established classification metrics: accuracy, precision, recall, and F1-score. In addition, a comparative analysis considered the XGBoost classifier under two further scenarios: using all features in the dataset without any feature selection, and using each feature selection technique in isolation. The evaluation shows that the proposed EFS-XGBoost model performs best, achieving an accuracy of 99.8% and outperforming the other feature selection techniques considered. This research advances COVID-19 patient classification and highlights the value of ensemble-based feature selection combined with the XGBoost classifier for medical diagnosis and classification.
Keywords: COVID-19, machine learning, classification, ensemble-based feature selection, XGBoost
DOI: 10.3233/IDA-230854
Journal: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-18, 2024
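
The abstract names five feature-scoring techniques that EFS combines but does not state how their outputs are aggregated. The following is a minimal sketch under explicit assumptions, not the authors' implementation: features are discrete, "correlation-based" is read as absolute Pearson correlation with the label, and a feature is kept when at least `min_votes` of the five scorers place it in their top `top_k`. The function names, the voting rule, and the toy columns (`fever`, `cough`, `noise`) are all hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(x, y):
    """H(Y) - H(Y | X) for discrete x and y."""
    n = len(x)
    cond = 0.0
    for v, c in Counter(x).items():
        subset = [yi for xi, yi in zip(x, y) if xi == v]
        cond += (c / n) * entropy(subset)
    return entropy(y) - cond

def gain_ratio(x, y):
    """Information gain normalised by the entropy of the feature."""
    hx = entropy(x)
    return info_gain(x, y) / hx if hx > 0 else 0.0

def symmetric_uncertainty(x, y):
    """2 * IG(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    denom = entropy(x) + entropy(y)
    return 2 * info_gain(x, y) / denom if denom > 0 else 0.0

def chi_squared(x, y):
    """Pearson chi-squared statistic of the x-by-y contingency table."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    stat = 0.0
    for a in cx:
        for b in cy:
            expected = cx[a] * cy[b] / n
            stat += (cxy[(a, b)] - expected) ** 2 / expected
    return stat

def abs_correlation(x, y):
    """Absolute Pearson correlation (assumed reading of 'correlation-based')."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0

SCORERS = [abs_correlation, chi_squared, info_gain,
           symmetric_uncertainty, gain_ratio]

def ensemble_select(features, y, top_k, min_votes):
    """Keep features that at least `min_votes` scorers rank in their top_k.

    The majority-vote rule is an assumption made for this sketch.
    """
    votes = Counter()
    for scorer in SCORERS:
        scores = {name: scorer(col, y) for name, col in features.items()}
        ranked = sorted(scores, key=scores.get, reverse=True)
        votes.update(ranked[:top_k])
    return sorted(name for name, v in votes.items() if v >= min_votes)

# Toy data (hypothetical): 8 patients, binary label, three binary features.
y = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "fever": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the label exactly
    "cough": [1, 1, 1, 0, 1, 0, 0, 0],  # partially informative
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}
selected = ensemble_select(features, y, top_k=2, min_votes=3)
print(selected)  # ['cough', 'fever']
```

In a full pipeline, the selected columns would then be passed to a gradient-boosted classifier such as `xgboost.XGBClassifier` and scored with accuracy, precision, recall, and F1-score; that step is left out here so the sketch runs with the standard library alone.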