Affiliations: [a] Department of Information Technology, St. Vincent Pallotti College of Engineering & Technology, Gavsi Manapur, Wardha Road, Nagpur 440018, Maharashtra, India | [b] Department of Computer Science & Engineering, Rao Bahadur Y Mahabaleswarappa College of Engineering (RYMEC), Ballari 583104, Karnataka, India
Abstract: Imbalanced data classification (IDC) is a significant challenge in data mining (DM), as highly skewed databases arise frequently in real-world applications, with profound implications. IDC is the task of learning from data in which the number of samples differs substantially across classes. To address imbalanced big data, the Polar-CanisFel (PCF) Optimization-deep ensemble model is designed, incorporating the SMOTE technique to rebalance the dataset. The ensemble classifier combines deep convolutional neural network (DCNN), Long Short-Term Memory (LSTM), and Gated Recurrent Neural Network (GRNN) architectures for effective data classification. On the Heart Failure Prediction dataset, the model reaches an accuracy of 96.35%, a sensitivity of 94.54%, and a specificity of 96.11%. On the Stroke Prediction dataset, it obtains an accuracy of 95.91%, a sensitivity of 95.87%, and a specificity of 94.79%. Finally, on the Hepatitis-C prediction dataset, the model attains an accuracy of 92.79%, a sensitivity of 92.90%, and a specificity of 92.63% at a training percentage of 90%.
Keywords: Imbalanced data classification, SMOTE, Data balancing, Polar-CanisFel optimization, Deep ensemble classifier
DOI: 10.3233/WEB-230248
Journal: Web Intelligence, vol. Pre-press, no. Pre-press, pp. 1-22, 2024
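
Illustrative sketch: the abstract names two ingredients that are easy to show concretely, SMOTE-style rebalancing and the accuracy, sensitivity, and specificity figures it reports. The Python below is a minimal sketch of the generic technique only, not the authors' implementation; the PCF optimization and the DCNN/LSTM/GRNN ensemble are not shown. The function names `smote_oversample` and `sensitivity_specificity_accuracy`, the parameter `k=5`, and the toy labels are all assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours.

    Hypothetical helper: a bare-bones version of the SMOTE idea, assuming
    numeric feature vectors given as lists of floats.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base point (itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def sensitivity_specificity_accuracy(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumes both classes are present in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Toy minority class: four points at the corners of the unit square.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(len(smote_oversample(minority, n_synthetic=6, k=2)))

    # Toy labels: 3 positives, 5 negatives, one error of each kind.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print(sensitivity_specificity_accuracy(y_true, y_pred))
```

Each synthetic point lies on a line segment between two existing minority samples, so oversampling adds variety without leaving the region the minority class already occupies. Reporting sensitivity and specificity alongside accuracy matters on skewed data, since accuracy alone can look high when the minority class is mostly missed.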