Optimizing Mortality Prediction in Cardiac Patients Using Genetic Algorithm and Random Forest with Class Imbalance Handling
Abstract
This study presents a decision support system, based on data mining and machine learning techniques, for accurately predicting mortality in cardiac patients. Because hospital resources are limited, allocating them appropriately can improve the survival of cardiac patients. Data mining techniques are widely used by researchers to uncover hidden information and patterns that could potentially save or prolong patients' lives. The variables considered in this study include age, gender, high blood pressure, cholesterol, and irregular heart rate. The study uses electronic health record (EHR) data comprising 368 observations with 55 unique features. The proposed GA_RF model uses a genetic algorithm (GA) to select important features from the dataset and a Random Forest (RF) classifier to predict mortality in cardiac patients. The RF hyperparameters were optimized using grid search to improve classification performance. A public dataset was obtained to evaluate the efficacy of the GA_RF model. One problem encountered during this study was class imbalance in the dataset: machine learning models tend to be biased toward the majority class. To overcome this problem, the Random Under Sampling (RUS) method was employed. The GA_RF model was evaluated on several metrics, and the results support its effectiveness for mortality prediction in cardiac patients.
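The pipeline described in the abstract (under-sampling, GA feature selection, then a grid-searched Random Forest) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic (generated to match only the 368 x 55 shape mentioned above), and the helper names `random_under_sample`, `fitness`, and `ga_select`, the GA settings (population size, generations, mutation rate, truncation selection, single-point crossover), the hyperparameter grid, the balanced-accuracy metric, and the choice to under-sample only the training split are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split


def random_under_sample(X, y, rng):
    """Randomly drop majority-class rows until all classes have equal counts."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]


def fitness(mask, X, y):
    """Cross-validated balanced accuracy of a small RF on the masked features."""
    if not mask.any():
        return 0.0
    clf = RandomForestClassifier(n_estimators=25, random_state=0)
    scores = cross_val_score(clf, X[:, mask], y, cv=3, scoring="balanced_accuracy")
    return scores.mean()


def ga_select(X, y, rng, pop_size=8, generations=4, mutation_rate=0.05):
    """Tiny genetic algorithm over boolean feature masks (illustrative settings)."""
    n_features = X.shape[1]
    pop = rng.random((pop_size, n_features)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]  # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), size=2, replace=False)]
            cut = rng.integers(1, n_features)  # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_features) < mutation_rate  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[int(np.argmax(scores))]


rng = np.random.default_rng(0)

# Synthetic, imbalanced stand-in for the EHR data (not the study's dataset).
X, y = make_classification(n_samples=368, n_features=55, n_informative=8,
                           weights=[0.85, 0.15], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Balance only the training split so the test split keeps its original ratio.
X_bal, y_bal = random_under_sample(X_train, y_train, rng)

# GA picks a feature subset; grid search then tunes the final RF on that subset.
mask = ga_select(X_bal, y_bal, rng)
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3, scoring="balanced_accuracy")
grid.fit(X_bal[:, mask], y_bal)

y_pred = grid.predict(X_test[:, mask])
test_score = balanced_accuracy_score(y_test, y_pred)
print("selected features:", int(mask.sum()), "of", mask.size)
print("best params:", grid.best_params_)
print("balanced accuracy on held-out split:", round(test_score, 3))
```

In this sketch the fitness function reuses a small Random Forest so that the GA favors feature subsets the final classifier can exploit; a real study would likely use a larger population, more generations, and the evaluation metrics reported in the paper.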
Published
Versions
- 2024-11-26 (2)
- 2024-11-26 (1)