Prediction Model for Unfavorable Outcome in Spontaneous Intracerebral Hemorrhage Based on Machine Learning
Abstract
Objective
Spontaneous intracerebral hemorrhage (ICH) remains a significant cause of mortality and morbidity throughout the world. The purpose of this retrospective study was to develop multiple models for predicting ICH outcomes using machine learning (ML).
Methods
Between January 2014 and October 2021, we included ICH patients who were diagnosed by computed tomography or magnetic resonance imaging and treated with surgery. At the 6-month check-up, outcomes were assessed using the modified Rankin Scale. Four ML models, namely Support Vector Machine (SVM), Decision Tree C5.0, Artificial Neural Network, and Logistic Regression, were used to build ICH prediction models. To evaluate the reliability of the ML models, we calculated the area under the receiver operating characteristic curve (AUC), specificity, sensitivity, accuracy, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR).
Results
We identified 71 patients who had favorable outcomes and 156 who had unfavorable outcomes. The SVM model achieved the best overall predictive performance. For the SVM model, the AUC, accuracy, specificity, sensitivity, PLR, NLR, and DOR were 0.91, 0.92, 0.92, 0.93, 11.63, 0.076, and 153.03, respectively. In the SVM model, the importance value of time to operating room (TOR) was significantly higher than that of the other variables.
Conclusion
The analysis of clinical reliability showed that the SVM model achieved the best overall predictive performance and that the importance value of TOR was significantly higher than that of the other variables.
INTRODUCTION
Throughout the world, spontaneous intracerebral hemorrhage (ICH) remains a significant cause of mortality and morbidity. Previous studies showed that optimal medical treatment, including timely removal of the hematoma, management of blood pressure and glucose, and control of intracranial pressure, can improve the outcome of ICH [6,10]. In recent years, it has been demonstrated that evacuation of the hematoma with minimally invasive methods could potentially increase microcirculation in the afflicted brain areas [7,8]. Several studies have revealed that patients who undergo surgery sooner have better outcomes. Other research has found that surgery during the acute phase is associated with a greater likelihood of postoperative rebleeding [15,19]. Consequently, the optimal timing of surgical evacuation of ICH has yet to be determined. Overall, the factors that influence the outcome of ICH remain controversial. The majority of ICH predictor studies have used univariable and multivariable logistic regression analysis, and the accuracy of these models is generally poor [5,22]. It is therefore essential to develop new models for predicting the outcome of ICH using newer techniques, such as machine learning (ML).
ML algorithms can extract patterns from large datasets with numerous variables. Because of their capacity to perform complex pattern recognition and uncover nonlinear contributions in large data sets, ML algorithms have the potential to enhance event and outcome prediction [4]. Once the outcome label is established, ML algorithms can optimize their parameters autonomously with minimal supervision. Unlike regression models, ML can handle large volumes of data and patient features while accounting for the relationships among them. As a result, ML algorithms have the potential to outperform regression models in terms of predictive accuracy.
The purpose of this retrospective study was to develop multiple models for predicting ICH outcomes using ML. We also examined the prediction accuracy of the ML models and selected the best model for clinical use.
MATERIALS AND METHODS
Subjects
Patient information was obtained from the hospital’s computerized medical records. The study was approved by the Institutional Review Board (IRB) of Qingdao Hospital, University of Health and Rehabilitation Sciences (Qingdao Municipal Hospital) (ID : 082). Because the study was retrospective, the IRB did not require informed consent.
Between January 2014 and October 2021, we included ICH patients who were diagnosed by computed tomography or magnetic resonance imaging and treated with surgery, including minimally invasive aspiration and standard craniotomy. We excluded the following cases : 1) intracerebral bleeding caused by an aneurysm, arteriovenous malformation, or tumor; 2) patients with severe end-organ failure, intracranial or systemic infection, or blood disorders; and 3) patients lost to follow-up (Fig. 1).
Data were collected on age, sex, body mass index (BMI), systolic and diastolic blood pressure, midline shift, operation type and time, residual hematoma volume, and time to operating room, as well as premorbid chronic conditions (diabetes, hypertension, hyperlipidemia).
Surgery and follow-up
Every patient was treated according to the recommendations of the American Heart Association and American Stroke Association published in Stroke in 2015 [20]. The on-call neurosurgeons at our hospital chose between a standard craniotomy and minimally invasive aspiration with neuronavigation (XPS Nexus; Medtronic, Minneapolis, MN, USA) depending on the patient’s condition. Every patient was followed up for at least 6 months. At the 6-month check-up, outcomes were assessed using the modified Rankin Scale (mRS). We used the mRS score to categorize patient outcomes, with mRS 0–2 representing favorable outcomes and mRS 3–5 representing unfavorable outcomes.
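The dichotomization of the mRS described above can be sketched in a few lines of Python. This is an illustration only: the function name `dichotomize_mrs` is hypothetical, and rejecting scores outside 0–5 is an assumption, since the text defines only the two ranges 0–2 and 3–5.

```python
def dichotomize_mrs(mrs):
    """Map a 6-month mRS score to the outcome label used in the text.

    mRS 0-2 -> "favorable", mRS 3-5 -> "unfavorable".
    Scores outside 0-5 are rejected (an assumption of this sketch,
    because the text does not say how other values were handled).
    """
    if mrs not in range(0, 6):
        raise ValueError(f"mRS score outside the 0-5 range: {mrs!r}")
    return "favorable" if mrs <= 2 else "unfavorable"


# Hypothetical scores for five patients, not study data.
example_scores = [0, 2, 3, 4, 5]
example_labels = [dichotomize_mrs(s) for s in example_scores]
```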
ML
In this study, four ML models, namely Support Vector Machine (SVM), Decision Tree C5.0, Artificial Neural Network (ANN), and Logistic Regression (LR), were used to build ICH prediction models. We evaluated each model’s performance using the area under the receiver operating characteristic curve (AUC). The ML analysis was performed with IBM SPSS Modeler 14.1 (IBM Corp., Armonk, NY, USA). SVM is a classifier that uses kernels to map input data into a higher-dimensional space, in which a hyperplane separates the two groups. Decision trees predict class membership by deriving decision rules from the training data. An ANN is a massively parallel, interconnected network made up of simple adaptive units. LR is a regression model for a binary dependent variable (Fig. 2).
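To make concrete what the AUC measures, the following minimal Python sketch computes it as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. It is not the SPSS Modeler procedure used in the study; the function name and the toy labels and scores are hypothetical.

```python
def auc_from_scores(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model outputs, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data, not study data: three positive and two negative cases.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.4, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.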
Statistical analysis
Categorical data were expressed as percentages and continuous data as mean with standard deviation. We used the t test or the Mann-Whitney U test to compare continuous variables between the groups. Fisher’s exact test or the chi-square test was applied to categorical variables. To evaluate the reliability of the ML models, we calculated the AUC, specificity, sensitivity, accuracy, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR). A p-value of less than 0.05 was regarded as statistically significant. The statistical analyses were performed with IBM SPSS Statistics 26 (IBM Corp.).
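The threshold-based metrics listed above follow from the four cells of a confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical, and which outcome is treated as the positive class is left to the caller, since the text does not state it.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy metrics from confusion-matrix counts.

    tp/fn: positive cases predicted correctly/incorrectly.
    tn/fp: negative cases predicted correctly/incorrectly.
    Raises ZeroDivisionError when specificity is 1 (PLR undefined) or
    when sensitivity is 1 (NLR is 0, so DOR is undefined).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "PLR": plr,
        "NLR": nlr,
        "DOR": dor,
    }


# Hypothetical counts, not study data.
example_metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

The DOR can equivalently be written as (tp × tn) / (fp × fn), which is a convenient cross-check on the likelihood ratios.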
RESULTS
Characteristics of the population
We identified 71 patients who had favorable outcomes and 156 who had unfavorable outcomes (Fig. 1). Between the two groups, there were no significant differences in terms of age, hypertension, sex, BMI, hyperlipidemia, hematoma side and location, residual hematoma volume, operation time, or operation type (Table 1).
However, the two groups differed significantly in the percentage of patients with diabetes, systolic blood pressure (SBP), diastolic blood pressure (DBP), Glasgow Coma Scale (GCS) score, midline shift, and hematoma volume. The details are shown in Table 1. When the time to operating room (TOR) was treated as a continuous variable, there was no significant difference between the two groups. We then divided the cases into three categories : 1) <12 hours, 2) ≥12 and ≤36 hours, and 3) >36 hours. Patients who had surgery between 12 and 36 hours were more likely to have favorable outcomes than those in the other groups (p<0.001) (Table 1).
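The three TOR categories can be expressed as a small binning function. The sketch below is illustrative: the function name and category labels are hypothetical, and measuring TOR in hours as a non-negative number is an assumption.

```python
def tor_category(hours):
    """Assign a time-to-operating-room value (in hours) to one of the
    three categories described in the text:
    <12 hours, >=12 and <=36 hours, >36 hours.
    """
    if hours < 0:
        raise ValueError("TOR cannot be negative")
    if hours < 12:
        return "<12 h"
    if hours <= 36:
        return "12-36 h"
    return ">36 h"


# Hypothetical TOR values in hours, not study data.
example_tor = [3.5, 12, 21, 36, 48]
example_bins = [tor_category(h) for h in example_tor]
```

Both boundary values, 12 and 36 hours, fall in the middle category, matching the ≥12 and ≤36 definition.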
ML models
We established four ML models. For the SVM model, the AUC, accuracy, specificity, sensitivity, PLR, NLR, and DOR were 0.91, 0.92, 0.92, 0.93, 11.63, 0.076, and 153.03, respectively. For the Decision tree C5.0 model, the AUC, accuracy, specificity, sensitivity, PLR, NLR, and DOR were 0.96, 0.87, 0.88, 0.87, 7.25, 0.148, and 48.99, respectively. For the ANN model, the AUC, accuracy, specificity, sensitivity, PLR, NLR, and DOR were 0.91, 0.91, 0.82, 0.97, 5.39, 0.037, and 145.68, respectively. For the LR model, the AUC, accuracy, specificity, sensitivity, PLR, NLR, and DOR were 0.77, 0.81, 0.73, 0.85, 3.15, 0.205, and 15.37, respectively. These results showed that the SVM model achieved the best overall predictive performance (Fig. 3 and Table 2).
We also calculated the variable importance values in the four models. For the SVM and LR models, the importance value of TOR was significantly higher than that of the other variables. For the Decision tree C5.0 model, the order of importance was midline shift, GCS, SBP, and TOR. For the ANN model, the order of importance was midline shift, TOR, DBP, and SBP. Moreover, the importance values of these variables in the Decision tree C5.0 and ANN models were similar (Fig. 4).
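As a worked check, the likelihood ratios and DOR reported for the SVM model can be reproduced from its sensitivity (0.93) and specificity (0.92), assuming the standard definitions of these quantities (the text does not state the formulas):

```latex
\begin{align*}
\mathrm{PLR} &= \frac{\text{sensitivity}}{1-\text{specificity}}
              = \frac{0.93}{1-0.92} \approx 11.63,\\
\mathrm{NLR} &= \frac{1-\text{sensitivity}}{\text{specificity}}
              = \frac{1-0.93}{0.92} \approx 0.076,\\
\mathrm{DOR} &= \frac{\mathrm{PLR}}{\mathrm{NLR}}
              \approx \frac{11.63}{0.076} \approx 153.03.
\end{align*}
```

The reported DOR matches the ratio of the rounded PLR and NLR. Using unrounded values gives a slightly smaller figure of about 152.8.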
DISCUSSION
In this study, we established four ML models : the SVM, Decision tree C5.0, ANN, and LR models. The analysis of clinical reliability showed that the SVM model achieved the best overall predictive performance. For the SVM and LR models, the importance value of TOR was significantly higher than that of the other variables. In contrast, the importance values of the variables in the Decision tree C5.0 and ANN models were similar.
Spontaneous ICH is a form of stroke that is associated with high rates of death and disability. Regardless of therapy, physicians, families, and patients are most concerned with prognosis. Previous research has linked a variety of characteristics to a poorer prognosis following ICH, including advanced age, a lower GCS score, a greater hematoma volume, and infratentorial hemorrhage [18,24]. Cheung and colleagues demonstrated that in a sample of 142 patients, the ICH score, comprising the GCS, age, and ICH volume, helped to predict favorable prognosis with a sensitivity of 93.5% [3]. In 310 patients, Ruiz-Sandoval et al. [21] published an ICH scoring system with a sensitivity of 70.0% for an excellent outcome at 30 days. On the whole, previously described prediction approaches concentrated on short-term prognosis and had poor accuracy. In our study, the SVM model achieved the best overall predictive performance, and the importance value of TOR was significantly higher than that of the other variables. However, the timing of surgery for ICH is still debated. A previous study defined a TOR of <12 hours as the early time window [12]. Wang et al. [25] investigated ultra-early (<7 hours), early (7–24 hours), and delayed (>24 hours) surgery in spontaneous ICH and found that surgery conducted within the 7–24 hours window resulted in the best outcome. Early surgery was linked to a greater risk of rebleeding, whereas delayed surgery was linked to severe consequences [11]. Moreover, in our previous paper, we found that a TOR of 21 hours was the turning point of outcome and that the best outcome was obtained with surgery at 12–36 hours [14]. Consequently, TOR was divided into three categories : 1) <12 hours, 2) ≥12 and ≤36 hours, and 3) >36 hours. Our results were consistent with previous studies.
ML techniques have been employed in medical research and frequently outperform traditional statistical models. ML has recently been applied to severity or outcome prediction models for neurological illnesses such as ischemic stroke and traumatic brain injury [2,13,23]. In aneurysmal subarachnoid hemorrhage, Maldaner et al. [16] found that ML models exhibit strong discrimination and calibration for a favorable functional outcome. Using decision tree analysis, Heschl et al. [9] found that patients with a high GCS score, a hematoma volume of less than 44.5 mL, and a relatively low premorbid mRS score have a changeable prognosis. However, the use of ML to predict outcomes following ICH is still uncommon. The majority of research looking for ICH predictors has used univariable and multivariable LR analysis. In general, the accuracy of these regression models is poor. In our study, the SVM model achieved the best overall predictive performance, which was better than that reported in previous studies. SVM is a promising supervised ML method that can learn from observed data sets, build complex models to capture the inherent relationships between input and output variables, and make data-driven predictions or decisions. It has been used successfully in a variety of biomedical fields and has performed well [1,17,26]. The application of the SVM approach to predict the outcome of ICH is a novel aspect of our study. Our model achieved a sensitivity of 0.93, specificity of 0.92, and AUC of 0.91. This indicated that our model performed well in predicting the outcome of ICH and could be widely used in clinical practice.
There were several limitations to the study. First, this was a retrospective study conducted at a single center. Large clinical data sets and multicenter validation are necessary to improve the model’s performance and to make precise comparisons among ML-based models. Furthermore, our study did not seek to pinpoint the precise cause of a particular good or bad outcome. Our approach cannot replace clinical judgment, and outcome prediction based on long-term clinical experience may outperform even the best algorithm. In the future, more complex ML approaches will be applied to improve the AUC.
CONCLUSION
The analysis of clinical reliability showed that the SVM model achieved the best overall predictive performance among the four models. For the SVM model, the importance value of TOR was significantly higher than that of the other variables. Using ML techniques, we may be able to identify factors that have previously been neglected and to develop new treatment methods that improve outcomes according to these factors.
Notes
Conflicts of interest
No potential conflict of interest relevant to this article was reported.
Informed consent
This type of study does not require informed consent.
Author contributions
Conceptualization : WW; Data curation : JZ; Formal analysis : XH, YW; Funding acquisition : ML, TL; Methodology : SL; Project administration : ZX; Visualization : FC; Writing - original draft : YZ; Writing - review & editing : ML
Data sharing
None
Preprint
None
Acknowledgements
This work was supported by the National Natural Science Foundation of China (grants #82001184 and #82001253).