Machine Learning Modeling of Disease Treatment Default: A Comparative Analysis of Classification Models

Hindawi
Advances in Public Health
Volume 2023, Article ID 4168770, 11 pages
https://doi.org/10.1155/2023/4168770

Research Article

Michael Owusu-Adjei, James Ben Hayfron-Acquah, Frimpong Twum, and Gaddafi Abdul-Salaam

Department of Computer Science, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana

Correspondence should be addressed to Michael Owusu-Adjei; mowusuadjei@st.knust.edu.gh

Received 14 November 2022; Revised 21 December 2022; Accepted 23 December 2022; Published 3 January 2023

Academic Editor: Chandrabose Selvaraj

Copyright © 2023 Michael Owusu-Adjei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Generally, treatment default of diseases by patients is regarded as the biggest threat to favourable disease treatment outcomes. It is seen as the reason for the resurgence of infectious diseases including tuberculosis in some developing countries. Sadly, its occurrence in chronic disease management is associated with high morbidity and mortality rates. Many reasons have been adduced for this phenomenon. Exploration of treatment default using biographic and behavioral metrics collected from patients and healthcare providers remains a challenge. The focus on contextual nonbiomedical measurements using a supervised machine learning modeling technique is aimed at creating an understanding of the reasons why treatment default occurs, including identifying important contextual parameters that contribute to treatment default.
The predicted accuracy scores of four supervised machine learning algorithms, namely, gradient boosting, logistic regression, random forest, and support vector machine, were 0.87, 0.90, 0.81, and 0.77, respectively. Additionally, the positive predicted value scores for the four models ranged between 98.72% and 98.87%, and the negative predicted values of gradient boosting, logistic regression, random forest, and support vector machine were 50%, 75%, 22.22%, and 50%, respectively. Logistic regression had the highest negative predicted value score of 75%, with the smallest error margin of 25% and the highest accuracy score of 0.90, and the random forest had the lowest negative predicted value score of 22.22%, registering the highest error margin of 77.78%. By performing a chi-square correlation statistic test of variable independence, this study suggests that age, presence of comorbidities, concern for long queuing/waiting time at treatment facilities, availability of qualified clinicians, and the patient's nutritional state (whether on a controlled diet or not) are likely to affect adherence to disease treatment and could result in an increased risk of default.

1. Introduction

Statistical data compiled [1] from all the regional and district hospitals except the two leading teaching hospitals (Korle-Bu Teaching Hospital and Komfo Anokye Teaching Hospital) in Ghana show a depressing outlook on the incidence of hypertension and other cardiovascular diseases that has resulted in increased patient mortality and morbidity rates with devastating consequences. This has been described as a consequence of patient nonadherence/default or noncompliance to prescribed treatment appointments. Hypertension and associated cardiovascular diseases were listed as the leading cause of deaths among the population in 2017 according to the report in [1] and ranked third as the biggest cause of patient admission in hospitals countrywide. These statistical descriptions are also confirmed in [2], which attributes 13 percent of total global deaths to the incidence of hypertension and estimates it to be the largest risk factor for deaths globally.

Cardiovascular diseases including hypertension are chronic diseases that require long-term treatment management. Recent reports from healthcare personnel [1, 2] confirming increasing mortality rates among suffering patients require a critical look at treatment adherence together with default risks by considering both biographic and contextual behavioral variables among patients and healthcare givers.

The importance of behavioral analysis in healthcare is underscored by the authors of [3], whose work affirms that patient behaviors are important considerations for disease treatment including the determination of disease causality. Further studies to determine the impact of behavior on patients stipulate that performance behavior of patients is aimed at preventing disease occurrence, detecting onset of diseases, and improving disease treatment outcomes, but admittedly, the anticipated outcomes are also influenced by the behavior of healthcare professionals [4]. Their definition of behavior by healthcare professionals emphasizes patient needs, that is, behavior that takes into account the needs of patients [4]. Additionally, studies regarding healthcare provider behavior and its impact on patients, healthcare working environments, colleague workers, etc., are emphasized by the author of [5], with a tacit admission that there are negative outcomes of disrespectful behavior, some of which cause recipients to experience fear and feel isolated, among many others. The impact of negative consequences of these behavioral traits by healthcare providers on patient treatment outcomes shows that disease treatment default by patients can be established in context. This includes the consideration of contextual variables about both parties (healthcare providers and patients). Applying predictive algorithms with the capability of showing hidden patterns of information can assist in this endeavor.

Predictive algorithms have become powerful instruments for businesses (big and small) as a competitive tool. In mortgage financial decisions, including assessing payment behavior of homeowners [6], predictive analytics have been employed to study and understand the behavior of homeowners. The other use of predictive algorithms is providing insights into future human behavior based on present or available information [7]. In the healthcare industry, predictive algorithms have been used in the area of classification problems such as predicting the patient's nonattendance to a scheduled appointment clinic [8]. Further use of predictive algorithms is associated with early detection and diagnoses of diseases for preventive medical care due to increased medical treatment costs together with increasing morbidity and mortality rates [9]. Advances in technology, coupled with increased production of data within healthcare systems, have heightened research interests in healthcare applications for knowledge discovery and insights into patterns of change [10]. Predictive machine learning algorithms have therefore become useful in providing user-centered explanations of the factors that lead to an increased risk of adverse outcomes in healthcare settings [11].

There have been several predictive studies of disease diagnosis and detection on hypertension and other cardiovascular diseases predominantly using contextual variables such as biomedical and biographic metrics. Behavior is mentioned among the five broad determinants of health [12]. However, studies requiring the use of behavior together with either biographic or biomedical metrics for disease treatment in this regard have been limited. This research therefore explores contextual biographic and behavioral traits of both patients and healthcare providers towards disease treatment default using empirical data from a real context with reliance on nonbiomedical variable metrics.

2. Related Work

A comparative analysis of different machine learning modeling techniques within the healthcare delivery system has been examined in various studies and related research works. This section discusses prediction accuracy scores of machine learning techniques used within the context of healthcare practice to highlight the significance of machine learning types in classification-based problem domains with biomedical measurements. A study [13] to predict diabetes disease with biomedical and biographic variables such as age, education, systolic and diastolic blood pressure, body mass index, direct and cholesterol, using learning algorithms, namely, naive Bayes, decision tree, AdaBoost, and random forest, showed an AUC score of 0.95 for logistic regression and random forest. This study therefore concludes that the combination of logistic regression with the random forest showed superior performance and could be used in predicting diabetic patients.

Similar predictive studies [14] with the disease prediction framework using machine learning techniques for diabetes healthcare with a combination of biographic and biomedical data such as age, blood pressure, glucose, insulin, and body mass index with algorithms such as K-nearest neighbor (KNN), support vector machine (SVM), logistic regression, and random forest concluded that logistic regression had superior performance with a receiver operating characteristic curve (ROC) score of 86%. Similarly, an analysis of decision trees for diabetes prediction using both biomedical and biographic variables such as gender, plasma, insulin, glucose, body mass index, and blood pressure with decision tree algorithms such as leaf area density (LAD), naïve Bayes, and genetic j48 also showed predicted accuracy scores of 88%, 92%, and 95.8%, respectively [15].

Further studies on detailed analysis of kidney and heart disease prediction with machine learning using two distinct datasets consisting of both biomedical and biographic variables such as age, blood pressure, specific gravity, sugar, albumin, red blood cells, pus cells, pus cell clumps, bacteria, blood glucose level, blood urea, sex, education, current smoking status, cigarettes smoked per day, BP medicines, and prevalent smoking also showed accuracy scores of 100% for chronic kidney disease and 85% for heart disease. Machine learning algorithms such as logistic regression, K-nearest neighbor, naïve Bayes, support vector machine, and random forest [16] also produced accuracy scores of 100% for chronic kidney disease and 85% for heart disease based on a support value of 750 for all the algorithms.

Further studies for heart disease classification by the authors of [17], which aim to explore feature selections with a combination of chi-squared feature selection methods and the Bayes net algorithm, achieved an accuracy score of 85.00% using both biomedical and biographic variables such as age, sex, measuring values of fasting blood sugar, resting blood pressure, serum cholesterol, resting electrocardiographic reports, maximum heart rates, and number of major vessels colored by fluoroscopy. A personalized modeling and prediction approach with internet-connected smart devices for data generation [18] used weighted voting logistic regression and random forest machine learning techniques for type 2 diabetes prediction and produced a model accuracy score of 0.884 with both biomedical and biographic metrics such as age, gender, body mass index, cholesterol level, marital status, employment status, and income level.

Further predictive studies on the presence or absence of heart disease using techniques such as the backpropagation multilayer perceptron [19] showed an improved model performance accuracy score of 96.30% with features such as age, sex, fasting blood sugar, and resting blood sugar. Using several machine learning techniques for supervised and unsupervised learning in diabetes research predictions [10], it was concluded that the use of support vector machines proved to be the most successful while using clinical datasets with features extracted from demographics, diagnoses, disease comorbidities and symptoms, medications, laboratory measurements, and other procedures. Furthermore, a predictive analysis of diabetic complications [20] using naive Bayes tree, C4.5 decision tree-based classification, and k-means clustering techniques with features such as age, gender, body mass index, family history of diabetes, blood pressure, duration of onset, and blood glucose level showed an overall model accuracy of 68%. A similar study to identify correlated variables such as demographic, clinical, and healthcare resource utilization variables for the diagnosis of diabetic peripheral neuropathy using the random forest algorithm/technique achieved an ROC model performance of 0.824 and a model accuracy of 89.6% with a 95% confidence interval [21]. A similar study with variables such as the number of prior convictions, age, type of index offence, diversity of criminal history, and substance abuse to predict general criminal recidivism in mentally disordered offenders using the random forest technique also produced an AUC score of 0.90 [22]. In diagnosing heart disease for diabetic patients using variables such as age, sex, blood pressure, and blood sugar in predicting the chances of a diabetic patient developing heart disease, naïve Bayes and support vector machines showed significant prediction accuracy [23]. Alternatively, a study [24] to identify the optimal model that predicts HBsAg seroclearance of patients suffering from chronic hepatitis B with selected variables such as age, gender, family history, body mass index, and drinking history used four machine learning algorithms, namely, extreme gradient boosting (XGBoost), random forest (RF), decision tree (DT), and logistic regression (LR) and identified XGBoost to show superior model performance in identifying predictive variable importance with an AUC score of 89.1%.

In the studies discussed above and many others, references are made to predicted accuracy scores (AUC) of varying percentages from various machine learning algorithms using biomedical and biographic variables. The use of biographic and behavioral metrics is noted in [22] for the prediction of criminal recidivism with a sampled population of 365. The other distinct feature of the observed related works is the use of sourced datasets from studies conducted elsewhere. The uniqueness of this study is the use of empirical data collected from a real-world context by assessing entities involved in disease treatment management for the prediction of treatment default.

3. Research Contribution

One of the accomplishments in this research work is the use of biographic and behavioral metrics of both patients and healthcare providers to predict disease treatment default of patients suffering from hypertension with and without comorbidities.

3.1. Research Hypothesis

3.1.1. Null Hypothesis H0. No relationship exists between disease treatment default and patient gender.

3.1.2. Alternative Hypothesis H1. A relationship exists between disease treatment default and patient gender.

A comparative analysis of four machine learning models, namely, logistic regression, gradient boosting classifier, support vector machine (SVM), and random forest classifier, is examined to determine the best predicted values such as the true positive rate (TPR), false positive rate (FPR), positive predicted value (PPV), and negative predicted value (NPV) of each algorithm for performance evaluation.

This is accomplished in the listed steps as follows:

Step 1. Variable independence using the chi-square correlation statistic is determined to establish the relationship between gender and the output class.

Step 2. Optimal threshold performance is determined with the application of threshold optimization using the area under the receiver operating characteristic curve.

Step 3. The predicted accuracy scores of the true positive rate (TPR), false positive rate (FPR), positive predicted value (PPV), and negative predicted value (NPV) have been demonstrated using the four models.

Step 4. The predicted scores such as the true positive rate (TPR), false positive rate (FPR), positive predicted value (PPV), and negative predicted value (NPV) have been comparatively evaluated to estimate how well these models would perform in subsequent predictions.

Comparative performance evaluation determines which machine learning algorithm performs better given the required dataset format.

4. Materials and Methods

The electronic medical record dataset of patients was obtained through an institutional request for permission as per the request reference number DCS/S.1/VOL.1 from Kwahu Government Hospital (a district healthcare facility in the eastern region of Ghana).
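Step 1 of the procedure in Section 3.1 tests whether an input variable and the treatment outcome are independent. As a minimal sketch of that idea, and not the authors' code, the block below computes a Pearson chi-square statistic and p value for a 2x2 contingency table using only the Python standard library. The function name chi_square_2x2, the counts, and the 0.05 significance level are assumptions made for illustration; the counts are invented and are not the study's data, and variables with more than two categories would need the general chi-square distribution rather than the 1-degree-of-freedom shortcut used here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts (NOT the study's data): rows are two gender groups,
# columns are (defaulted, not defaulted).
hypothetical_table = [[12, 188], [30, 770]]
ALPHA = 0.05  # assumed significance level

statistic, p_value = chi_square_2x2(hypothetical_table)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"chi-square = {statistic:.4f}, p = {p_value:.4f} -> {decision}")
```

A p value below the chosen significance level leads to rejecting the null hypothesis of independence; otherwise the test fails to reject it.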
A total of 5,333 patients were identified as suffering from hypertension, and some of them were also suffering from both hypertension and other diseases (comorbidities). As a measure by the institution to protect the privacy and confidentiality of the patients involved, individual patient names and locations were omitted from the records. The features used were selected from clinical notes, and biographic data were obtained.

The classification-based machine learning algorithms used are gradient boosting classifier, logistic regression, support vector machine, and random forest. All software developments were performed using Python version 3.10.5 and its packages for all data processing steps. The results obtained from data modeling are presented in the subsequent section.

4.1. Results. The sample size was 5,333 patients, which consisted of 4,312 females constituting 80.86% of the total sample population and 1,021 males constituting 19.14% of the total sample population. The statistical description of age bracket and distribution density among those sampled is illustrated in Figures 1 and 2, respectively.

The highest age bracket distribution from the sampled data according to Figure 1 is found among the ages between 66 and 76, and the least age bracket distribution is found between 110 and 120. Age distribution density is higher between ages 44 and 54 up to 88 and 98. Gender distribution is displayed in Figure 3.

Figure 1: Age bracket distribution (frequency by age bracket, 11-21 through 121-131).

Figure 2: Age distribution density histogram (density by age).

Figure 3: Frequency distribution by gender (female, male).

Basic categories of the sampled data were patients diagnosed with hypertension only and those who had been diagnosed with hypertension and other cardiovascular diseases such as diabetes. Figure 4 shows the frequency distribution of patients with hypertension only (indicated by blue color) and hypertension with comorbidities (indicated by red color). The female patient population accounted for the highest number of patients with hypertension only and hypertension with comorbidities, as illustrated in Figure 4, against their male counterparts.

Figure 4: Comorbidity distribution among sampled gender (count by gender, split by comorbidity status yes/no).

In Figure 3, the frequency distribution among gender indicates a higher proportion of females with fewer males. Representation of output class distribution in Figure 5 shows that negative class patients identified as nondefaulters were far more than those identified as treatment defaulters.

Figure 5: Treatment output class distribution (frequency by output class, 1 and 0).

For the new patient, the onset of treatment begins with signs and symptoms of discomfort, whereas for an existing patient undergoing treatment, it begins with a visit as a result of a scheduled appointment or realization of signs and symptoms of discomfort. Figure 6 shows the disease treatment process for both new and existing patients suffering from hypertension and other forms of cardiovascular disease. In Figure 6, the rectangles indicate processes and subprocesses, the arrows show task routes, the rhombuses show decision boundaries, and the oval represents the start or end of processes. Deepened processes with texts show data collection points, but for this research work, no measuring biomedical item was collected. These points have been illustrated for emphasis on collected behavioral and biographic variables used. Certain processes had been highlighted for emphasis about the behavior of healthcare professionals, especially on delivery of service.

Figure 7 shows a flowchart diagram illustrating various stages and processes relevant to building machine learning algorithms for prediction. It includes subprocesses, decision points, and task evaluation points, as well as arrow pointers for task routes. There are processes that depend on other processes to begin, creating a dependent rule. In Figure 7, the dependent process can be seen in the data processing stage where many subprocesses are defined before the next task, which is splitting of the dataset. At certain stages of the process, decisions to determine the next course of action are made. In machine learning prediction processes, there is a beginning and an ending of all processes. A green oval shows the beginning of processes, whereas a red oval shows the ending of processes.

In order to establish variable relationships for the output variable, the chi-square correlation statistic test was performed on all the input variables. Those with statistical significance are selected and described as follows.

4.1.1. Chi-Square Correlation Statistic Score

On controlled diet:
p value = 4.577598097458345e-32
Chi-square statistic value = 153.02396213387735
Dependent (reject H0)

Availability of a physician:
p value = 2.8205192634562076e-55
Chi-square statistic value = 260.9682907753015
Dependent (reject H0)

Concern for long waiting/queuing time:
p value = 1.239782911843141e-25
Chi-square statistic value = 122.96922267902593
Dependent (reject H0)

Hypertension with comorbidities:
p value = 2.4832105751483715e-54
Chi-square statistic value = 256.5842340588116
Dependent (reject H0)

Patient's age:
p value = 4.4969165821531835e-05
Chi-square statistic value = 259.20561138258796
Dependent (reject H0)

Gender:
p value = 0.4911070922909865
Chi-square statistic value = 3.4137909069569465
Independent (fail to reject H0)

The chi-square correlation statistic test was performed on the input variables described above, which were selected for their significance. It shows p values and the correlation status with the dependent variable. Being on a controlled diet, physician availability to examine patients, concern for longer waiting or queuing time at healthcare facilities, hypertension with comorbidities, and the patient's age are estimated to influence the dependent variable, as the null hypothesis is rejected for each of them.

Figure 6: Flowchart diagram of the disease treatment process.

Figure 7: A flowchart diagram of a model building process. The rectangles show processes, the arrows indicate task routes, the rhombuses show decision points, and the circles depict the start or end of processes (circular green, start; circular red, end).

A comparative display of the model confusion matrices obtained is shown. Figure 8 shows a collection of the various confusion matrices obtained from the four machine learning models. Each matrix has four sections indicating specific values. The sections of each confusion matrix are named as follows: the top left section (true positive, TP), the yellow shaded section on the bottom right (true negative, TN), the upper right corner (false negative, FN), and the bottom left corner (false positive, FP). In Figure 8(a), gradient boosting correctly classified 1313 patients as nondefaulters and misclassified 15 default patients as nondefaulters and 3 nondefault patients as defaulters. In Figure 8(b), logistic regression correctly classified 1315 patients as nondefaulters and misclassified 15 default patients as nondefaulters and 1 nondefault patient as a defaulter. In Figure 8(c), the random forest correctly classified 1309 nondefault patients and misclassified 16 default patients as nondefaulters and 7 nondefault patients as defaulters. Finally, in Figure 8(d), the support vector machine correctly classified 1315 nondefault patients and misclassified 17 default patients as nondefaulters and 1 nondefault patient as a defaulter.

Figure 8: Model confusion matrices for (a) gradient boosting, (b) logistic regression, (c) random forest, and (d) support vector machine. Rows are true labels and columns are predicted labels.

Model                        True label      Predicted Defaulted   Predicted Not Defaulted
(a) Gradient boosting        Defaulted       3                     15
                             Not Defaulted   3                     1313
(b) Logistic regression      Defaulted       3                     15
                             Not Defaulted   1                     1315
(c) Random forest            Defaulted       2                     16
                             Not Defaulted   7                     1309
(d) Support vector machine   Defaulted       1                     17
                             Not Defaulted   1                     1315

The predicted probability score rates in percentages are illustrated in Table 1. Specificity and sensitivity both describe test accuracies for the predicted output.

Table 1: Specificity and sensitivity test scores.

Model                    FNR (%)   TNR (%)   FPR (%)   PPV (%)   NPV (%)   TPR (%)
Gradient boosting        0.23      16.67     83.33     98.87     50.0      99.77
Logistic regression      0.08      16.67     83.33     98.87     75.0      99.92
Random forest            0.53      11.11     88.89     98.79     22.22     99.47
Support vector machine   0.08      5.56      94.44     98.72     50.0      99.92

The highlighted scores tell us how much trust to put into a test result, that is, whether a negative test result can be truly trusted or not. Negative predicted value is the ratio of patients truly predicted as defaulters to all patients diagnosed as defaulters. It is a probability estimate for a nondefault status if described as a default patient. A high percentage value means that the probability of a prediction miss is lower (minimal error). A lower percentage means a very high probability of a prediction miss.

Table 2: Precision, recall, and F1 score weighted macro average scores (classification report).

Model                    Precision   Recall   F1 score
Gradient boosting        0.98        0.99     0.98
Logistic regression      0.98        0.99     0.98
Random forest            0.98        0.98     0.98
Support vector machine   0.98        0.99     0.98
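The rates in Table 1 can be recomputed from the confusion-matrix counts in Figure 8. The sketch below is illustrative and is not the authors' code. It assumes that "Not Defaulted" is treated as the positive class, because that is the reading under which the Figure 8 counts reproduce the Table 1 values; the names FIGURE_8_COUNTS and rates are hypothetical, and the logistic regression "Defaulted" row is read as 3 and 15.

```python
# Counts read from Figure 8 in the order (tn, fp, fn, tp), taking
# "Not Defaulted" as the positive class (an assumption, see lead-in):
#   tn = true Defaulted predicted Defaulted
#   fp = true Defaulted predicted Not Defaulted
#   fn = true Not Defaulted predicted Defaulted
#   tp = true Not Defaulted predicted Not Defaulted
FIGURE_8_COUNTS = {
    "Gradient boosting": (3, 15, 3, 1313),
    "Logistic regression": (3, 15, 1, 1315),
    "Random forest": (2, 16, 7, 1309),
    "Support vector machine": (1, 17, 1, 1315),
}


def rates(counts):
    """Return Table 1 style rates, in percent, from (tn, fp, fn, tp)."""
    tn, fp, fn, tp = counts
    return {
        "FNR": 100 * fn / (fn + tp),  # actual positives that were missed
        "TNR": 100 * tn / (tn + fp),  # actual negatives correctly identified
        "FPR": 100 * fp / (fp + tn),  # actual negatives flagged as positive
        "PPV": 100 * tp / (tp + fp),  # positive predictions that are correct
        "NPV": 100 * tn / (tn + fn),  # negative predictions that are correct
        "TPR": 100 * tp / (tp + fn),  # actual positives correctly identified
    }


table_1 = {model: rates(counts) for model, counts in FIGURE_8_COUNTS.items()}
for model, row in table_1.items():
    print(model, {name: round(value, 2) for name, value in row.items()})
```

Under this reading, each model's test split contains 18 defaulters and 1316 nondefaulters, which is why the NPV column is so sensitive to a handful of predictions.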
How well a model is able to identify true positives in a diagnostic test is referred to as its sensitivity, and conversely, a model's measure of true negatives is termed its specificity. The false negative rate (FNR) is the probability that an actual positive result will be missed by the model.

In Table 1, the FNR, as recorded by the four models, is 0.23% for gradient boosting, 0.08% for logistic regression, 0.53% for the random forest, and 0.08% for the support vector machine. The probability of a true positive result being missed is much higher for the random forest classifier than for any of the other models. Logistic regression and the support vector machine record the least probabilities of a true positive result being missed.

The true negative rate (TNR) is the probability that an actual negative result will be correctly identified by the model. The results obtained, as shown in Table 1, show the TNR for the various models as follows: 16.67% for gradient boosting, 16.67% for logistic regression, 11.11% for the random forest classifier, and 5.56% for the support vector machine. The support vector machine is seen to have the least probability, followed by the random forest, with logistic regression and gradient boosting having the same score value.

The false positive rate (FPR), also referred to as "fall out," is described as the probability of a false alarm being raised. The FPR values for the models, as displayed in Table 1, show that the support vector machine has the highest recorded value of 94.44%, followed by the random forest (88.89%), with logistic regression and gradient boosting recording 83.33% each.

The positive predicted value (PPV) is the probability that a positive result is truly positive (a default patient is truly a defaulter). In Table 1, it can be seen that all the models show high positive predicted values above 98%, that is, the probability that the prediction of a default patient will not be missed. The negative predicted value (NPV) is the probability that a nondefault patient will also not be missed. Actual prediction of patients classified as nondefaulters by the four models according to Table 1 shows that logistic regression has the highest negative predicted value score of 75%, followed by both gradient boosting and support vector machines at 50% each, with the random forest recording the least percentage score. This means that if a predicted result of logistic regression is indicated as 75% for the NPV, then there is a 75% chance it is indeed negative. For the random forest classifier, a 22.22% NPV is indicative of only a 22.22% chance of a negative result being accurate.

The true positive rate (TPR) is the probability that an actual positive will test positive. It is also known as sensitivity. The TPR values for all models as recorded in Table 1 are greater than 99%.

From the classification report shown in Table 2, the weighted macro average precision scores for the individual models were at least 0.98, with F1 scores above 0.9. The high weighted macro average F1 score is obtained for an imbalanced dataset (Figure 5) where the individual class score contribution is weighted by its size. High weighted macro average scores for precision, recall, and F1 score are indicative of prediction success.

The area under the receiver operating characteristic curve (roc_auc) describes various model performances at separate threshold levels. Figure 9 shows a plot of two important parameters, the false positive rate and the true positive rate. In the ROC curve, model performance can be evaluated at different threshold points. Using the AUC technique, the aggregate performance of the models can be calculated across all threshold points. The AUC score is 0.90 for logistic regression, 0.77 for the support vector machine, 0.87 for the gradient boosting classifier, and 0.81 for the random forest, and these values indicate individual model prediction accuracy. The dotted black line shown in Figure 9 represents the baseline model, which is indicative of poor model performance.

Figure 9: Roc_auc curve (receiver operating characteristic curve; true positive rate against false positive rate). Legend: SVC (AUC = 0.77), GradientBoostingClassifier (AUC = 0.87), LogisticRegression (AUC = 0.90), RandomForestClassifier (AUC = 0.76).

4.2. Algorithm Performance Metric Discussion. This section is divided into two parts: the first part deals with a statistical description of the sampled data and the second part describes actual performance metrics of the models used, including results presented in various figures, graphs, and tables.

In Figure 1, a presentation of age bracket distribution of the sampled data shows that the incidence of hypertension and its associated diseases (comorbidities) is prevalent among the age bracket 66-76. The incidence of hypertension alone and hypertension with comorbidities among gender, as shown in Figure 4, indicates that the female population had a higher prevalence rate than males.

A determination of variable relationships using the chi-square correlation statistic of the input variables gender, age, availability of a physician, hypertension with comorbidities, queue/waiting time, and controlled diet, presented under the subsection Chi-Square Correlation Statistic Score, produced three scores, namely, the chi-square statistic value, the p value, and the variable correlation state. Comparison of the p value with an alpha value of 1 results in the output decision: a rejection or failure to reject the null hypothesis.

The chi-square correlation statistic showed no correlation between treatment default and gender, thereby failing to reject H0. However, other variables such as age, presence of comorbidities, concern for long waiting/queuing time, controlled diet, and availability of a physician showed correlation with the output variable (treatment default status), and the null hypothesis was rejected.

Generally, all the models as shown in Table 1 show capacity to predict true positive values, as shown by their scoring percentages (gradient boosting, 99.77%; logistic regression, 99.92%; random forest, 99.47%; and SVM, 99.92%). The important additional performance metric in Table 1, shown with the highlighted text, is the negative predicted value (NPV) of each model. The negative predicted values in Table 1 according to individual models show that logistic regression has the highest score of 75%, followed by both gradient boosting and support vector machines scoring 50% each, with the random forest algorithm scoring the least at 22.22%.

This was to define what a negative predicted value or score means for predictions by linking it with the score value of logistic regression. Logistic regression's prediction of negative values will be accurate at a 75% rate. This is of utmost importance for predictions in medical diagnosis, disease detection, and other problems in the healthcare domain. The accuracy of a negative or positive result being truly negative or positive is essential to avoid missing diagnosis, especially in diseases that require urgent attention. It is for this reason that a logistic regression score of 75% is considered critically important.

Additional performance evaluations with definitions of false negative rates and an explanation for the score are obtained. However, the random forest classifier shows a probability score of 0.53%, approximated as 1, which makes it very likely to miss a true positive result. As shown in Figure 5, the output class contains data imbalance; therefore, it is important to consider loss based on proportion and the use of weighted macro averages. The weighted macro averages for precision, recall, and F1 score according to the classification report shown in Table 2 are 98% for gradient boosting, 99% for logistic regression, 98% for the random forest, and 98% for the support vector machine.

The additional evaluation metric used for model performance is the roc_auc curve, as shown in Figure 9. This curve is an aggregate measure across all threshold points for model performance. The AUC scores shown in Figure 9 are 0.77 for the support vector machine model, 0.81 for the random forest, 0.90 for logistic regression, and 0.87 for gradient boosting.

From the various performance metrics, as shown in Table 1 and Figure 9, the logistic regression model is considered the obvious model of choice for this classification prediction. It has shown superior performance with an AUC score of 0.90, a negative predicted value (NPV) score of 75%, and an FNR of 0.08%.

5. Discussion

The stated focus of this research was to use biographic and behavioral metrics to predict disease treatment default and to test the null hypothesis as to whether gender biases have any correlation with hypertension treatment default. The use of the chi-square correlation statistic of variable independence as shown proves no dependency of patient

Table 3: Contextual biographic and behavioral variables (biographic and behavioral metrics).

Variable   Description
Age        Patient's age
Gender     Gender of the patient
Advreact   Complaints of adverse drug reaction
Herbal     Use of herbal supplements
Missdose   Default in medication
Smoking    Smoking status
Exercise   Any physical activity

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors' Contributions

Michael Owusu-Adjei conceptualized the study. Michael Owusu-Adjei, Dr. Twum Frimpong, and Dr. Gaddafi Abdul-Salaam designed the methodology. Prof. James Ben Hayfron-Acquah supervised the study.
Alcohol Use of alcohol Diet Whether on controlled diet Staf attitude Approval of healthcare provider attitude Acknowledgments Service_stf Service rating Attd_physician Availability of a clinician Te authors continue to acknowledge the support and co- Queue_time Concern for waiting time operation of management and staf of the Kwahu Gov- Cmbd Presence of comorbidities ernment Hospital, especially the Medical Director Dr. Defaulter_status Treatment status Kobina Awotwe Wiredu and the supervisory team led by Prof. James Ben Hayfron-Acquah, Dr. Twum Frimpong, and gender on treatment default but rejects the null hypothesis Dr. Gaddaf Abdul-Salaam. Tis publication is an extract with respect to other contextual variables. Further model from an academic research that is self-funded by the student. evaluations to determine superior model performance in Table 1 and Figure 9 also identify logistic regression as References having a higher NPV, as shown in Table 1, and SVM has the least probability score for an FNR (0.08%) and the [1] Ghana Health Service, “Te health sector in Ghana: facts and highest AUC score of 0.90, as shown in Figure 9. Based on fgures 2018,” Ministry Health Ghana, pp. 1–50, 2018. the above statistical values, the logistic regression model [2] WHO, Global Health Risks, WHO, Geneva, Switzerland, 2009. used in this context is seen to be superior in performance [3] A. Tiwari, V. Dhiman, M. A. M. Iesa, H. Alsarhan, to the other three models evaluated. Tis comparative A. Mehbodniya, and M. Shabaz, “Patient behavioral analysis with smart healthcare and IoT,” Behavioural Neurology, analysis narrative, especially the use of NPV, proves that vol. 2021, Article ID 4028761, 9 pages, 2021. model accuracy prediction score evaluation linked to the [4] A. M. Patey, G. Fontaine, J. J. Francis et al., “Healthcare problem context helps determine model performance professional behaviour: health impact, prevalence of superiority. 
evidence-based behaviours, correlates and interventions,” Psychology and Health, pp. 1–29, 2022. 6. Conclusion and Future Work [5] M. Grissinger, “Disrespectful behavior in health care: its impact, why it arises and persists, and how to address it—Part Tis research article has demonstrated how both biographic 2,” P and T, vol. 42, pp. 74–77, 2017. and behavioral metrics can be used to predict disease [6] E. Siegel, Predictive Analytics: Te Power to Predict Who Will treatment default without biomedical measurements. It has Click, Buy, Lie, or Die, Revised and Updated, John Wiley & also been determined (based on the collected data) that Sons, Hoboken, NJ, USA, 2015. gender biases do not afect hypertension treatment default [7] L. Harris, V. K. Lee, E. H. Tompson, and R. Kranton, risks, but other contextual variables such as age, comor- “Exploring the generalization process from past behavior to bidities, queuing time, availability of a clinician, and patients predicting future behavior,” Journal of Behavioral Decision on a controlled diet can afect treatment default outcomes. Making, vol. 29, no. 4, pp. 419–436, 2016. Finally, the superiority of the logistic regression algorithm to [8] L. H. A. Salazar, V. R. Q. Leithardt, W. D. Parreira et al., “Application of machine learning techniques to predict predict disease treatment default in hypertensive patients a patient ’ s no-show in the healthcare sector,” Future Internet, has been established. Further work in this area is to in- vol. 14, pp. 1–21, 2022. vestigate time complexities of the algorithms used to de- ´ ´ [9] O. Cordon, F. Herrera, J. De La Montaña, A. M. Sanchez, and termine efcient and efective machine learning models P. Villar, “A prediction system for cardiovascularity diseases among the selected variables (Table 3). using genetic fuzzy rule-based systems,” Advances in Artifcial Intelligence — IBERAMIA 2002, vol. 2527, pp. 381–391, 2002. Data Availability [10] I. Kavakiotis, O. Tsave, A. Salifoglou, N. 
Maglaveras, I. Vlahavas, and I. Chouvarda, “Machine learning and data Te dataset used for analysis can be available from the mining methods in diabetes research,” Computational and corresponding author upon request. Structural Biotechnology Journal, vol. 15, pp. 104–116, 2017. [11] A. J. Barda, C. M. Horvat, and H. Hochheiser, “A qualitative Disclosure research framework for the design of user-centered displays of explanations for machine learning model predictions in Tis publication is an extract from an academic research that healthcare,” NCHHSTP Social Determinants of Health, vol. 20, is self-funded by the student. pp. 9–12, 2019. Advances in Public Health 11 [12] NCHHSTP, “Frequently asked questions | social determinants of health,” 2022, https://www.cdc.gov/nchhstp/ socialdeterminants/faq.html. [13] M. Maniruzzaman, M. J. Rahman, B. Ahammed, and M. M. Abedin, “Classifcation and prediction of diabetes disease using machine learning paradigm,” Health In- formation Science and Systems, vol. 8, pp. 7–14, 2020. [14] R. Krishnamoorthi, S. Joshi, H. Z. Almarzouki et al., “A novel diabetes healthcare disease prediction framework using ma- chine learning techniques,” Journal of Healthcare Engineering, vol. 2022, Article ID 1684017, 10 pages, 2022. [15] K. Dwivedi, H. O. Sharan, and V. Vishwakarma, “Analysis of decision tree for diabetes prediction 4,” International Journal of Engineering and Technical Research, vol. 9, pp. 3–6, 2019. [16] C. A. Salkar, “A detailed analysis on kidney and heart disease prediction using machine learning,” Journal of Computing and Natural Science, vol. 1, pp. 9–14, 2021. [17] R. Spencer, F. Tabtah, N. Abdelhamid, and M. Tompson, “Exploring feature selection and classifcation methods for predicting heart disease,” Digital Health, vol. 6, Article ID 205520762091477, 2020. [18] N. Fazakis, O. Kocsis, E. Dritsas, S. Alexiou, N. Fakotakis, and K. 
Moustakas, “Machine learning tools for long-term type 2 diabetes risk prediction,” IEEE Access, vol. 9, pp. 103737– 103757, 2021. [19] M. Durairaj and V. Revathi, “Prediction of heart disease using Back propagation MLP algorithm,” International Journal of Scientifc & Technology Research, vol. 4, pp. 235–239, 2015. [20] C. Fiarni, E. M. Sipayung, and S. Maemunah, “Analysis and prediction of diabetes complication disease using data mining algorithm,” Procedia Computer Science, vol. 161, pp. 449–457, [21] S. DuBrava, J. Mardekian, A. Sadosky et al., “Using random forest models to identify correlates of a diabetic peripheral neuropathy diagnosis from electronic health record data,” Pain Medicine, vol. 18, no. 1, pp. 107–115, 2017. [22] M. O. Pfueger, I. Franke, M. Graf, and H. Hachtel, “Pre- dicting general criminal recidivism in mentally disordered ofenders using a random forest approach,” BMC Psychiatry, vol. 15, pp. 62–10, 2015. [23] G. Parthiban, “Applying machine learning methods in di- agnosing heart disease for diabetic patients,” International Journal of Applied Information Systems, vol. 3, no. 7, pp. 25–30, 2012. [24] X. Tian, Y. Chong, Y. Huang et al., “Using machine learning algorithms to predict hepatitis B surface antigen seroclear- ance,” Computational and Mathematical Methods in Medi- cine, vol. 2019, Article ID 6915850, 7 pages, 2019. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Advances in Public Health Hindawi Publishing Corporation

Machine Learning Modeling of Disease Treatment Default: A Comparative Analysis of Classification Models



Publisher
Hindawi Publishing Corporation
ISSN
2356-6868
eISSN
2314-7784
DOI
10.1155/2023/4168770

Abstract

Hindawi
Advances in Public Health
Volume 2023, Article ID 4168770, 11 pages
https://doi.org/10.1155/2023/4168770

Research Article
Machine Learning Modeling of Disease Treatment Default: A Comparative Analysis of Classification Models

Michael Owusu-Adjei, James Ben Hayfron-Acquah, Frimpong Twum, and Gaddafi Abdul-Salaam
Department of Computer Science, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
Correspondence should be addressed to Michael Owusu-Adjei; mowusuadjei@st.knust.edu.gh
Received 14 November 2022; Revised 21 December 2022; Accepted 23 December 2022; Published 3 January 2023
Academic Editor: Chandrabose Selvaraj
Copyright © 2023 Michael Owusu-Adjei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Generally, treatment default of diseases by patients is regarded as the biggest threat to favourable disease treatment outcomes. It is seen as the reason for the resurgence of infectious diseases including tuberculosis in some developing countries. Sadly, its occurrence in chronic disease management is associated with high morbidity and mortality rates. Many reasons have been adduced for this phenomenon. Exploration of treatment default using biographic and behavioral metrics collected from patients and healthcare providers remains a challenge. The focus on contextual nonbiomedical measurements using a supervised machine learning modeling technique is aimed at creating an understanding of the reasons why treatment default occurs, including identifying important contextual parameters that contribute to treatment default. The predicted accuracy scores of four supervised machine learning algorithms, namely, gradient boosting, logistic regression, random forest, and support vector machine were 0.87, 0.90, 0.81, and 0.77, respectively. Additionally, performance indicators such as the positive predicted value score for the four models ranged between 98.72%–98.87%, and the negative predicted values of gradient boosting, logistic regression, random forest, and support vector machine were 50%, 75%, 22.22%, and 50%, respectively. Logistic regression appears to have the highest negative-predicted value score of 75%, with the smallest error margin of 25% and the highest accuracy score of 0.90, and the random forest had the lowest negative predicted value score of 22.22%, registering the highest error margin of 77.78%. By performing a chi-square correlation statistic test of variable independence, this study suggests that age, presence of comorbidities, concern for long queuing/waiting time at treatment facilities, availability of qualified clinicians, and the patient's nutritional state whether on a controlled diet or not are likely to affect their adherence to disease treatment and could result in an increased risk of default.

1. Introduction

Statistical data compiled [1] from all the regional and district hospitals except the two leading teaching hospitals (Korle-Bu Teaching Hospital and Komfo Anokye Teaching Hospital) in Ghana show a depressing outlook on the incidence of hypertension and other cardiovascular diseases that has resulted in increased patient mortality and morbidity rates with devastating consequence. This has been described as a consequence of patient nonadherence/default or noncompliance to prescribed treatment appointments. Hypertension and associated cardiovascular diseases were listed as the leading cause of deaths according to the report in [1] among the population in 2017 and ranked third as the biggest cause of patient admission in hospitals countrywide. These statistical descriptions are also confirmed in [2], which attributes 13 percent of total global deaths to the incidence of hypertension and estimates them to be the largest risk factor for deaths globally.

Cardiovascular diseases including hypertension are chronic diseases that require long-term treatment management. Recent reports from healthcare personnel [1, 2] confirming increasing mortality rates among suffering patients require a critical look at treatment adherence together with default risks by considering both biographic and contextual behavioral variables among patients and healthcare givers.

The importance of behavioral analysis in healthcare is underscored by the authors of [3], whose work affirms that patient behaviors are important considerations for disease treatment including the determination of disease causality. Further studies to determine the impact of behavior on patients stipulate that performance behavior of patients is aimed at preventing disease occurrence, detecting onset of diseases, and improving disease treatment outcomes, but admittedly, the anticipated outcomes are also influenced by the behavior of healthcare professionals [4]. Their definition of behavior by healthcare professionals emphasizes behavior that takes into account the needs of patients [4]. Additionally, studies regarding healthcare provider behavior and its impact on patients, healthcare working environments, colleague workers, etc., are emphasized by the author of [5], with a tacit admission that there are negative outcomes of disrespectful behavior, some of which cause recipients to experience fear and feel isolated, including many others. The impact of negative consequences of these behavioral traits by healthcare providers on patient treatment outcomes shows that disease treatment default by patients can be established in context. This includes the consideration of contextual variables about both parties (healthcare providers and patients). Applying predictive algorithms with the capability of showing hidden patterns of information can assist in this endeavor.

Predictive algorithms have become powerful instruments for businesses (big and small) as a competitive tool. In mortgage financial decisions, including assessing payment behavior of homeowners [6], predictive analytics have been employed to study and understand the behavior of homeowners. The other use of predictive algorithms is providing insights into future human behavior based on present or available information [7]. In the healthcare industry, predictive algorithms have been used in the area of classification problems such as predicting the patient's nonattendance to a scheduled appointment clinic [8]. Further use of predictive algorithms is associated with early detection and diagnoses of diseases for preventive medical care due to increased medical treatment costs together with increasing morbidity and mortality rates [9]. Advances in technology, coupled with increased production of data within healthcare systems, have heightened research interests in healthcare applications for knowledge discovery and insights into patterns of change [10]. Predictive machine learning algorithms have therefore become useful in providing user-centered explanations of the factors that lead to an increased risk of adverse outcomes in healthcare settings [11].

There have been several predictive studies of disease diagnosis and detection on hypertension and other cardiovascular diseases predominantly using contextual variables such as biomedical and biographic metrics. Behavior is mentioned among the five broad determinants of health [12]. However, studies requiring the use of behavior together with either biographic or biomedical metrics for disease treatment in this regard have been limited. This research therefore explores contextual biographic and behavioral traits of both patients and healthcare providers towards disease treatment default using empirical data from a real context with reliance on nonbiomedical variable metrics.

2. Related Work

A comparative analysis of different machine learning modeling techniques within the healthcare delivery system has been examined in various studies and related research works. This section discusses prediction accuracy scores of machine learning techniques used within the context of healthcare practice to highlight the significance of machine learning types in classification-based problem domains with biomedical measurements. A study [13] to predict diabetes disease with biomedical and biographic variables such as age, education, systolic and diastolic blood pressure, body mass index, direct and cholesterol using learning algorithms namely; Naive Bayes, Decision Tree, Adaboost and Random Forest showed auc score of 0.95 for Logistic regression and Random forest. This study therefore concludes that the combination of logistic regression with the random forest showed superior performance and could be used in predicting diabetic patients.

Similar predictive studies [14] with the disease prediction framework using machine learning techniques for diabetes healthcare with a combination of biographic and biomedical data such as age, blood pressure, glucose, and insulin body mass index with algorithms such as K-nearest neighbor (KNN), support vector machine (SVM), logistic regression, and random forest concluded that logistic regression had superior performance with a receiver operating characteristic curve (ROC) score of 86%. Similarly, an analysis of decision trees for diabetes prediction using both biomedical and biographic variables such as gender, plasma, insulin, glucose, body mass index, and blood pressure with decision tree algorithms such as leaf area density (LAD), naïve Bayes, and genetic j48 also showed predicted accuracy scores of 88%, 92%, and 95.8%, respectively [15].

Further studies on detailed analysis of kidney and heart disease prediction with machine learning using two distinct datasets consisting of both biomedical and biographic variables such as age, blood pressure, specific gravity, sugar, albumin, red blood cells, pus cells, pus cell clumps, bacteria, blood glucose level, blood urea, sex, education, current smoking status, cigarettes smoked per day, BP medicines, and prevalent smoking also showed accuracy scores of 100% for chronic kidney disease and 85% for heart disease. Machine learning algorithms such as logistic regression, K-nearest neighbor, naïve Bayes, support vector machine, and random forest [16] also produced accuracy scores of 100% for chronic kidney disease and 85% for heart disease based on a support value of 750 for all the algorithms.

Further studies for heart disease classification predicted by the authors of [17], which aim to explore feature selections with a combination of chi-squared feature selection methods and the Bayes net algorithm, achieved an accuracy score of 85.00% using both biomedical and biographic variables such as age, sex, measuring values of fasting blood sugar, resting blood pressure, serum cholesterol, resting electrocardiographic reports, maximum heart rates, and number of major vessels colored by fluoroscopy.

A personalized modeling and prediction approach with internet-connected smart devices for data generation [18] used weighted voting logistic regression and random forest machine learning techniques for type 2 diabetes prediction and produced a model accuracy score of 0.884 with both biomedical and biographic metrics such as age, gender, body mass index, cholesterol level, marital status, employment status, and income level.

Further predictive studies on the presence or absence of heart disease using techniques such as the backpropagation multilayer perceptron [19] showed an improved model performance accuracy score of 96.30% with features such as age, sex, fasting blood sugar, and resting blood sugar. Using several machine learning techniques for supervised and unsupervised learning in diabetes research predictions [10], it was concluded that the use of support vector machines proved to be the most successful while using clinical datasets with features extracted from demographics, diagnoses, disease comorbidities and symptoms, medications, laboratory measurements, and other procedures. Furthermore, a predictive analysis of diabetic complications [20] using naive Bayes tree, C4.5 decision tree-based classification, and k-means clustering techniques with features such as age, gender, body mass index, family history of diabetes, blood pressure, duration of onset, and blood glucose level showed an overall model accuracy of 68%. A similar study to identify correlated variables such as demographic, clinical, and healthcare resource utilization variables for the diagnosis of diabetic peripheral neuropathy using the random forest algorithm/technique achieved an ROC model performance of 0.824 and a model accuracy of 89.6% with a 95% confidence interval [21]. A similar study with variables such as the number of prior convictions, age, type of index offence, diversity of criminal history, and substance abuse to predict general criminal recidivism in mentally disordered offenders using the random forest technique also produced an AUC score of 0.90 [22]. In diagnosing heart disease for diabetic patients using variables such as age, sex, blood pressure, and blood sugar in predicting the chances of a diabetic patient developing heart disease, naïve Bayes and support vector machines showed significant prediction accuracy [23]. Alternatively, a study [24] to identify the optimal model that predicts HBsAg seroclearance of patients suffering from chronic hepatitis B with selected variables such as age, gender, family history, body mass index, and drinking history used four machine learning algorithms, namely, extreme gradient boosting (XGBoost), random forest (RF), decision tree (DT), and logistic regression (LR) and identified XGBoost to show superior model performance in identifying predictive variable importance with an AUC score of 89.1%.

In the studies discussed above and many others, references are made of predicted accuracy scores (AUC) of varying percentages from various machine learning algorithms using biomedical and biographic variables. The use of biographic and behavioral metrics is noted in [22] for the prediction of criminal recidivism with a sampled population of 365. The other distinct feature of the observed related works is the use of sourced datasets from studies conducted elsewhere. The uniqueness of this study is the use of empirical data collected from a real-world context by assessing entities involved in disease treatment management for the prediction of treatment default.

3. Research Contribution

One of the accomplishments in this research work is the use of biographic and behavioral metrics of both patients and healthcare providers to predict disease treatment default of patients suffering from hypertension with and without comorbidities.

3.1. Research Hypothesis

3.1.1. Null Hypothesis H0. No relationship exists between disease treatment default and patient gender.

3.1.2. Alternative Hypothesis. A relationship exists between disease treatment default and patient gender.

A comparative analysis of four machine learning models, namely, logistic regression, gradient boosting classifier, support vector machine (SVM), and random forest classifier is examined to determine the best predicted values such as the true positive rate (TPR), false positive rate (FPR), positive predicted value (PPV), and negative predicted value (NPV) of each algorithm for performance evaluation.

This is accomplished in the listed steps as follows:

Step 1. Variable independence using the chi-square correlation statistic is determined to establish the gender relationship between the output class.
Step 2. Optimal threshold performance is determined with the application of threshold optimization using the area under the receiver operating characteristic curve.
Step 3. The predicted accuracy scores of the true positive rate (TPR), false positive rate (FPR), positive predicted value (PPV), and negative predicted value (NPV) have been demonstrated using the four models.
Step 4. The predicted scores such as the true positive rate (TPR), false positive rate (FPR), positive predicted value (PPV), and negative predicted value (NPV) to estimate how well these models would perform in subsequent predictions have been comparatively evaluated.

Comparative performance evaluation determines which machine learning algorithm performs better given the required dataset format.

4. Materials and Methods

The electronic medical record dataset of patients was obtained through an institutional request for permission as per the request reference number DCS/S.1/VOL.1 from Kwahu Government Hospital (a district healthcare facility in the eastern region of Ghana).
A total of 5,333 patients were identified as suffering from hypertension, and some of them were suffering from both hypertension and other diseases (comorbidities). To protect the privacy and confidentiality of the patients involved, the institution omitted individual patient names and locations from the records. The features used were selected from clinical notes and from the biographic data obtained. The classification-based machine learning algorithms used are the gradient boosting classifier, logistic regression, support vector machine, and random forest. All software development and data processing steps were performed using Python version 3.10.5 and its packages. The results obtained from data modeling are presented in the subsequent section.

4.1. Results. The sample consisted of 5,333 patients: 4,312 females, constituting 80.86% of the total sample population, and 1,021 males, constituting 19.14%. The age bracket distribution and the age distribution density among those sampled are illustrated in Figures 1 and 2, respectively.

According to Figure 1, the most frequent age bracket in the sampled data is 66 to 76, and the least frequent is 110 to 120. Age distribution density is higher from the 44 to 54 bracket up to the 88 to 98 bracket. Gender distribution is displayed in Figure 3.

Figure 1: Age bracket distribution (x-axis: age bracket, 11-21 through 121-131; y-axis: frequency).

Figure 2: Age distribution density histogram (x-axis: age; y-axis: density).

The sampled data fall into two basic categories: patients diagnosed with hypertension only, and patients diagnosed with hypertension and other comorbid diseases such as diabetes. Figure 4 shows the frequency distribution of patients with hypertension only (indicated by blue color) and hypertension with comorbidities (indicated by red color). As illustrated in Figure 4, female patients accounted for more cases than their male counterparts in both categories.

In Figure 3, the frequency distribution by gender shows a much higher proportion of females than males. The output class distribution in Figure 5 shows that patients identified as nondefaulters (the negative class) far outnumbered those identified as treatment defaulters.

Figure 3: Frequency distribution by gender (x-axis: gender, female and male; y-axis: frequency).

Figure 4: Comorbidity distribution among sampled gender (x-axis: gender; legend: cmbd, yes or no).

Figure 5: Treatment output class distribution (x-axis: output class, 1 and 0; y-axis: frequency).

For a new patient, the onset of treatment begins with signs and symptoms of discomfort, whereas for an existing patient undergoing treatment, it begins with a visit resulting from a scheduled appointment or from the realization of signs and symptoms of discomfort. Figure 6 shows the disease treatment process for both new and existing patients suffering from hypertension and other forms of cardiovascular disease. In Figure 6, the rectangles indicate processes and subprocesses, the arrows show task routes, the rhombi show decision boundaries, and the ovals represent the start or end of processes. The deepened (darker) process boxes with text show data collection points, although no biomedical measurement was collected for this research work. These points have been illustrated to emphasize the collected behavioral and biographic variables used. Certain processes have been highlighted to emphasize the behavior of health care professionals, especially in the delivery of service.

Figure 6: Flowchart diagram of the disease treatment process (diagram title: Disease Treatment Process Flowchart).

Figure 7 shows a flowchart of the stages and processes relevant to building machine learning algorithms for prediction. It includes subprocesses, decision points, and task evaluation points, with arrows marking task routes. Some processes cannot begin until others have finished, which creates a dependency rule. In Figure 7, this dependency can be seen in the data processing stage, where several subprocesses are defined before the next task, which is splitting the dataset. At certain stages of the process, decisions are made to determine the next course of action. Every machine learning prediction process has a beginning and an end: a green oval marks the beginning, and a red oval marks the end.

To establish the relationship between each input variable and the output variable, the chi-square correlation statistic test was performed on all the input variables. Those with statistical significance are selected and described as follows.

4.1.1. Chi-Square Correlation Statistic Score

On controlled diet:
p value = 4.577598097458345e-32
Chi-square statistic value = 153.02396213387735
Dependent (reject H0)

Availability of a physician:
p value = 2.8205192634562076e-55
Chi-square statistic value = 260.9682907753015
Dependent (reject H0)

Concern for long waiting/queuing time:
p value = 1.239782911843141e-25
Chi-square statistic value = 122.96922267902593
Dependent (reject H0)

Hypertension with comorbidities:
p value = 2.4832105751483715e-54
Chi-square statistic value = 256.5842340588116
Dependent (reject H0)

Patient's age:
p value = 4.4969165821531835e-05
Chi-square statistic value = 259.20561138258796
Dependent (reject H0)

Gender:
p value = 0.4911070922909865
Chi-square statistic value = 3.4137909069569465
Independent (fail to reject H0)

The chi-square correlation statistic test was performed on the input variables described above, which were selected for their significance. It shows the p value of each variable and whether the dependent variable depends on it. Being on a controlled diet, physician availability to examine patients, concern for longer waiting or queuing time at healthcare facilities, hypertension with comorbidities, and the patient's age are estimated to influence the dependent variable, because the null hypothesis is rejected for each of them.

A comparative display of the confusion matrices obtained from the models is shown in Figure 8, which collects the confusion matrices of the four machine learning models. Each matrix has four sections indicating specific values. The sections of the confusion matrix are named as follows: the top left section (true positive, TP), the yellow shaded section on the bottom right (true negative, TN), the upper right corner (false negative, FN), and the bottom left corner (false positive, FP).
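The decision rule behind the scores in Section 4.1.1 can be illustrated with a minimal sketch of a chi-square test of independence. The paper does not say which package or which significance level produced its scores, so everything here is an assumption made for illustration: the function names, the alpha of 0.05, and above all the 2x2 counts, which are hypothetical and not taken from the study data. The closed-form p value used below holds only for a 2x2 table (one degree of freedom).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    table[i][j] is the number of patients in input category i and output
    class j. Returns (statistic, p_value); no continuity correction is used.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the input and the output were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def decide(p_value, alpha=0.05):
    """Map a p value to the decision wording used in Section 4.1.1.

    alpha = 0.05 is an assumed significance level, not one stated in the paper.
    """
    if p_value < alpha:
        return "Dependent (reject H0)"
    return "Independent (fail to reject H0)"


# HYPOTHETICAL counts, not taken from the study data:
# rows = on controlled diet (yes, no); columns = (defaulted, not defaulted)
example_table = [[10, 490], [40, 460]]
statistic, p_value = chi_square_2x2(example_table)
print(f"chi-square = {statistic:.3f}, p = {p_value:.2e}, {decide(p_value)}")
```

For tables larger than 2x2, such as the age brackets, a library routine like `scipy.stats.chi2_contingency` would be the usual choice; note that it applies a continuity correction to 2x2 tables by default, so its output can differ slightly from this sketch.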
Figure 7: A flowchart diagram of a model building process (diagram title: Model Building Process). The rectangles show processes, the arrows indicate task routes, the rhombi show decision points, and the circles depict the start or end of processes (green circle, start; red circle, end).

Figure 8: Model confusion matrices for (a) gradient boosting, (b) logistic regression, (c) random forest, and (d) support vector machine. Rows are true labels and columns are predicted labels.

Panel                         True label       Predicted Defaulted   Predicted Not Defaulted
(a) Gradient boosting         Defaulted        3                     15
                              Not Defaulted    3                     1313
(b) Logistic regression       Defaulted        3                     15
                              Not Defaulted    1                     1315
(c) Random forest             Defaulted        2                     16
                              Not Defaulted    7                     1309
(d) Support vector machine    Defaulted        1                     17
                              Not Defaulted    1                     1315

In Figure 8(a), gradient boosting correctly classified 1313 patients as nondefaulters and misclassified 15 default patients as nondefaulters and 3 nondefault patients as defaulters. In Figure 8(b), logistic regression correctly classified 1315 patients as nondefaulters and misclassified 15 default patients as nondefaulters and 1 nondefault patient as a defaulter. In Figure 8(c), the random forest correctly classified 1309 nondefault patients and misclassified 16 default patients as nondefaulters and 7 nondefault patients as defaulters. Finally, in Figure 8(d), the support vector machine correctly classified 1315 nondefault patients and misclassified 17 default patients as nondefaulters and 1 nondefault patient as a defaulter.

The predicted probability score rates in percentages are illustrated in Table 1. Specificity and sensitivity both describe test accuracies for the predicted output.

Table 1: Specificity and sensitivity test scores.

Model                     FNR (%)   TNR (%)   FPR (%)   PPV (%)   NPV (%)   TPR (%)
Gradient boosting         0.23      16.67     83.33     98.87     50.0      99.77
Logistic regression       0.08      16.67     83.33     98.87     75.0      99.92
Random forest             0.53      11.11     88.89     98.79     22.22     99.47
Support vector machine    0.08      5.56      94.44     98.72     50.0      99.92

The highlighted scores (NPV) indicate how much trust to put in a test result, that is, whether a negative test result can truly be trusted. The negative predicted value is the ratio of patients correctly predicted as defaulters to all patients predicted as defaulters. It estimates the probability that a patient predicted as a defaulter is truly a defaulter. A high percentage means that the probability of a prediction miss is low (minimal error); a low percentage means a very high probability of a prediction miss.

Table 2: Precision, recall, and F1 score weighted macro average scores (classification report).

Model                     Precision   Recall   F1 score
Gradient boosting         0.98        0.99     0.98
Logistic regression       0.98        0.99     0.98
Random forest             0.98        0.98     0.98
Support vector machine    0.98        0.99     0.98
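The rates in Table 1 can be recomputed from the counts in Figure 8, which is a useful check on how the two fit together. The sketch below is an illustration, not the authors' code: the names `FIGURE_8_COUNTS`, `rates_percent`, and `table_1` are hypothetical, and it rests on two assumptions inferred from the numbers and not stated in the text, namely that the rows of each matrix are true labels and that "Not Defaulted" is treated as the positive class. Under those assumptions the computed values agree with every entry of Table 1.

```python
# Counts read from Figure 8 (rows = true label, columns = predicted label):
# (true Defaulted -> predicted Defaulted,
#  true Defaulted -> predicted Not Defaulted,
#  true Not Defaulted -> predicted Defaulted,
#  true Not Defaulted -> predicted Not Defaulted)
FIGURE_8_COUNTS = {
    "Gradient boosting": (3, 15, 3, 1313),
    "Logistic regression": (3, 15, 1, 1315),
    "Random forest": (2, 16, 7, 1309),
    "Support vector machine": (1, 17, 1, 1315),
}


def rates_percent(tn, fp, fn, tp):
    """Return the six Table 1 rates in percent, rounded to two decimals."""

    def pct(numerator, denominator):
        return round(100.0 * numerator / denominator, 2)

    return {
        "FNR": pct(fn, fn + tp),  # actual positives that were missed
        "TNR": pct(tn, tn + fp),  # actual negatives correctly identified
        "FPR": pct(fp, tn + fp),  # actual negatives flagged as positive
        "PPV": pct(tp, tp + fp),  # positive predictions that are correct
        "NPV": pct(tn, tn + fn),  # negative predictions that are correct
        "TPR": pct(tp, tp + fn),  # actual positives correctly identified
    }


# ASSUMPTION: "Not Defaulted" is the positive class, so each tuple above is
# already in (TN, FP, FN, TP) order.
table_1 = {
    model: rates_percent(*counts) for model, counts in FIGURE_8_COUNTS.items()
}

for model, row in table_1.items():
    print(model, row)
```

The counts also show why the NPV values move in large steps: each model predicts only a handful of patients as defaulters (for logistic regression, 3 correct out of 4), so a single prediction changes the NPV by many percentage points.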
How well a model identifies true positives in a diagnostic test is referred to as its sensitivity; conversely, a model's measure of true negatives is termed its specificity. The false negative rate (FNR) is the probability that an actual positive is missed, that is, classified as negative.

In Table 1, the FNR recorded by the four models is 0.23% for gradient boosting, 0.08% for logistic regression, 0.53% for the random forest, and 0.08% for the support vector machine. The probability of a true positive result being missed is higher for the random forest classifier than for any of the other models. Logistic regression and the support vector machine record the lowest probabilities of a true positive result being missed.

The true negative rate (TNR) is the probability that an actual negative is correctly identified as negative by the model. The results in Table 1 show the TNR for the various models as follows: 16.67% for gradient boosting, 16.67% for logistic regression, 11.11% for the random forest classifier, and 5.56% for the support vector machine. The support vector machine has the lowest TNR, followed by the random forest, with logistic regression and gradient boosting having the same score.

The false positive rate (FPR), also referred to as "fall-out", is described as the probability of a false alarm being raised. The FPR values in Table 1 show that the support vector machine has the highest value, 94.44%, followed by the random forest (88.89%), with logistic regression and gradient boosting recording 83.33% each.

The positive predicted value (PPV) is the probability that a positive result is truly positive. In Table 1, all the models show positive predicted values above 98%. The negative predicted value (NPV) is the probability that a negative result is truly negative. According to Table 1, logistic regression shows the highest negative predicted value, 75%, followed by gradient boosting and the support vector machine at 50% each, with the random forest recording the lowest score. This means that, with an NPV of 75%, a negative prediction by logistic regression has a 75% chance of being truly negative. For the random forest classifier, an NPV of 22.22% indicates only a 22.22% chance of a negative result being accurate.

The true positive rate (TPR) is the probability that an actual positive will test positive. It is also known as sensitivity. The TPR values for all models in Table 1 are greater than 99%.

From the classification report shown in Table 2, the weighted macroaverage precision scores for the individual models were 0.98, with F1 scores well above 0.9. The high weighted macroaverage F1 score is obtained for an imbalanced dataset (Figure 5), where the contribution of each class to the score is weighted by its size. High weighted macroaverage scores for precision, recall, and F1 score are indicative of prediction success.

The area under the receiver operating characteristic curve (roc_auc) summarizes model performance across threshold levels. Figure 9 shows a plot of two important parameters, the false positive rate and the true positive rate. On the ROC curve, model performance can be evaluated at different threshold points. Using the AUC, the aggregate performance of each model can be calculated across all threshold points. The AUC score is 0.90 for logistic regression, 0.77 for the support vector machine, 0.87 for the gradient boosting classifier, and 0.81 for the random forest; these values indicate how well each model separates the two classes. The dotted black line in Figure 9 represents the baseline (chance-level) model, which is indicative of poor model performance.

4.2. Algorithm Performance Metric Discussion.
This section is divided into two parts: the first part deals with a statistical description of the sampled data, and the second part describes the actual performance metrics of the models used, including results presented in various figures, graphs, and tables.

In Figure 1, the age bracket distribution of the sampled data shows that hypertension and its associated diseases (comorbidities) are most prevalent in the 66 to 76 age bracket. The incidence of hypertension alone and of hypertension with comorbidities by gender, as shown in Figure 4, indicates that females accounted for more cases than males.

The chi-square correlation statistic was used to determine the relationship of the input variables (gender, age, availability of a physician, hypertension with comorbidities, queue/waiting time, and controlled diet) with the output variable. As presented under the subsection on the chi-square correlation statistic score, it produced three outputs for each variable: the chi-square statistic value, the p value, and the variable dependence state. Comparing the p value with the alpha value yields the output decision: a rejection of, or a failure to reject, the null hypothesis.

The chi-square correlation statistic showed no correlation between treatment default and gender, thereby failing to reject H0. However, other variables such as age, presence of comorbidities, concern for long waiting/queuing time, controlled diet, and availability of a physician showed correlation with the output variable (treatment default status), and the null hypothesis was rejected.

Generally, all the models in Table 1 show capacity to predict true positive values, as shown by their scoring percentages (gradient boosting, 99.77%; logistic regression, 99.92%; random forest, 99.47%; and SVM, 99.92%). The important additional performance metric in Table 1, shown in highlighted text, is the negative predicted value (NPV) of each model. Logistic regression has the highest score of 75%, followed by gradient boosting and the support vector machine at 50% each, with the random forest algorithm scoring the lowest at 22.22%. Linking the meaning of the negative predicted value to the logistic regression score, logistic regression's predictions of negative values will be accurate at a rate of 75%.

This is of utmost importance for predictions in medical diagnosis, disease detection, and other problems in the healthcare domain. The accuracy of a negative or positive result being truly negative or positive is essential to avoid missed diagnoses, especially in diseases that require urgent attention. It is for this reason that the logistic regression score of 75% is considered critically important.

Additional performance evaluation, with the definition of the false negative rate and an explanation of its score, was also obtained. The random forest classifier shows the highest FNR, 0.53% (approximately 1%), which makes it more likely than the other models to miss a true positive result. As shown in Figure 5, the output class is imbalanced; therefore, it is important to consider loss based on proportion and to use weighted macroaverages. According to the classification report shown in Table 2, the weighted macroaverages for precision, recall, and F1 score are between 0.98 and 0.99 for all four models.

The additional evaluation metric used for model performance is the roc_auc curve, shown in Figure 9. The area under this curve is an aggregate measure of model performance across all threshold points. The AUC scores are 0.77 for the support vector machine, 0.81 for the random forest, 0.90 for logistic regression, and 0.87 for gradient boosting.

Figure 9: Roc_auc curve (x-axis: false positive rate; y-axis: true positive rate). Legend: SVC (AUC = 0.77), GradientBoostingClassifier (AUC = 0.87), LogisticRegression (AUC = 0.90), RandomForestClassifier (AUC = 0.76).

From the various performance metrics in Table 1 and Figure 9, the logistic regression model is considered the model of choice for this classification prediction. It shows superior performance, with an AUC score of 0.90, a negative predicted value (NPV) of 75%, and an FNR of 0.08%.

5. Discussion

The stated focus of this research was to use biographic and behavioral metrics to predict disease treatment default and to test the null hypothesis as to whether gender has any correlation with hypertension treatment default. The chi-square correlation statistic of variable independence shows no evidence that treatment default depends on patient gender, but the null hypothesis is rejected for the other contextual variables. Further model evaluations to determine superior model performance, in Table 1 and Figure 9, identify logistic regression as having the highest NPV, the lowest FNR (0.08%, shared with the support vector machine), and the highest AUC score of 0.90. Based on these statistical values, the logistic regression model used in this context is seen to be superior in performance to the other three models evaluated. This comparative analysis, especially the use of the NPV, suggests that evaluating model prediction scores in the light of the problem context helps determine model performance superiority.

Table 3: Contextual biographic and behavioral variables (biographic and behavioral metrics).

Variable            Description
Age                 Patient's age
Gender              Gender of the patient
Advreact            Complaints of adverse drug reaction
Herbal              Use of herbal supplements
Missdose            Default in medication
Smoking             Smoking status
Exercise            Any physical activity
Alcohol             Use of alcohol
Diet                Whether on controlled diet
Staff attitude      Approval of healthcare provider attitude
Service_stf         Service rating
Attd_physician      Availability of a clinician
Queue_time          Concern for waiting time
Cmbd                Presence of comorbidities
Defaulter_status    Treatment status

6. Conclusion and Future Work

This research article has demonstrated how biographic and behavioral metrics can be used to predict disease treatment default without biomedical measurements. It has also been determined, based on the collected data, that gender does not affect hypertension treatment default risk, but other contextual variables such as age, comorbidities, queuing time, availability of a clinician, and being on a controlled diet can affect treatment default outcomes. Finally, the logistic regression algorithm has been shown, on this dataset, to be superior for predicting disease treatment default in hypertensive patients. Further work in this area is to investigate the time complexities of the algorithms used, in order to determine efficient and effective machine learning models among the selected variables (Table 3).

Data Availability

The dataset used for analysis is available from the corresponding author upon request.

Disclosure

This publication is an extract from academic research that is self-funded by the student.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors' Contributions

Michael Owusu-Adjei conceptualized the study. Michael Owusu-Adjei, Dr. Twum Frimpong, and Dr. Gaddafi Abdul-Salaam designed the methodology. Prof. James Ben Hayfron-Acquah supervised the study.

Acknowledgments

The authors continue to acknowledge the support and cooperation of the management and staff of the Kwahu Government Hospital, especially the Medical Director, Dr. Kobina Awotwe Wiredu, and the supervisory team led by Prof. James Ben Hayfron-Acquah, Dr. Twum Frimpong, and Dr. Gaddafi Abdul-Salaam. This publication is an extract from academic research that is self-funded by the student.

Journal

Advances in Public Health, Hindawi Publishing Corporation

Published: Jan 3, 2023
