Evaluating the Heterogeneity of Logistic Regression Models to Predict Coronary Artery Disease Status

Description
Coronary artery disease (CAD) is one of the most commonly diagnosed heart diseases globally, affecting about 5% of adults over the age of twenty [1]. Lifestyle changes can reduce the risk of developing CAD and are especially important for individuals with high genetic risk [1]. In this study, we sought to predict the likelihood of developing CAD using genetic, demographic, and clinical variables. Leveraging genetic and clinical data from the UK Biobank on over 500,000 individuals, we identified 500 individuals genetically similar to a target individual and a separate group of 500 genetically dissimilar individuals. This process was repeated for 10 target individuals as a proof of concept. CAD-related variables, including age, relevant clinical factors, and polygenic risk score, were then used to train models predicting CAD status within the genetically similar and genetically dissimilar groups, in order to determine which group predicts the likelihood of CAD more accurately. Genetic similarity to each target individual was computed using the Mahalanobis distance. To reduce heterogeneity between sexes and races, the analyses were restricted to British Caucasian males. Models trained on the more similar individuals demonstrated better predictive performance: the area under the receiver operating characteristic curve (AUC) was significantly higher for the ‘similar’ groups than for the ‘dissimilar’ groups (AUC = 0.67 vs. 0.65, respectively; p < 0.05). These findings support the potential of precision prevention strategies: a predictive model of disease for a given target individual should be built from individuals more similar to that target, even within an otherwise homogeneous group (e.g., British Caucasians). Although intuitive, this practice is not routine.
Further validation and exploration of additional predictors are warranted to enhance the predictive accuracy and applicability of the model.
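
The similarity-based grouping described above can be illustrated with a minimal sketch. The abstract does not state which genetic features entered the Mahalanobis distance or how the covariance was estimated, so the code below assumes each individual is represented by a short numeric feature vector and uses the sample covariance of the cohort. The function names (covariance, solve, mahalanobis, split_by_similarity) are hypothetical and not taken from the study, and the selection of the k nearest and k farthest individuals is one plausible reading of "similar" and "dissimilar", not the authors' confirmed procedure.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis(u, v, cov):
    """Mahalanobis distance sqrt((u - v)^T cov^{-1} (u - v))."""
    d = [a - b for a, b in zip(u, v)]
    # Solving cov x = d avoids forming the inverse explicitly.
    return math.sqrt(sum(di * xi for di, xi in zip(d, solve(cov, d))))


def split_by_similarity(target, cohort, k):
    """Return indices of the k nearest and the k farthest cohort members."""
    cov = covariance(cohort)
    order = sorted(range(len(cohort)),
                   key=lambda i: mahalanobis(cohort[i], target, cov))
    return order[:k], order[-k:]
```

With k = 500 and one call per target individual, this would yield the two groups on which separate models are then trained and compared by AUC. For real cohorts, an array library would normally replace the hand-written linear algebra; the pure-Python version is used here only to keep the sketch self-contained.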
Date Created
2024-05
Agent