In:
International Journal of Epidemiology, Oxford University Press (OUP), Vol. 50, No. Supplement_1 (2021-09-01)
Abstract:
Background: Major barriers remain to incorporating artificial intelligence into epidemiology, particularly in data interpretation. We therefore applied two highly interpretable machine-learning methods, Random Forest (RF) and Sparse Logistic Regression (SLR), to a large-scale health check-up dataset and examined the advantages of building prediction models with them.
Methods: This study involved 392,791 participants who underwent health check-ups in Japan from 1999 to 2018. Participants who received diabetes treatment or had an HbA1c level of 6.5% or higher were excluded. The objective variable was type 2 diabetes onset over five years. Each prediction model was built using 26 health status items recorded over three consecutive years. We compared the predictive power of three analytical methods: RF, SLR, and multivariate stepwise logistic regression (MSLR) as the conventional method. Variable Importance (VI) was calculated in the RF analysis, and Standard Regression Coefficients (SRC) were calculated in the SLR and MSLR analyses.
Results: Predictive accuracy was highest for the SLR model (AUC: 0.955), followed by the RF model (AUC: 0.949) and then the MSLR model (AUC: 0.939). In the RF model, blood glucose, HbA1c, height, red blood cells, and aspartate transaminase showed higher predictive power. In the SLR model, HbA1c, blood glucose, systolic blood pressure, HDL-cholesterol, and age had higher SRC.
Conclusions: Machine-learning techniques enable more accurate diabetes risk predictions than existing methods and suggest new ways of identifying associated predictors.
Key messages: Applying machine-learning methods to health check-up data achieves high accuracy in predicting type 2 diabetes while maintaining data interpretability.
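The abstract ranks its three models by AUC. As a reminder of what that metric measures, the following is a minimal Python sketch, not the authors' code, that computes the area under the ROC curve from the rank-sum (Mann-Whitney U) identity. The function name `auc`, the six toy outcomes, and both sets of risk scores are hypothetical and were invented only for illustration; they are unrelated to the study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 outcomes (1 = event occurred)
    scores: sequence of predicted risks, same length as labels
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n

    # Assign 1-based ranks, giving tied scores their average rank.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    # U statistic for the positive class, normalised by the number of
    # positive/negative pairs: the probability that a randomly chosen positive
    # case receives a higher score than a randomly chosen negative case.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data: six participants, two competing risk models.
outcomes = [0, 0, 1, 0, 1, 1]
model_a_risk = [0.10, 0.20, 0.30, 0.40, 0.80, 0.90]
model_b_risk = [0.10, 0.20, 0.35, 0.30, 0.80, 0.90]

for name, risk in [("model A", model_a_risk), ("model B", model_b_risk)]:
    print(f"{name}: AUC = {auc(outcomes, risk):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from non-cases, which is the scale on which the reported values of 0.939 to 0.955 should be read.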
Type of Medium:
Online Resource
ISSN:
0300-5771, 1464-3685
DOI:
10.1093/ije/dyab168.515
Language:
English
Publisher:
Oxford University Press (OUP)
Publication Date:
2021
ZDB-ID:
1494592-7