Abstract
Aims
To evaluate the fairness of three established risk prediction models belonging to the QPrediction family: QRISK3, QDiabetes, and QStroke, which estimate 10-year risks for cardiovascular disease, type 2 diabetes, and stroke, respectively.
Methods
We used data from the UK Biobank, a large-scale prospective cohort study comprising 502,131 participants. To assess calibration, we compared the predicted risks with the observed cumulative incidence rates and calculated inverse probability weighted survival Brier scores as a summary metric of model calibration. To assess discrimination, we calculated inverse probability weighted overall and group-specific true positive, true negative, false positive, and false negative rates (TPR, TNR, FPR, FNR). We compared these rates between demographic subgroups and considered prediction parity to be achieved when the values were within a prespecified margin of each other.
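The group-specific rates and the parity check described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the toy data, the 0.10 risk threshold, and the 0.05 parity tolerance are all hypothetical, and the inverse probability weights are passed in as given rather than estimated from a censoring model.

```python
from collections import defaultdict


def weighted_rates(y_true, y_pred, weights):
    """Weighted TPR, FNR, TNR, FPR from binary outcomes and binary predictions."""
    tp = fn = tn = fp = 0.0
    for y, p, w in zip(y_true, y_pred, weights):
        if y == 1 and p == 1:
            tp += w
        elif y == 1:
            fn += w
        elif p == 1:
            fp += w
        else:
            tn += w
    pos, neg = tp + fn, tn + fp
    nan = float("nan")
    return {
        "TPR": tp / pos if pos else nan,
        "FNR": fn / pos if pos else nan,
        "TNR": tn / neg if neg else nan,
        "FPR": fp / neg if neg else nan,
    }


def rates_by_group(y_true, risk, weights, groups, threshold):
    """Dichotomise predicted risk at `threshold`, then compute rates per subgroup."""
    buckets = defaultdict(lambda: ([], [], []))
    for y, r, w, g in zip(y_true, risk, weights, groups):
        ys, ps, ws = buckets[g]
        ys.append(y)
        ps.append(1 if r >= threshold else 0)
        ws.append(w)
    return {g: weighted_rates(ys, ps, ws) for g, (ys, ps, ws) in buckets.items()}


def parity_reached(group_rates, metric, tolerance):
    """Parity holds if the largest gap in `metric` between groups is within `tolerance`."""
    values = [rates[metric] for rates in group_rates.values()]
    return max(values) - min(values) <= tolerance


# Toy data (hypothetical): observed 10-year event, predicted risk,
# precomputed inverse probability weight, and subgroup label.
y_true = [1, 1, 0, 0, 1, 0, 0, 1]
risk = [0.20, 0.05, 0.15, 0.02, 0.30, 0.01, 0.03, 0.04]
weights = [1.0, 1.2, 1.0, 1.0, 1.1, 1.0, 1.0, 1.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

# Assumed illustrative values: risk threshold 0.10, parity tolerance 0.05.
rates = rates_by_group(y_true, risk, weights, groups, threshold=0.10)
for metric in ("TPR", "FNR", "TNR", "FPR"):
    print(metric, parity_reached(rates, metric, tolerance=0.05))
```

In this toy example the two groups have nearly identical FNR but very different TNR, so parity holds for one metric and fails for the other, which is why each rate is checked separately.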
Results
After applying model-specific exclusion criteria, we analyzed data from 405,855 participants for QRISK3, 458,700 for QDiabetes, and 490,471 for QStroke. Overall, all models systematically overpredicted risks in the UK Biobank validation cohorts, with the degree of overprediction varying between the models. Prediction parity was not reached in FNR and TNR with respect to ethnicity, education level, average household income before taxation, or Townsend deprivation score. For immigration status, there were no major differences in the discrimination metrics, and prediction parity was achieved. For QDiabetes, the trends were similar, except for ethnicity, where the pattern was the opposite of that observed for QRISK3.
Conclusions
Inclusion of ethnicity and deprivation in the models may have contributed to more consistent calibration across subgroups. Variables were measured only at baseline, while many social and economic conditions can shift over time. Although including social determinants of health can improve predictive performance, the models are typically deployed in a clinical setting to guide individual-level decision making, which can inadvertently shift responsibility for health outcomes onto patients.
| Original language | English |
|---|---|
| Article number | ckaf180.324 |
| Journal | European Journal of Public Health |
| Volume | 35 |
| Issue number | Supplement_6 |
| Number of pages | 1 |
| ISSN | 1101-1262 |
| Publication status | Published - 2025 |