TY - JOUR
T1 - A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories
AU - Placido, Davide
AU - Yuan, Bo
AU - Hjaltelin, Jessica X
AU - Zheng, Chunlei
AU - Haue, Amalie D
AU - Chmura, Piotr J
AU - Yuan, Chen
AU - Kim, Jihye
AU - Umeton, Renato
AU - Antell, Gregory
AU - Chowdhury, Alexander
AU - Franz, Alexandra
AU - Brais, Lauren
AU - Andrews, Elizabeth
AU - Marks, Debora S
AU - Regev, Aviv
AU - Ayandeh, Siamack
AU - Brophy, Mary T
AU - Do, Nhan V
AU - Kraft, Peter
AU - Wolpin, Brian M
AU - Rosenthal, Michael H
AU - Fillmore, Nathanael R
AU - Brunak, Søren
AU - Sander, Chris
N1 - © 2023. The Author(s).
PY - 2023
Y1 - 2023
N2 - Pancreatic cancer is an aggressive disease that typically presents late with poor outcomes, indicating a pronounced need for early detection. In this study, we applied artificial intelligence methods to clinical data from 6 million patients (24,000 pancreatic cancer cases) in Denmark (Danish National Patient Registry (DNPR)) and from 3 million patients (3,900 cases) in the United States (US Veterans Affairs (US-VA)). We trained machine learning models on the sequence of disease codes in clinical histories and tested prediction of cancer occurrence within incremental time windows (CancerRiskNet). For cancer occurrence within 36 months, the performance of the best DNPR model has area under the receiver operating characteristic (AUROC) curve = 0.88 and decreases to AUROC (3m) = 0.83 when disease events within 3 months before cancer diagnosis are excluded from training, with an estimated relative risk of 59 for 1,000 highest-risk patients older than age 50 years. Cross-application of the Danish model to US-VA data had lower performance (AUROC = 0.71), and retraining was needed to improve performance (AUROC = 0.78, AUROC (3m) = 0.76). These results improve the ability to design realistic surveillance programs for patients at elevated risk, potentially benefiting lifespan and quality of life by early detection of this aggressive cancer.
U2 - 10.1038/s41591-023-02332-5
DO - 10.1038/s41591-023-02332-5
M3 - Journal article
C2 - 37156936
VL - 29
SP - 1113
EP - 1122
JO - Nature Medicine
JF - Nature Medicine
SN - 1078-8956
ER -