TY - JOUR
T1 - MOSAIC
T2 - An Artificial Intelligence-Based Framework for Multimodal Analysis, Classification, and Personalized Prognostic Assessment in Rare Cancers
AU - D'Amico, Saverio
AU - Dall'Olio, Lorenzo
AU - Rollo, Cesare
AU - Alonso, Patricia
AU - Prada-Luengo, Iñigo
AU - Dall'Olio, Daniele
AU - Sala, Claudia
AU - Sauta, Elisabetta
AU - Asti, Gianluca
AU - Lanino, Luca
AU - Maggioni, Giulia
AU - Campagna, Alessia
AU - Zazzetti, Elena
AU - Delleani, Mattia
AU - Bicchieri, Maria Elena
AU - Morandini, Pierandrea
AU - Savevski, Victor
AU - Arroyo, Borja
AU - Parras, Juan
AU - Zhao, Lin Pierre
AU - Platzbecker, Uwe
AU - Diez-Campelo, Maria
AU - Santini, Valeria
AU - Fenaux, Pierre
AU - Haferlach, Torsten
AU - Krogh, Anders
AU - Zazo, Santiago
AU - Fariselli, Piero
AU - Sanavia, Tiziana
AU - Della Porta, Matteo Giovanni
AU - Castellani, Gastone
N1 - Publisher Copyright: © 2024 by American Society of Clinical Oncology.
PY - 2024
Y1 - 2024
N2 - PURPOSE: Rare cancers constitute over 20% of human neoplasms, often affecting patients with unmet medical needs. The development of effective classification and prognostication systems is crucial to improve the decision-making process and drive innovative treatment strategies. We have created and implemented MOSAIC, an artificial intelligence (AI)-based framework designed for multimodal analysis, classification, and personalized prognostic assessment in rare cancers. Clinical validation was performed on myelodysplastic syndrome (MDS), a rare hematologic cancer with clinical and genomic heterogeneities. METHODS: We analyzed 4,427 patients with MDS divided into training and validation cohorts. Deep learning methods were applied to integrate and impute clinical/genomic features. Clustering was performed by combining Uniform Manifold Approximation and Projection for Dimension Reduction + Hierarchical Density-Based Spatial Clustering of Applications with Noise (UMAP + HDBSCAN) methods, compared with the conventional Hierarchical Dirichlet Process (HDP). Linear and AI-based nonlinear approaches were compared for survival prediction. Explainable AI (Shapley Additive Explanations approach [SHAP]) and federated learning were used to improve the interpretation and the performance of the clinical models, integrating them into distributed infrastructure. RESULTS: UMAP + HDBSCAN clustering obtained a more granular patient stratification, achieving a higher average silhouette coefficient (0.16) with respect to HDP (0.01) and higher balanced accuracy in cluster classification by Random Forest (92.7% ± 1.3% and 85.8% ± 0.8%). AI methods for survival prediction outperform conventional statistical techniques and the reference prognostic tool for MDS. Nonlinear Gradient Boosting Survival stands out in the internal (Concordance-Index [C-Index], 0.77; SD, 0.01) and external validation (C-Index, 0.74; SD, 0.02). SHAP analysis revealed that similar features drove patients' subgroups and outcomes in both training and validation cohorts. Federated implementation improved the accuracy of developed models. CONCLUSION: MOSAIC provides an explainable and robust framework to optimize classification and prognostic assessment of rare cancers. AI-based approaches demonstrated superior accuracy in capturing genomic similarities and providing individual prognostic information compared with conventional statistical methods. Its federated implementation ensures broad clinical application, guaranteeing high performance and data protection.
AB - PURPOSE: Rare cancers constitute over 20% of human neoplasms, often affecting patients with unmet medical needs. The development of effective classification and prognostication systems is crucial to improve the decision-making process and drive innovative treatment strategies. We have created and implemented MOSAIC, an artificial intelligence (AI)-based framework designed for multimodal analysis, classification, and personalized prognostic assessment in rare cancers. Clinical validation was performed on myelodysplastic syndrome (MDS), a rare hematologic cancer with clinical and genomic heterogeneities. METHODS: We analyzed 4,427 patients with MDS divided into training and validation cohorts. Deep learning methods were applied to integrate and impute clinical/genomic features. Clustering was performed by combining Uniform Manifold Approximation and Projection for Dimension Reduction + Hierarchical Density-Based Spatial Clustering of Applications with Noise (UMAP + HDBSCAN) methods, compared with the conventional Hierarchical Dirichlet Process (HDP). Linear and AI-based nonlinear approaches were compared for survival prediction. Explainable AI (Shapley Additive Explanations approach [SHAP]) and federated learning were used to improve the interpretation and the performance of the clinical models, integrating them into distributed infrastructure. RESULTS: UMAP + HDBSCAN clustering obtained a more granular patient stratification, achieving a higher average silhouette coefficient (0.16) with respect to HDP (0.01) and higher balanced accuracy in cluster classification by Random Forest (92.7% ± 1.3% and 85.8% ± 0.8%). AI methods for survival prediction outperform conventional statistical techniques and the reference prognostic tool for MDS. Nonlinear Gradient Boosting Survival stands out in the internal (Concordance-Index [C-Index], 0.77; SD, 0.01) and external validation (C-Index, 0.74; SD, 0.02). SHAP analysis revealed that similar features drove patients' subgroups and outcomes in both training and validation cohorts. Federated implementation improved the accuracy of developed models. CONCLUSION: MOSAIC provides an explainable and robust framework to optimize classification and prognostic assessment of rare cancers. AI-based approaches demonstrated superior accuracy in capturing genomic similarities and providing individual prognostic information compared with conventional statistical methods. Its federated implementation ensures broad clinical application, guaranteeing high performance and data protection.
U2 - 10.1200/CCI.24.00008
DO - 10.1200/CCI.24.00008
M3 - Journal article
C2 - 38875514
AN - SCOPUS:85196242587
SN - 2473-4276
VL - 8
JO - JCO Clinical Cancer Informatics
JF - JCO Clinical Cancer Informatics
M1 - 00061
ER -