Abstract
This study aimed to develop a deep learning model to assess the quality of fetal echocardiography and to perform prospective clinical validation. The model was trained on data from the 18–22-week anomaly scan conducted in seven hospitals from 2008 to 2018. Prospective validation involved 100 patients from two hospitals. A total of 5363 images from 2551 pregnancies were used for training and validation. The model's segmentation accuracy depended on image quality, measured by a quality score (QS). It achieved an overall average accuracy of 0.91 (SD 0.09) across the test set, with images having above-average QS scoring 0.97 (SD 0.03). During prospective validation of 192 images, clinicians rated 44.8% (SD 9.8) of images as equal in quality, favored auto-captured images in 18.69% (SD 5.7), and preferred manually captured ones in 36.51% (SD 9.0). Images with above-average QS showed better agreement with fetal medicine experts on segmentations (p < 0.001) and QS (p < 0.001). Auto-capture saved additional planes beyond protocol requirements, resulting in more comprehensive echocardiographies. Low QS had an adverse effect on both model performance and clinicians' agreement with model feedback. The findings highlight the importance of developing and evaluating AI models based on ‘noisy’ real-life data rather than pursuing the highest accuracy possible with retrospective academic-grade data.
Original language | English |
---|---|
Article number | 5809 |
Journal | Scientific Reports |
Volume | 14 |
Issue number | 1 |
Number of pages | 9 |
ISSN | 2045-2322 |
DOI | |
Status | Published - 2024 |
Bibliographical note
Funding Information: The project is supported by The Novo Nordisk Foundation through Grant NNFSA170030576, the Danish Regions’ AI Signature Project, through the Centre for Basic Machine Learning Research in Life Science (NNF20OC0062606), and the Pioneer Centre for AI, DNRF grant no. P1.
Publisher Copyright:
© The Author(s) 2024.