AI supported fetal echocardiography with quality assessment

Caroline A. Taksoee-Vester*, Kamil Mikolaj, Zahra Bashir, Anders N. Christensen, Olav B. Petersen, Karin Sundberg, Aasa Feragen, Morten B.S. Svendsen, Mads Nielsen, Martin G. Tolsgaard

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


Abstract

This study aimed to develop a deep learning model to assess the quality of fetal echocardiography and to perform prospective clinical validation. The model was trained on data from the 18–22-week anomaly scan conducted in seven hospitals from 2008 to 2018. Prospective validation involved 100 patients from two hospitals. A total of 5363 images from 2551 pregnancies were used for training and validation. The model's segmentation accuracy depended on image quality, measured by a quality score (QS). It achieved an overall average accuracy of 0.91 (SD 0.09) across the test set, with images having above-average QS scoring 0.97 (SD 0.03). During prospective validation of 192 images, clinicians rated 44.8% (SD 9.8) of images as equal in quality, favored auto-captured images in 18.69% (SD 5.7) of cases, and preferred manually captured ones in 36.51% (SD 9.0). Images with above-average QS showed better agreement with fetal medicine experts on segmentations (p < 0.001) and QS (p < 0.001). Auto-capture saved additional planes beyond protocol requirements, resulting in more comprehensive echocardiographies. Low QS had an adverse effect on both model performance and clinicians' agreement with model feedback. The findings highlight the importance of developing and evaluating AI models based on ‘noisy’ real-life data rather than pursuing the highest accuracy possible with retrospective academic-grade data.
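The stratified reporting in the abstract (accuracy overall versus accuracy for images with above-average QS) can be illustrated with a minimal sketch. It assumes each image has one quality score and one accuracy value, and that "above average" means strictly greater than the mean QS of the set; neither assumption is confirmed by the abstract. The function name `summarize_by_quality` and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def summarize_by_quality(records):
    """Split (quality_score, accuracy) pairs at the mean QS and summarize each group.

    `records` is a hypothetical list of per-image pairs; the split rule
    (strictly above the mean QS) is an assumption, not the paper's definition.
    """
    threshold = mean(qs for qs, _ in records)
    groups = {"above_average": [], "at_or_below_average": []}
    for qs, acc in records:
        key = "above_average" if qs > threshold else "at_or_below_average"
        groups[key].append(acc)

    summary = {}
    for name, accs in groups.items():
        summary[name] = {
            "n": len(accs),
            "mean": mean(accs) if accs else None,
            # Sample SD needs at least two values.
            "sd": stdev(accs) if len(accs) > 1 else None,
        }
    return threshold, summary


# Made-up illustrative values, not results from the study.
example = [(0.2, 0.70), (0.4, 0.85), (0.6, 0.95), (0.8, 0.99)]
threshold, summary = summarize_by_quality(example)
for name, stats in summary.items():
    print(name, stats)
```

Reporting the mean and SD per group, rather than a single pooled figure, makes the dependence of model performance on image quality visible, which is the pattern the abstract describes.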

Original language: English
Article number: 5809
Journal: Scientific Reports
Volume: 14
Issue number: 1
Number of pages: 9
ISSN: 2045-2322
DOIs
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© The Author(s) 2024.
