Development and external validation of tools for categorizing diagnosis codes in international hospital data

Sarah L. Malecki*, Anne Loffler, Daniel Tamming, Niklas Dyrby Johansen, Tor Biering-Sørensen, Michael Fralick, Shahmir Sohail, Jessica Shi, Surain B. Roberts, Michael Colacci, Marwa Ismail, Fahad Razak, Amol A. Verma

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Background: The Clinical Classification Software Refined (CCSR) is a tool that groups many thousands of International Classification of Diseases 10th Revision (ICD-10) diagnosis codes into approximately 500 clinically meaningful categories, simplifying analyses. However, CCSR was developed for use in the United States and may not work well with other country-specific ICD-10 coding systems.

Method: We developed an algorithm for semi-automated matching of Canadian ICD-10 codes (ICD-10-CA) to CCSR categories using discharge diagnoses from adult admissions at 7 hospitals between Apr 1, 2010 and Dec 31, 2020, and manually validated the results. We then externally validated our approach using inpatient hospital encounters in Denmark from 2017 to 2018.

Key Results: There were 383,972 Canadian hospital admissions with 5,186 distinct ICD-10-CA diagnosis codes and 1,855,837 Danish encounters with 4,612 ICD-10 diagnosis codes. Only 46.6% of Canadian codes and 49.4% of Danish codes could be directly categorized using the official CCSR tool. Our algorithm facilitated the mapping of 98.5% of all Canadian codes and 97.7% of Danish codes. Validation of our algorithm by clinicians demonstrated excellent accuracy (97.1% and 97.0% in Canadian and Danish data, respectively). Without our algorithm, many common conditions did not match directly to a CCSR category, such as 96.6% of hospital admissions for heart failure.

Conclusion: The GEMINI CCSR matching algorithm (available as an open-source package at https://github.com/GEMINI-Medicine/gemini-ccsr) improves the categorization of Canadian and Danish ICD-10 codes into clinically coherent categories compared to the original CCSR tool. We expect this approach to generalize well to other countries and enable a wide range of research and quality measurement applications.
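To make the difference between direct categorization and a fallback mapping concrete, the following is a minimal sketch in Python. It is not the GEMINI algorithm, whose actual steps are documented in the linked package. The lookup table `CCSR_MAP`, the helper names, and the prefix-based fallback are all assumptions chosen for illustration, and the table entries should be checked against the official CCSR mapping file before any real use.

```python
from typing import Callable, Iterable, Optional

# Hypothetical excerpt of a CCSR lookup table (US ICD-10-CM code -> CCSR category).
# Entries are illustrative only; verify against the official CCSR mapping file.
CCSR_MAP = {
    "I5020": "CIR019",
    "I5030": "CIR019",
    "J189": "RSP002",
}


def direct_match(code: str) -> Optional[str]:
    """Return the CCSR category only if the exact code is in the table."""
    return CCSR_MAP.get(code)


def prefix_match(code: str, min_len: int = 3) -> Optional[str]:
    """Assumed fallback: shorten the code one character at a time and accept
    a category only when every table code sharing that prefix agrees on it."""
    for n in range(len(code), min_len - 1, -1):
        prefix = code[:n]
        categories = {cat for c, cat in CCSR_MAP.items() if c.startswith(prefix)}
        if len(categories) == 1:
            return next(iter(categories))
    return None


def coverage(codes: Iterable[str], matcher: Callable[[str], Optional[str]]) -> float:
    """Fraction of codes for which the matcher returns a category."""
    codes = list(codes)
    return sum(matcher(c) is not None for c in codes) / len(codes)


# Hypothetical country-specific codes: one absent from the table but sharing a
# prefix with table codes, one exact match, and one with no related entry.
local_codes = ["I500", "J189", "Z999"]
direct_rate = coverage(local_codes, direct_match)
fallback_rate = coverage(local_codes, prefix_match)
```

In this toy setup the fallback categorizes more codes than the direct lookup, which mirrors the kind of gap the abstract reports. Any mapping produced by a fallback of this sort would still need clinician review, as the authors did for their own results.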

Original language: English
Article number: 105508
Journal: International Journal of Medical Informatics
Volume: 189
Number of pages: 7
ISSN: 1386-5056
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2024 The Author(s)

Keywords

  • Algorithm
  • Clinical Classification Software Refined (CCSR)
  • Diagnosis codes
  • ICD-10
  • Validation
