Validation of an Internationally Derived Patient Severity Phenotype to Support COVID-19 Analytics from Electronic Health Record Data.

Jeffrey G. Klann, Griffin M. Weber, Hossein Estiri, Bertrand Moal, Paul Avillach, Chuan Hong, Victor Castro, Thomas Maulhardt, Amelia L. M. Tan, Alon Geva, Brett K. Beaulieu-Jones, Alberto Malovini, Andrew M. South, Shyam Visweswaran, Gilbert S. Omenn, Kee Yuan Ngiam, Kenneth D. Mandl, Martin Boeker, Karen L. Olson, Danielle L. Mowery, Michele Morris, Robert W. Follett, David A. Hanauer, Riccardo Bellazzi, Jason H. Moore, Ne-Hooi Will Loh, Douglas S. Bell, Kavishwar B. Wagholikar, Luca Chiovato, Valentina Tibollo, Siegbert Rieg, Anthony L.L.J. Li, Vianney Jouhet, Emily Schriver, Malarkodi J. Samayamuthu, Zongqi Xia, Meghan Hutch, Yuan Luo, Isaac S. Kohane, Gabriel A. Brat, Shawn N. Murphy

2021

Introduction: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing COVID-19 with federated analyses of electronic health record (EHR) data.

Objective: We sought to develop and validate a computable phenotype for COVID-19 severity.

Methods: Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of six code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also piloted an alternative machine-learning approach at one site and compared its selected predictors of severity to the 4CE phenotype.

Results: The full 4CE severity phenotype had a pooled sensitivity of 0.73 and a pooled specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity varied widely, by up to 0.65 across sites. At one pilot site, the expert-derived phenotype had a mean AUC of 0.903 (95% CI: 0.886, 0.921), compared to an AUC of 0.956 (95% CI: 0.952, 0.959) for the machine-learning approach. Billing codes were poor proxies of ICU admission, with precision and recall as low as 49% compared to chart review.

Discussion: We developed a severity phenotype using six code classes that proved resilient to coding variability across international institutions. In contrast, machine-learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly due to heterogeneous pandemic conditions.

Conclusion: We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
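To make the validation metrics concrete, the following is a minimal sketch of how sensitivity and specificity of a binary severity flag can be computed against a binary outcome such as ICU admission and/or death. It is not the 4CE implementation: the function name `sensitivity_specificity` and the eight-patient toy data are hypothetical and chosen only for illustration, and the abstract does not describe how the pooled estimates across sites were obtained.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    flagged: Iterable[bool], outcome: Iterable[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a binary flag against a binary outcome.

    flagged: True if the phenotype marks the patient as severe (hypothetical input).
    outcome: True if the patient met the reference outcome (e.g. ICU and/or death).
    """
    tp = fn = fp = tn = 0
    for is_flagged, has_outcome in zip(flagged, outcome):
        if has_outcome and is_flagged:
            tp += 1  # severe outcome, correctly flagged
        elif has_outcome:
            fn += 1  # severe outcome, missed by the flag
        elif is_flagged:
            fp += 1  # no severe outcome, but flagged
        else:
            tn += 1  # no severe outcome, not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without the outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only; these are NOT study results.
toy_flagged = [True, True, False, True, False, False, True, False]
toy_outcome = [True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(toy_flagged, toy_outcome)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example, two of the three patients with the outcome are flagged and three of the five patients without it are not flagged, which is the same kind of calculation that underlies the per-site figures reported above.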
