Validation of a Derived International Patient Severity Algorithm to Support COVID-19 Analytics from Electronic Health Record Data

Jeffrey G. Klann, Griffin M. Weber, Hossein Estiri, Bertrand Moal, Paul Avillach, Chuan Hong, Víctor H. Castro, Thomas Maulhardt, Amelia Lm Tan, Alon Geva, Brett K. Beaulieu-Jones, Alberto Malovini, Andrew M. South, Shyam Visweswaran, Gilbert S. Omenn, Kee Yuan Ngiam, Kenneth D. Mandl, Martin Boeker, Karen L. Olson, Danielle L Mowery, Michele Morris, Robert W. Follett, David A. Hanauer, Riccardo Bellazzi, Jason H. Moore, Ne-Hooi Will Loh, Douglas S. Bell, Kavishwar B. Wagholikar, Luca Chiovato, Valentina Tibollo, Siegbert Rieg, Anthony L.L.J. Li, Vianney Jouhet, Emily Schriver, Malarkodi J. Samayamuthu, Zongqi Xia, Isaac S. Kohane, Gabriel A. Brat, Shawn N. Murphy

2020

Introduction. The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) comprises hundreds of hospitals internationally that use a federated computational approach to conduct COVID-19 research with electronic health record (EHR) data. Objective. We sought to develop and validate a standard definition of COVID-19 severity from readily accessible EHR data across the Consortium. Methods. We developed an EHR-based severity algorithm and validated it on patient hospitalization data from 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also used a machine learning approach to compare selected predictors of severity to the 4CE algorithm at one site. Results. The 4CE severity algorithm performed with a pooled sensitivity of 0.73 and a specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of single code categories for acuity was unacceptably inaccurate, varying by up to 0.65 across sites. A multivariate machine learning approach identified codes yielding a mean AUC of 0.956 (95% CI: 0.952, 0.959), compared to 0.903 (95% CI: 0.886, 0.921) using expert-derived codes. Billing codes were poor proxies of ICU admission, with 49% precision and recall compared against chart review at one partner institution. Discussion. We developed a proxy measure of severity that proved resilient to coding variability internationally by using a set of 6 code classes. In contrast, machine-learning approaches may tend to overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold standard outcomes, possibly due to pandemic conditions. Conclusion. We developed an EHR-based algorithm for COVID-19 severity and validated it at 12 international sites.
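As an illustration of the validation metrics reported above, the following is a minimal sketch of how sensitivity and specificity could be computed at a single site when a binary severity flag is compared against the combined outcome of ICU admission and/or death. The `Patient` record, its field names, and the toy cohort are hypothetical and are not taken from the 4CE data or code; the sketch also does not reproduce how the paper pooled results across sites.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Patient:
    """Hypothetical per-patient record; field names are assumptions."""
    severe_flag: bool    # output of an EHR-based severity algorithm
    icu_or_death: bool   # reference outcome: ICU admission and/or death


def sensitivity_specificity(patients: Iterable[Patient]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of severe_flag against icu_or_death."""
    tp = fn = tn = fp = 0
    for p in patients:
        if p.icu_or_death:
            if p.severe_flag:
                tp += 1   # flagged severe and had the outcome
            else:
                fn += 1   # missed: had the outcome but was not flagged
        else:
            if p.severe_flag:
                fp += 1   # flagged severe without the outcome
            else:
                tn += 1   # correctly not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy cohort (made-up values, for illustration only):
# 4 patients with the outcome, 3 of them flagged;
# 5 patients without the outcome, 1 of them flagged.
cohort = (
    [Patient(True, True)] * 3
    + [Patient(False, True)] * 1
    + [Patient(False, False)] * 4
    + [Patient(True, False)] * 1
)
sens, spec = sensitivity_specificity(cohort)
```

In this toy cohort, sensitivity is 3 of 4 outcome-positive patients and specificity is 4 of 5 outcome-negative patients. The same function could be run per site to see how much sensitivity varies across sites.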
