Model-based Algorithms for Detecting Peripheral Artery Disease Using Administrative Data From an Electronic Health Record Data System (Preprint)

2020 
BACKGROUND Peripheral artery disease (PAD) affects 8-10 million Americans, who face significantly elevated risks of both mortality and major limb events (such as amputation). Unfortunately, PAD is relatively under-diagnosed, under-treated, and under-researched, leading to wide variations in treatment patterns and outcomes. Efforts to improve PAD care and outcomes have been hampered by persistent difficulties identifying PAD patients for clinical and investigatory purposes.
OBJECTIVE The goal was to develop and validate a model-based algorithm to detect patients with PAD using data from an electronic health record (EHR) system.
METHODS An initial query of the EHR in a large health system identified all patients with PAD-related diagnosis codes for any encounter during the study period. Clinical adjudication of PAD diagnosis was performed by chart review on a random subgroup. A binary logistic regression model to predict PAD was built and validated in the adjudicated patients using a Least Absolute Shrinkage and Selection Operator (LASSO) approach. The algorithm was then applied to the non-sampled records to further evaluate its performance.
RESULTS The initial EHR data query using 406 diagnostic codes yielded 15,406 patients, of whom 2,500 were randomly selected for adjudication of ground truth PAD status. After removing rarely used and never-used codes, 108 code flags remained. We entered these code flags plus administrative encounter, imaging, procedure, and specialist flags into a LASSO model. The AUC for this model was 0.862.
CONCLUSIONS The algorithm we constructed has two main advantages over other approaches to PAD patient identification. First, it was derived from a broad population of patients with many different PAD manifestations and treatment pathways across a large health system. Second, our model does not rely on clinical notes and can be applied in situations in which only administrative billing data (e.g. large administrative datasets) are available. A combination of diagnosis codes and administrative flags can accurately identify patients with PAD in large cohorts.
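To illustrate the kind of model the METHODS describe, the following is a minimal sketch of an L1-penalized (LASSO) logistic regression fitted to binary flags and scored by AUC. It is not the authors' implementation: the cohort is synthetic, the eight flags are hypothetical stand-ins for diagnosis-code and administrative flags, and the penalty strength, step size, and iteration count are assumptions chosen only for the demonstration. The fit uses plain proximal gradient descent rather than any particular statistics package, and the AUC it prints says nothing about the 0.862 reported above.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict_proba(X, weights, intercept):
    """Predicted probability of the positive class for each row of flags."""
    return [sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) for row in X]


def fit_lasso_logistic(X, y, lam=0.02, step=0.5, n_iter=500):
    """Proximal gradient descent on mean log-loss plus lam * sum(|w|).

    The intercept is left unpenalized. lam, step and n_iter are assumed values.
    """
    n, p = len(X), len(X[0])
    weights, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        probs = predict_proba(X, weights, intercept)
        errors = [pr - yi for pr, yi in zip(probs, y)]
        grad_b = sum(errors) / n
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(p)]
        intercept -= step * grad_b
        weights = [soft_threshold(w - step * g, step * lam)
                   for w, g in zip(weights, grad_w)]
    return weights, intercept


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_cohort(n, rng):
    """Hypothetical data: 8 binary flags, only the first 3 related to the outcome."""
    true_w = [2.0, 1.5, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0]
    X, y = [], []
    for _ in range(n):
        row = [1 if rng.random() < 0.4 else 0 for _ in true_w]
        p = sigmoid(-2.0 + sum(w * x for w, x in zip(true_w, row)))
        X.append(row)
        y.append(1 if rng.random() < p else 0)
    return X, y


rng = random.Random(0)
X, y = make_synthetic_cohort(600, rng)
X_train, y_train = X[:400], y[:400]   # stands in for the adjudicated sample
X_test, y_test = X[400:], y[400:]     # held-out rows for evaluation

weights, intercept = fit_lasso_logistic(X_train, y_train)
test_auc = auc(y_test, predict_proba(X_test, weights, intercept))
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("flags kept by the L1 penalty:", selected)
print("held-out AUC on synthetic data:", round(test_auc, 3))
```

The soft-thresholding step is what lets the penalty set weak coefficients exactly to zero, which is why a LASSO fit doubles as a selection step over a large set of candidate flags. In practice the penalty strength would normally be tuned by cross-validation rather than fixed as it is here.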