Scalable relevance ranking algorithm via semantic similarity assessment improves efficiency of medical chart review

Tianrun Cai,Zeling He,Chuan Hong,Yichi Zhang,Yuk Lam Ho,Jacqueline Honerlaw,Alon Geva,Vidul Ayakulangara Panickan,Amanda King,David R. Gagnon,Michael Gaziano,Kelly Cho,Katherine Liao,Tianxi Cai

Scalable relevance ranking algorithm via semantic similarity assessment improves efficiency of medical chart review

2022

Accurately assigning phenotype information to individual patients via computational phenotyping using Electronic Health Records (EHRs) has been seen as the first step towards enabling EHRs for precision medicine research. Chart review labels annotated by clinical experts, also known as “gold standard” labels, are essential for the development and validation of computational phenotyping algorithms. However, given the complexity of EHR systems, the process of chart review is both labor intensive and time consuming. We propose a fully automated algorithm, referred to as pGUESS, to rank EHR notes according to their relevance to a given phenotype. By identifying the most relevant notes, pGUESS can greatly improve the efficiency and accuracy of chart reviews.pGUESS uses prior guided semantic similarity to measure the informativeness of a clinical note to a given phenotype. We first select candidate clinical concepts from a pool of comprehensive medical concepts using public knowledge sources and then derive the semantic embedding vector (SEV) for a reference article () and each note (). The algorithm scores the relevance of a note as the cosine similarity between and .The algorithm was validated against four sets of 200 notes that were manually annotated by clinical experts to assess their informativeness to one of three disease phenotypes. pGUESS algorithm substantially outperform existing unsupervised approaches for classifying the relevance status with respect to both accuracy and scalability across phenotypes. Averaging over the three phenotypes, the rank correlation between the algorithm ranking and gold standard label was 0.64 for pGUESS, but only 0.47 and 0.35 for the next two best performing algorithms. pGUESS is also much more computationally scalable compared to existing algorithms.pGUESS algorithm can substantially reduce the burden of chart review and holds potential in improving the efficiency and accuracy of human annotation.

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations