A Framework for Automated Extraction of Information for Clinical Decision Support from Images of Flow Charts and Decision Trees

2020 
Decision trees (DTs) are ubiquitous, indispensable tools for clinicians in point-of-care settings, yet little progress has been made on techniques to analyze them automatically. Here we report results from DT-DECON, a general method to extract information and information flow from decision tree images, classify the trees according to automatically generated, maximally discriminative categories, and index them according to localized information content. Our pipeline combines feature engineering and machine learning methods to convert an image file into a corresponding directed graph, in which nodes correspond to boxes and edges correspond to arrows. This graph serves as a standardized representation that is readily machine-interpretable. We develop a methodology to classify the constituent nodes of these graphs according to their processual role, and we use aggregate semantic categorization of the nodes of each type to generate classes and determine class membership for any given population of decision trees. Relevant subgraphs are then enumerated by considering all possible truncated traversals of a decision tree. These paths are stored in tabular form according to bifurcation results, which allows traditional indexing procedures to return specifically relevant subtrees. The algorithm is applied to several classic medical texts within the Elsevier corpus and to a hand-curated corpus of COVID-19-specific decision trees. Several querying methods are demonstrated, including a fuzzy tokenized search, a question-relevancy retrieval system based on universal sentence embeddings, and a chatbot that can navigate decision trees based on user input.
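The enumeration of truncated traversals can be illustrated with a minimal sketch. This is not the DT-DECON implementation: the adjacency-list format, the function name `truncated_traversals`, the row layout, and the toy tree (whose node text is invented and carries no clinical meaning) are all assumptions made for illustration. Each root-anchored path prefix becomes one row that records the nodes visited and the branch taken at each bifurcation.

```python
from typing import Dict, List, Tuple

# Hypothetical graph format: node text -> list of (branch label, child node text).
Graph = Dict[str, List[Tuple[str, str]]]

# Invented toy tree, for illustration only.
TOY_TREE: Graph = {
    "Fever?": [("yes", "Cough?"), ("no", "Discharge")],
    "Cough?": [("yes", "Order test"), ("no", "Monitor")],
}


def truncated_traversals(graph: Graph, root: str) -> List[dict]:
    """Enumerate every path that starts at the root and stops at any node."""
    rows = []
    stack = [([root], [])]  # (nodes visited so far, branch labels taken so far)
    while stack:
        nodes, branches = stack.pop()
        # Every prefix of a full traversal is itself a truncated traversal.
        rows.append({"nodes": nodes, "branches": branches})
        for label, child in graph.get(nodes[-1], []):
            if child in nodes:
                continue  # skip back-edges so cyclic flow charts terminate
            stack.append((nodes + [child], branches + [label]))
    return rows


rows = truncated_traversals(TOY_TREE, "Fever?")
for row in rows:
    print(" -> ".join(row["nodes"]), "| branches:", row["branches"])
```

Rows of this shape could then be loaded into any conventional tabular index. In a tree, the sketch yields one row per node, since each node is reached by exactly one path from the root.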