Evaluating data linkage techniques for the analysis of bloodstream infection in paediatric intensive care

2014 
Errors that occur during linkage of individual-level data from different sources can lead to substantial bias in analyses of linked data. This thesis aims to develop methods for handling linkage error, and to evaluate these methods in the context of using linked administrative data to support randomised controlled trials. Firstly, the thesis describes the process required for linkage of national data on paediatric intensive care unit (PICU) admissions and bloodstream infection (BSI) surveillance (PICANet and LabBase2). I illustrate the complex steps required, from understanding the structure of the data to calculation of probabilistic match weights and evaluation of linkage quality. This provides a generalisable guide for linkage of administrative data in other contexts. Secondly, the thesis develops methods for handling uncertainty in linkage by extending the multiple imputation framework to the context of data linkage, using prior-informed imputation (PII). Comparison of results from traditional probabilistic linkage, standard multiple imputation and PII showed that PII minimised the bias associated with linkage of incomplete identifiers in PICANet and LabBase2. Finally, linked PICANet-LabBase2 data are used to assess the generalisability of results from a trial of standard versus impregnated central venous catheters (CVCs) in PICU. Trial results are not yet available. However, assuming a relative risk of 0.06-0.44 for BSI using impregnated CVCs (based on a meta-analysis in adults), an estimated 163-311 BSIs could be avoided in 2014 by using impregnated CVCs for all children in PICU. This thesis highlights the need to assess the impact of linkage error on results and demonstrates the importance of using alternative statistical methods such as PII for handling linkage error within analysis.
This work addresses the challenges of exploiting administrative data for research and illustrates the value of linking these data to answer research questions that could not otherwise have been answered.
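
As an illustration of the "probabilistic match weights" mentioned above, the following is a minimal sketch of Fellegi-Sunter-style weights, which is one common way of computing them; the abstract does not state the exact formulation used in the thesis. The identifier names, the m- and u-probabilities, and the rule that a missing identifier contributes no evidence are all assumptions made for this example, not values or choices taken from PICANet or LabBase2.

```python
import math

# Hypothetical m- and u-probabilities per identifier (illustrative only):
#   m = P(identifiers agree | records belong to the same child)
#   u = P(identifiers agree | records belong to different children)
PARAMS = {
    "date_of_birth": (0.95, 0.01),
    "sex": (0.98, 0.50),
    "postcode": (0.90, 0.05),
}


def agreement_weight(m, u):
    """Log2 likelihood ratio contributed when an identifier agrees."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Log2 likelihood ratio contributed when an identifier disagrees."""
    return math.log2((1 - m) / (1 - u))


def match_weight(record_a, record_b, params=PARAMS):
    """Sum the per-identifier weights for one candidate record pair.

    Assumption: an identifier missing from either record is skipped,
    so it neither raises nor lowers the total weight.
    """
    total = 0.0
    for field, (m, u) in params.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue
        if a == b:
            total += agreement_weight(m, u)
        else:
            total += disagreement_weight(m, u)
    return total


# Two made-up records: one identifier agrees, one disagrees, one is missing.
admission = {"date_of_birth": "2010-03-01", "sex": "F", "postcode": None}
lab_report = {"date_of_birth": "2010-03-01", "sex": "M", "postcode": "AB1 2CD"}
weight = match_weight(admission, lab_report)
```

Higher weights indicate stronger evidence that two records refer to the same individual. In traditional probabilistic linkage, pairs are classified by comparing the weight with chosen thresholds, which is the step where the linkage errors discussed in the abstract can arise.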