Soft Windowing Application to Improve Analysis of High-throughput Phenotyping Data

2019 
High-throughput phenomic projects typically generate complex data from small treatment and large control groups. These control groups increase the power of the analyses but introduce variation over time. A method is needed to locally select controls that maximise the analytic power while minimising the noise level from unspecified environmental factors. Here we introduce "soft windowing", a methodological approach that selects a time window that accommodates the most appropriate controls for analysis. Using phenotype data from the International Mouse Phenotyping Consortium (IMPC), adaptive windows are applied so that control data collected locally to mutants are assigned the maximal weight, while data collected earlier or later are assigned less weight. We apply this method to IMPC data and compare the results with those obtained by applying a standard non-windowed approach. Following a resampling approach in which samples of equal size and structure to that of mutants are drawn from control data, we demonstrate a 10% reduction in false positives across 2.5 million analyses. Further, we applied the method as part of the IMPC statistical pipeline that seeks to establish gene-phenotype associations by comparing mutant vs control data. We report an increase of 30% in the total number of significant p-values, as well as 106 vs 99 disease models with the soft-windowed and non-windowed approaches, respectively, from a set of 2,082 mutant mouse lines. Our method is generalisable and can benefit other large-scale phenomic projects such as the UK Biobank and the All of Us resources.
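The weighting idea described above can be illustrated with a minimal sketch. This is not the IMPC implementation: the logistic shoulder, the function names (`soft_window_weight`, `weighted_mean`) and the parameters (`half_width`, `sharpness`) are all assumptions chosen for illustration. The sketch only shows how controls measured close in time to the mutants could receive a weight near 1, while controls measured earlier or later are smoothly down-weighted.

```python
import math


def soft_window_weight(t, centre, half_width, sharpness):
    """Hypothetical soft-window weight for a control measured at time t.

    t, centre and half_width are in the same time unit (e.g. days).
    The weight is close to 1 near `centre`, exactly 0.5 at a distance of
    `half_width`, and decays smoothly towards 0 further away.
    The logistic shape is an assumption, not the published window function.
    """
    distance = abs(t - centre)
    x = sharpness * (distance - half_width)
    # 1 / (1 + exp(x)) written with tanh so large x cannot overflow
    return 0.5 * (1.0 - math.tanh(x / 2.0))


def weighted_mean(values, weights):
    """Weighted mean of control measurements under the given weights."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Illustrative, made-up control data: measurement day and measured value.
control_days = [0, 60, 95, 100, 110, 180, 400]
control_values = [20.1, 20.4, 21.0, 21.2, 20.9, 22.5, 24.0]

# Mutants assumed to be measured around day 100.
weights = [
    soft_window_weight(day, centre=100, half_width=30, sharpness=0.2)
    for day in control_days
]
local_control_mean = weighted_mean(control_values, weights)
```

In a fuller analysis such weights would typically be passed to a weighted regression rather than a simple mean; the hard alternative (weight 1 inside the window, 0 outside) corresponds to letting `sharpness` grow very large.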