Novel Approach for Parallelizing Pairwise Comparison Problems as Applied to Detecting Segments Identical By Descent in Whole-Genome Data

2020 
Motivation: Pairwise comparison problems arise in many areas of science. In genomics, datasets are already large and getting larger, so operations that require pairwise comparisons, whether on pairs of SNPs or pairs of individuals, are extremely computationally challenging. We propose a generic algorithm for addressing pairwise comparison problems that breaks a large problem (of order n^2 comparisons) into multiple smaller ones (each of order n comparisons), allowing for massive parallelization.

Results: We demonstrated that this procedure is very efficient for calling identical by descent (IBD) segments between all pairs of individuals in the UK Biobank dataset, with a user-time saving of roughly 180-fold over the traditional (non-parallel) approach to detecting such segments. This efficiency should extend to other methods of IBD calling and, more generally, to other pairwise comparison tasks in genomics or other areas of science.

Contact: emmanuel.sapin@colorado.edu

Competing Interest Statement: The authors have declared no competing interest.
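
The abstract does not spell out how the order-n^2 problem is split, so the following is only an illustrative sketch of one way such a decomposition could look, not the authors' algorithm. It assumes a simple "offset" scheme: job k pairs item i with item (i + k) mod n, so each job holds at most n comparisons and the jobs together cover every unordered pair exactly once. The function name `offset_jobs` and the scheme itself are hypothetical.

```python
from itertools import combinations


def offset_jobs(n):
    """Yield independent jobs, each a list of at most n index pairs.

    Hypothetical scheme (an assumption, not taken from the paper):
    job k pairs item i with item (i + k) % n.
    """
    for k in range(1, n // 2 + 1):
        # For even n, offset n/2 would visit every pair twice,
        # so only the first n/2 starting indices are used there.
        count = n // 2 if (n % 2 == 0 and k == n // 2) else n
        yield [tuple(sorted((i, (i + k) % n))) for i in range(count)]


# Sanity check: the jobs partition all n*(n-1)/2 unordered pairs.
n = 7
jobs = list(offset_jobs(n))
covered = sorted(pair for job in jobs for pair in job)
assert covered == list(combinations(range(n), 2))
assert all(len(job) <= n for job in jobs)
```

Because no pair appears in more than one job, each job could in principle be handed to a separate worker without coordination; how the actual method distributes IBD calls across jobs is not stated in the abstract.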