AFA: Computationally efficient Ancestral Frequency estimation in Admixed populations: the Hispanic Community Health Study/Study of Latinos

2021 
We developed a computationally efficient method, Ancestral Frequency estimation in Admixed populations (AFA), to estimate the frequencies of bi-allelic variants in admixed populations with an unlimited number of ancestries. AFA uses maximum likelihood estimation, modeling the conditional probability of carrying an allele given an individual's proportions of genetic ancestry. It can be applied using either global or local ancestry proportions. Simulations mimicking admixture demonstrated the high accuracy of the method. We applied the method to data from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), an admixed population with three predominant continental ancestries: Amerindian, European, and African. The estimated European and African frequencies were highly correlated with the respective gnomAD frequencies (Pearson R² = 0.97-0.99). We provide a genome-wide dataset of the three estimated ancestral allele frequencies in HCHS/SOL for all available variants with an allele frequency between 5% and 95% in at least one of the three ancestral populations.
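To make the estimation idea concrete, the following is a minimal sketch for a single variant, not the authors' implementation. It assumes a binomial model in which each of an individual's two allele copies is the alternate allele with probability equal to the sum, over ancestries, of the ancestry proportion times the ancestral frequency. The abstract does not state the likelihood in this form, nor which optimizer AFA uses. Here the likelihood is maximized with a standard EM update, a substitute chosen because it keeps the estimates inside [0, 1] without an external library. The function names `estimate_ancestral_frequencies` and `simulate_genotypes` are hypothetical.

```python
import random


def estimate_ancestral_frequencies(genotypes, proportions, n_iter=200, tol=1e-6):
    """Maximum likelihood ancestral allele frequencies for one variant.

    genotypes:   alternate-allele counts (0, 1 or 2), one per individual.
    proportions: per-individual ancestry proportions, each summing to 1.
    Assumed model (not taken from the source):
        genotype_i ~ Binomial(2, sum_k proportions[i][k] * freq[k]).
    """
    n_anc = len(proportions[0])
    eps = 1e-9  # keeps probabilities away from exactly 0 or 1
    freqs = [0.5] * n_anc
    for _ in range(n_iter):
        alt = [0.0] * n_anc    # expected alternate alleles attributed to ancestry k
        total = [0.0] * n_anc  # expected alleles of either type attributed to ancestry k
        for g, q in zip(genotypes, proportions):
            p = sum(qk * fk for qk, fk in zip(q, freqs))
            p = min(max(p, eps), 1.0 - eps)
            for k in range(n_anc):
                a = g * q[k] * freqs[k] / p
                r = (2 - g) * q[k] * (1.0 - freqs[k]) / (1.0 - p)
                alt[k] += a
                total[k] += a + r
        new = [
            min(max(a / t, eps), 1.0 - eps) if t > 0 else f
            for a, t, f in zip(alt, total, freqs)
        ]
        change = max(abs(n - f) for n, f in zip(new, freqs))
        freqs = new
        if change < tol:
            break
    return freqs


def simulate_genotypes(true_freqs, n_people, seed=0):
    """Simulate admixed individuals under the same assumed binomial model."""
    rng = random.Random(seed)
    genotypes, proportions = [], []
    for _ in range(n_people):
        # Normalized gamma draws give uniform Dirichlet ancestry proportions.
        raw = [rng.gammavariate(1.0, 1.0) for _ in true_freqs]
        s = sum(raw)
        q = [x / s for x in raw]
        p = sum(qk * fk for qk, fk in zip(q, true_freqs))
        g = sum(1 for _ in range(2) if rng.random() < p)
        genotypes.append(g)
        proportions.append(q)
    return genotypes, proportions


if __name__ == "__main__":
    geno, props = simulate_genotypes([0.1, 0.5, 0.9], n_people=3000, seed=1)
    print(estimate_ancestral_frequencies(geno, props))
```

Under this assumed model, passing global ancestry proportions gives one estimate per ancestry from genome-wide admixture fractions. Local ancestry could be supplied in the same format by using, for each individual, the fraction of the two allele copies at the variant assigned to each ancestry.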