The probability distribution of the reconstructed phylogenetic tree with occurrence data

2020 
Abstract: We study the problem of computing the probability distribution of phylogenetic trees that commonly arise in areas ranging from epidemiology to macroevolution. We focus on homogeneous birth-death trees with incomplete sampling and consider observations from three distinct sampling schemes. First, individuals can be sampled and removed through time and included in the tree. Second, individuals can be recorded as occurrences, which are sampled and removed through time but not included in the tree. Third, extant individuals can be sampled and included in the tree. The outcome of the process thus consists of the reconstructed phylogenetic tree spanning all individuals sampled and included in the tree, together with a timeline of occurrence events that are not placed along the tree. We derive a formula for computing the joint probability density of this outcome, which can readily be used to perform maximum likelihood or Bayesian estimation of the parameters of the model. In the context of epidemiology, our probability density enables the estimation of transmission rates through a joint analysis of epidemiological case count data and phylogenetic trees reconstructed from pathogen sequences. Within macroevolution, our equations form the basis for combining fossil occurrences from paleontological databases with extant species phylogenies to estimate speciation and extinction rates. This work provides the theoretical framework for bridging not only the gap between phylogenetics and epidemiology, but also that between phylogenetics and paleontology.
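
The three sampling schemes described in the abstract can be illustrated with a minimal forward simulation. This is only a sketch under stated assumptions, not the authors' method: it tracks lineage counts and event times rather than the tree itself, and the function name `simulate_outcome` and the parameter names (`birth`, `death`, `psi` for sampling into the tree, `omega` for occurrence sampling, `rho` for the extant sampling probability) are hypothetical labels that do not appear in the abstract.

```python
import random


def simulate_outcome(birth, death, psi, omega, rho, t_end, seed=None,
                     max_events=100_000):
    """Simulate a homogeneous birth-death process with three sampling schemes.

    All parameter names are illustrative assumptions:
      birth, death : per-lineage birth and death rates
      psi          : per-lineage rate of sampling with removal, included in the tree
      omega        : per-lineage rate of occurrence sampling with removal,
                     not included in the tree
      rho          : probability that each lineage alive at t_end is sampled
    Time runs forward from 0 (one initial lineage) to t_end.
    """
    total = birth + death + psi + omega
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    rng = random.Random(seed)
    t, n, events = 0.0, 1, 0
    tree_samples, occurrences = [], []

    while n > 0:
        # Waiting time to the next event among n independent lineages.
        t += rng.expovariate(n * total)
        if t >= t_end:
            break
        events += 1
        if events > max_events:
            raise RuntimeError("event budget exceeded before reaching t_end")
        u = rng.random() * total
        if u < birth:
            n += 1                      # birth: one more lineage
        elif u < birth + death:
            n -= 1                      # unobserved death
        elif u < birth + death + psi:
            n -= 1                      # sampled, removed, part of the tree
            tree_samples.append(t)
        else:
            n -= 1                      # sampled, removed, occurrence only
            occurrences.append(t)

    # Extant sampling: each surviving lineage is sampled with probability rho.
    extant_sampled = sum(1 for _ in range(n) if rng.random() < rho)
    return {
        "tree_samples": tree_samples,
        "occurrences": occurrences,
        "extant_total": n,
        "extant_sampled": extant_sampled,
    }
```

For example, `simulate_outcome(2.0, 1.0, 0.5, 0.5, 0.8, t_end=3.0, seed=1)` returns the times of tree-included samples, the occurrence timeline, and the number of extant lineages sampled at the end. Reconstructing the actual tree topology would require tracking ancestry, which this sketch deliberately omits.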