A sparse Bayesian model selection algorithm for forecasting the transmission of COVID-19

2021 
Introduction: Many variations of the Kermack-McKendrick SIR model were proposed in the early stages of the SARS-CoV-2 pandemic to study the transmission of COVID-19. The current state-of-the-art 16-compartment model developed by Tuite et al. (2020) is used to simulate the influence of government policies and to leverage early available clinical information to predict the dynamics of the disease. As much of the world is now experiencing a second wave, and as vaccines have been approved and are being deployed, it is critical to be able to accurately predict the trajectory of cases while integrating information about these new model states and parameters. The challenges for accurate prediction are two-fold: first, the mechanistic model must capture the essential dynamics of the pandemic as well as provide meaningful information on quantities of interest (e.g. demand for hospital resources); second, the model parameters need to be calibrated using epidemiological and clinical data.

Methods: To address the first challenge, we propose a compartmental model that expands upon the model developed by Tuite et al. (2020) to capture the effects of vaccination, reinfection, asymptomatic carriers, inadequate access to hospital resources, and long-term health complications. As the complexity of the model increases, the inference task becomes more difficult and more prone to over-fitting. To address the second challenge, the nonlinear sparse Bayesian learning (NSBL) algorithm is proposed for parameter estimation.

Results: The algorithm is demonstrated on noisy and incomplete synthetic data generated from an SIRS model with three uncertain parameters (the infection rate, the recovery rate, and the rate at which temporary immunity is lost). As an example, Figure 1 shows the calibration of the three uncertain model parameters within a Bayesian framework while avoiding over-fitting by inducing sparsity in the parameters. Assuming there is little prior information available for the parameters, they are first assigned non-informative priors. Before NSBL, the model (red curve) is over-parameterized and fails to predict the decline of the (blue) infection curve. The NSBL algorithm makes use of automatic relevance determination (ARD) priors and finds one of the model parameters to be irrelevant to the dynamics. Removing the irrelevant parameter and re-calibrating enables the model (green curve) to capture the peak of the infection curve.

Conclusion: An optimally calibrated model will allow for the concurrent forecasting of many hypothetical scenarios and provide clinically relevant predictions.
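
To make the synthetic-data setting in the Results concrete, the following is a minimal sketch of an SIRS model with three parameters and of how noisy, incomplete observations of the infection curve might be generated from it. It is not the authors' code and does not implement NSBL. The symbols `beta`, `gamma`, and `xi` (infection rate, recovery rate, and rate of immunity loss), the parameter values, the Gaussian noise model, and the five-day sampling interval are all assumptions chosen for illustration. The standard normalized SIRS equations are assumed: dS/dt = -beta*S*I + xi*R, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I - xi*R.

```python
import random


def sirs_rhs(state, beta, gamma, xi):
    """Right-hand side of the SIRS equations for population fractions (S, I, R)."""
    s, i, r = state
    new_infections = beta * s * i   # S -> I
    recoveries = gamma * i          # I -> R
    waning = xi * r                 # R -> S (temporary immunity is lost)
    return (waning - new_infections,
            new_infections - recoveries,
            recoveries - waning)


def _shift(state, direction, scale):
    """Return state + scale * direction, component-wise."""
    return tuple(x + scale * d for x, d in zip(state, direction))


def simulate_sirs(beta, gamma, xi, i0=0.01, days=100, steps_per_day=10):
    """Integrate the SIRS model with classical RK4; return one (S, I, R) per day."""
    h = 1.0 / steps_per_day
    state = (1.0 - i0, i0, 0.0)
    daily = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1 = sirs_rhs(state, beta, gamma, xi)
            k2 = sirs_rhs(_shift(state, k1, 0.5 * h), beta, gamma, xi)
            k3 = sirs_rhs(_shift(state, k2, 0.5 * h), beta, gamma, xi)
            k4 = sirs_rhs(_shift(state, k3, h), beta, gamma, xi)
            state = tuple(
                x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                for x, a, b, c, d in zip(state, k1, k2, k3, k4)
            )
        daily.append(state)
    return daily


def noisy_observations(daily, noise_sd=0.005, every=5, seed=0):
    """Observe the infected fraction every `every` days with additive Gaussian noise.

    The noise model and sampling interval are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        (day, max(0.0, daily[day][1] + rng.gauss(0.0, noise_sd)))
        for day in range(0, len(daily), every)
    ]


# Hypothetical parameter values, chosen only to produce a clear epidemic peak.
trajectory = simulate_sirs(beta=0.4, gamma=0.1, xi=0.01)
data = noisy_observations(trajectory)
```

In a calibration study of the kind the abstract describes, `data` would play the role of the observed infection curve, and the three parameters would be treated as unknown and inferred from it.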