Estimates of the basic reproduction number for rubella using seroprevalence data and indicator-based approaches

2021 
The basic reproduction number (R0) of an infection determines the impact of its control. For many endemic infections, R0 is often estimated from appropriate country-specific seroprevalence data. Studies sometimes pool estimates from the same region for settings lacking seroprevalence data, but the reliability of this approach is unclear. Plausibly, indicator-based approaches could predict R0 for such settings. We calculated R0 for rubella for 98 settings and correlated its value against 66 demographic, economic, education, housing and health-related indicators. We also trained a random forest regression algorithm using these indicators as the input and R0 as the output. We used the mean-square error to compare the performances of the random forest, simple linear regression and a regional averaging method in predicting R0 using 4-fold cross validation. R0 was <5, 5-10 and >10 for 81, 14 and 3 settings respectively, with no apparent regional differences; in the limited available data, it was usually lower for rural than for urban areas. R0 was most correlated with educational attainment and household indicators for the Pearson and Spearman correlation coefficients respectively, and with poverty-related indicators followed by the crude death rate when considering the Maximum Information Coefficient, although each correlation was relatively weak (Pearson correlation coefficient: 0.4, 95% CI: (0.24, 0.48) for educational attainment). Depending on the subsets of training indicators and studies, a random forest did not perform better in predicting R0 than simple linear regression, and neither outperformed a regional averaging approach. R0 for rubella is typically low and using indicators to estimate its value is not straightforward. A regional averaging approach may provide as reliable an estimate of R0 for settings lacking seroprevalence data as one based on indicators.
The findings may be relevant for other infections and for studies estimating the disease burden and the impact of interventions for settings lacking seroprevalence data.
Author Summary
The basic reproduction number (R0) of an infection, defined as the average number of secondary infectious people resulting from the introduction of an infectious person into a totally susceptible population, determines how easily the infection can be controlled. For many established endemic infections, R0 is estimated using data describing the presence of antibodies in a population, obtained before vaccination was introduced in that population (prevaccination seroprevalence data). For countries lacking such data, R0 is often estimated by pooling estimates from their geographical region. We estimated R0 for rubella for 98 settings with existing prevaccination seroprevalence data and investigated how well simple machine learning regression methods predict R0 from 66 demographic, economic, education, housing and health-related indicators in those same settings. Our results suggest that the indicator data and prediction methods under investigation do not perform better than regional pooling. We discuss possible ways of improving the prediction accuracy. Since research on predicting R0 using socio-economic data is very scarce, our findings may also be relevant to estimating the disease burden and the impact of interventions for other pathogens.
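The comparison described above (predicting R0 with competing methods and scoring them by mean-square error under 4-fold cross validation) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the region labels, the single indicator and all function names are hypothetical, and only regional averaging and one-indicator linear regression are shown. A random forest would need a machine learning library and is omitted here.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mse(predicted, observed):
    """Mean-square error between two equally long sequences."""
    return statistics.fmean((p - o) ** 2 for p, o in zip(predicted, observed))


def predict_regional_mean(train, test):
    """Predict R0 as the training mean of the same region.

    Falls back to the overall training mean for regions unseen in training.
    """
    overall = statistics.fmean(r0 for _, _, r0 in train)
    by_region = {}
    for region, _, r0 in train:
        by_region.setdefault(region, []).append(r0)
    means = {region: statistics.fmean(v) for region, v in by_region.items()}
    return [means.get(region, overall) for region, _, _ in test]


def predict_linear(train, test):
    """Predict R0 by ordinary least squares on a single indicator."""
    xs = [x for _, x, _ in train]
    ys = [r0 for _, _, r0 in train]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0  # constant indicator: predict the mean
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x for _, x, _ in test]


def cross_validated_mse(settings, predictor, k=4, seed=0):
    """Average the held-out MSE of `predictor` over k folds."""
    scores = []
    for held_out in k_fold_indices(len(settings), k, seed):
        held = set(held_out)
        train = [s for i, s in enumerate(settings) if i not in held]
        test = [settings[i] for i in held_out]
        observed = [r0 for _, _, r0 in test]
        scores.append(mse(predictor(train, test), observed))
    return statistics.fmean(scores)


def make_synthetic_settings(n=98, seed=1):
    """Generate (region, indicator, R0) tuples. Purely synthetic, not study data."""
    rng = random.Random(seed)
    regions = ["region_a", "region_b", "region_c", "region_d"]  # hypothetical labels
    settings = []
    for _ in range(n):
        region = rng.choice(regions)
        indicator = rng.uniform(0.0, 1.0)  # stand-in for one scaled indicator
        r0 = max(1.0, 3.0 + 2.0 * indicator + rng.gauss(0.0, 1.5))
        settings.append((region, indicator, r0))
    return settings


if __name__ == "__main__":
    data = make_synthetic_settings()
    for name, predictor in [
        ("regional mean", predict_regional_mean),
        ("linear regression", predict_linear),
    ]:
        print(name, round(cross_validated_mse(data, predictor), 3))
```

Both predictors share the same folds because `cross_validated_mse` uses a fixed seed, so their scores are directly comparable. With real data, each tuple would hold a setting's region, an indicator value and its seroprevalence-based R0 estimate.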