A comparison of supervised machine learning algorithms for mosquito identification from backscattered optical signals

2020 
Abstract The surveillance of mosquito populations is paramount in the fight against mosquito-borne diseases that affect millions of people every year. Evaluating the efficiency of mitigation methods requires extensive and long-term surveys, which can be costly and time consuming. The recent development of optical sensors gives access to alternative methods for entomological monitoring, but these methods require efficient classification algorithms to be successful. In this contribution, five supervised machine learning algorithms (Linear Discriminant Analysis, Decision Trees, Support Vector Machine, K-Nearest Neighbors and Naive Bayes) are compared for the identification of mosquitoes through optical signals. Based on predictor variables derived from the wing beat frequency and optical cross section of mosquitoes, these algorithms were trained to perform different classification tasks: the identification of species, sex and/or gravidity of mosquitoes present in New Jersey, USA. Results show that the most versatile machine learning algorithm for mosquito identification is Support Vector Machine, which performs on average over all tasks between 0.65 and 7.3% better than the other algorithms. Moreover, Support Vector Machine is the algorithm that performed best on the most complex tasks, more than 2% above the second best; it is therefore the best suited to real-world studies, where several species of mosquito can be expected at a single location. A close second is Linear Discriminant Analysis, which is outperformed by Support Vector Machine by only 0.65% over all tasks and performs best when studying mosquito gravidity. Finally, the Decision Trees algorithm came close to perfect in identifying the sex of a single mosquito species, with 99.9% accuracy, which is 1.3% more than the second-best algorithm on this task.
These results demonstrate that optical sensors, coupled with machine learning, can be a viable alternative or complementary methodology for the monitoring of mosquito populations. Furthermore, this methodology can perform non-intrusive, automatic, time-resolved measurements of insect population dynamics over extended periods of time without the need for laboratory analysis of captured specimens as in most traditional survey methods.
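The kind of comparison described in the abstract can be illustrated with a minimal sketch in pure Python. It is not the authors' pipeline: it compares only two of the five algorithm families (K-Nearest Neighbors and Gaussian Naive Bayes) on synthetic data with two predictor variables standing in for wing beat frequency and optical cross section. The class names, means, spreads, units, train/test split and `k` are all hypothetical placeholders chosen for illustration, not values from the study.

```python
import math
import random
import statistics
from collections import Counter


def make_dataset(n_per_class, seed=0):
    """Synthetic (wing beat frequency, optical cross section) samples.

    All class parameters below are hypothetical, not measured values.
    """
    rng = random.Random(seed)
    params = {
        "class_a": (400.0, 20.0, 1.0, 0.2),  # freq mean, freq sd, size mean, size sd
        "class_b": (600.0, 20.0, 2.0, 0.2),
    }
    data = []
    for label, (f_mu, f_sd, s_mu, s_sd) in params.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(f_mu, f_sd), rng.gauss(s_mu, s_sd)), label))
    rng.shuffle(data)
    return data


def standardize(train, test):
    """Scale each feature to zero mean and unit variance using training statistics only."""
    cols = list(zip(*(x for x, _ in train)))
    mus = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]

    def scale(x):
        return tuple((v - m) / s for v, m, s in zip(x, mus, sds))

    return [(scale(x), y) for x, y in train], [(scale(x), y) for x, y in test]


def knn_predict(train, x, k=5):
    """Majority vote among the k training samples closest to x (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fit_gnb(train):
    """Per-class log prior and per-feature (mean, variance) for Gaussian Naive Bayes."""
    by_class = {}
    for x, y in train:
        by_class.setdefault(y, []).append(x)
    model = {}
    for label, rows in by_class.items():
        stats = [(statistics.fmean(c), statistics.pvariance(c)) for c in zip(*rows)]
        model[label] = (math.log(len(rows) / len(train)), stats)
    return model


def gnb_predict(model, x):
    """Pick the class with the highest log posterior under independent Gaussians."""
    def log_posterior(label):
        log_prior, stats = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
            for v, (mu, var) in zip(x, stats)
        )

    return max(model, key=log_posterior)


def accuracy(predict, test):
    """Fraction of test samples whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in test) / len(test)


data = make_dataset(100)
train, test = standardize(data[:150], data[150:])  # hypothetical 75/25 split
gnb_model = fit_gnb(train)
results = {
    "knn": accuracy(lambda x: knn_predict(train, x), test),
    "naive_bayes": accuracy(lambda x: gnb_predict(gnb_model, x), test),
}
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Because the synthetic classes are deliberately well separated, both classifiers score near the top of the range here; real optical signals from several species would be expected to overlap far more, which is where differences between algorithms such as those reported in the abstract would appear.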