Impact of Crowdsourced Data Quality on Travel Pattern Estimation

2017 
Mobile crowdsensing can provide mobility researchers with fine-grained spatio-temporal location data. However, crowdsourcing degrades data quality, both through device and OS heterogeneity and through annotation errors. It is also often necessary to deal with multimodality, i.e. participants using different travel modes, often within the same trip. In this paper, we address how to draw value from a crowdsensed dataset for characterising mobility demand through origin-destination (OD) matrices, highlighting challenges and providing some solutions. First, we identify typical errors in heterogeneous location data, and propose and compare methods to automatically improve data quality. Then, we devise a method to distinguish among 5 transport modes (walk, car, bus, metro, bike) offline, a posteriori. We segment trips on stopped periods and propose a random forest (RF) model to detect the transportation mode of each segment using only location data. Our results show that, with adequate pre-processing and robust features, an RF classifier achieves accuracy and precision of 85% on trip segments. This is similar to results in the literature, but our work uses a very heterogeneous crowdsourced trajectory dataset compared with those used in other work. Finally, we quantify the impact of the model on multi-modal OD matrices and whole-trip characterisation. We can correctly identify the transportation modes used, but precision is impaired by the high likelihood of at least one false positive in the whole trip.
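The closing remark about whole-trip precision can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes, purely for illustration, that each segment is classified independently and that the reported 85% can be read as the probability that a single segment is labelled correctly. The function names are hypothetical.

```python
def prob_all_segments_correct(per_segment_accuracy: float, n_segments: int) -> float:
    """Probability that every segment of a trip is labelled correctly.

    Assumption (not from the paper): segment errors are independent,
    so the per-segment probabilities simply multiply.
    """
    if not 0.0 <= per_segment_accuracy <= 1.0:
        raise ValueError("per_segment_accuracy must be in [0, 1]")
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    return per_segment_accuracy ** n_segments


def prob_at_least_one_error(per_segment_accuracy: float, n_segments: int) -> float:
    """Complement of the above: at least one segment is mislabelled."""
    return 1.0 - prob_all_segments_correct(per_segment_accuracy, n_segments)


if __name__ == "__main__":
    # Illustrative per-segment value of 0.85, for trips of 1 to 5 segments.
    for n in range(1, 6):
        print(n, round(prob_at_least_one_error(0.85, n), 3))
```

Under these assumptions, the chance of at least one mislabelled segment grows quickly with the number of segments in a trip, which is consistent with the observation that whole-trip precision is lower than per-segment precision. Real segment errors are unlikely to be fully independent, so the figures are only indicative.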