Automatic topics extraction from crowdsourced cyclists near-miss and collision reports using text mining and Artificial Neural Networks

2021 
Abstract
Cycling is an eco-friendly and sustainable mode of transportation. Despite its benefits, cyclists still face a high risk of collision when interacting with other road users. This study analyzed self-reported descriptions of near-miss and collision events in the United States, provided by the crowdsourcing platform BikeMaps.org. Innovative and efficient analytic methods are needed to generate useful information from unstructured textual data sources in the transportation domain. In this study, exploratory text mining, topic modeling, and machine learning are used to gain insights from the unstructured textual descriptions of crowdsourced near-miss and collision events. Text mining is used to reveal prevalent words and word associations for near-miss and collision events. Structural Topic Modeling (STM) is deployed to automatically generate latent themes, or topics, from the event descriptions. The generated topic proportions are then used as input to Artificial Neural Networks (ANN) to estimate a cyclist's propensity for a collision. Cyclists were found to have a higher propensity for a collision in topics that described vehicle encroachment into the bike lane, on-street parking close to or intruding into the bike lane that results in dooring, and driver violations at crosswalks. The results and methodology of this study can help engineers, policymakers, and law enforcement officers proactively reduce potential cyclist collisions, prioritize areas where cyclist safety improvements are needed, and ultimately promote bicycle ridership in our communities.
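The step in which topic proportions feed a neural network can be illustrated with a minimal sketch. This is not the study's model: the abstract does not give the STM configuration or the ANN architecture, so everything below is an assumption made for illustration. The topic-proportion vectors are synthetic, the labeling rule (a report counts as a collision when hypothetical topic 0 dominates) is invented, and the network is a small hand-written one-hidden-layer classifier in pure Python rather than any particular library.

```python
import math
import random


def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_reports(n, n_topics, risky_topic, rng):
    """Synthetic stand-in for STM output: each row sums to 1 across topics."""
    rows, labels = [], []
    for _ in range(n):
        raw = [rng.random() for _ in range(n_topics)]
        total = sum(raw)
        theta = [r / total for r in raw]
        # Invented labeling rule (assumption): collision if the risky topic
        # has an above-average share of the report.
        labels.append(1 if theta[risky_topic] > 1.0 / n_topics else 0)
        rows.append(theta)
    return rows, labels


def init_network(n_in, n_hidden, rng):
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return hidden activations and the predicted collision probability."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    out = sigmoid(sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"])
    return hidden, out


def log_loss(net, X, y):
    eps = 1e-12
    total = 0.0
    for x, t in zip(X, y):
        _, p = forward(net, x)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(X)


def train(net, X, y, epochs=200, lr=0.5):
    """Plain stochastic gradient descent on the cross-entropy loss."""
    for _ in range(epochs):
        for x, t in zip(X, y):
            hidden, p = forward(net, x)
            d_out = p - t  # gradient w.r.t. the output pre-activation
            for j, h in enumerate(hidden):
                # Hidden-layer gradient uses W2[j] before it is updated.
                d_hidden = d_out * net["W2"][j] * h * (1.0 - h)
                net["W2"][j] -= lr * d_out * h
                net["b1"][j] -= lr * d_hidden
                for i, xi in enumerate(x):
                    net["W1"][j][i] -= lr * d_hidden * xi
            net["b2"] -= lr * d_out
    return net


rng = random.Random(0)
X, y = make_synthetic_reports(200, 4, risky_topic=0, rng=rng)
net = init_network(n_in=4, n_hidden=3, rng=rng)

loss_before = log_loss(net, X, y)
train(net, X, y)
loss_after = log_loss(net, X, y)

# Estimated collision propensity for a report dominated by topic 0.
_, propensity = forward(net, [0.7, 0.1, 0.1, 0.1])
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
print(f"propensity for a topic-0-heavy report: {propensity:.3f}")
```

In a real analysis, the rows of `X` would be the per-report topic proportions estimated by the topic model, and `y` would be the reported event type (near miss or collision).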