A Fast Parallel Particle Filter for Shared Memory Systems

2020 
Particle Filters (PFs) are Sequential Monte Carlo methods widely used to solve filtering problems for dynamic models under non-linear, non-Gaussian noise. Modern PF applications have demanding accuracy and run-time constraints that can be addressed through parallel computing. However, an efficient parallelization of PFs can only be achieved by effectively parallelizing the bottleneck: resampling and its constituent redistribution step. A pre-existing implementation of redistribution on Shared Memory Architectures (SMAs) achieves $O(\frac{N}{T}\log_2 N)$ time complexity over $T$ parallel cores. This redistribution is, however, highly computationally intensive and cannot be parallelized effectively given the inherently limited number of cores on SMAs. In this paper, we propose a novel parallel redistribution in OpenMP 4.5 which takes $O(\frac{N}{T} + \log_2 N)$ steps and fully exploits the computational power of SMAs. The proposed approach is up to six times faster than the $O(\frac{N}{T}\log_2 N)$ one, and its GPU implementation provides a further three-fold speed-up versus its equivalent on a 32-core CPU. We also show, for an exemplary PF, that our redistribution is no longer the bottleneck.
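
For illustration only, the sketch below (not the authors' code) shows one common way a redistribution step can be expressed with OpenMP: each particle i carries a count ncopies[i] of how many times it survives resampling, a prefix sum of these counts gives each surviving particle's output offset, and the copies are then written in parallel. The serial scan is kept for brevity; replacing it with a tree-based parallel scan is what yields the $O(\frac{N}{T} + \log_2 N)$ cost discussed above.

```c
/* Minimal sketch (assumed, not the paper's implementation) of a
 * prefix-sum-based particle redistribution with OpenMP.
 * Assumes ncopies[i] is the number of copies of particle i after
 * resampling and that the counts sum to N.                          */
#include <omp.h>
#include <stdlib.h>

void redistribute(const double *x, const int *ncopies, double *x_new, int N)
{
    /* Exclusive prefix sum of ncopies gives the first output slot of
     * every surviving particle.  A serial scan is shown for clarity;
     * a tree-based parallel scan runs in O(N/T + log2 N) steps.      */
    int *offset = malloc(N * sizeof(int));
    int acc = 0;
    for (int i = 0; i < N; i++) {
        offset[i] = acc;
        acc += ncopies[i];
    }

    /* Copy each surviving particle into its contiguous block of slots. */
    #pragma omp parallel for schedule(dynamic, 64)
    for (int i = 0; i < N; i++)
        for (int j = 0; j < ncopies[i]; j++)
            x_new[offset[i] + j] = x[i];

    free(offset);
}
```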