An efficient fault-tolerant communication algorithm for population-based metaheuristics

2021 
Parallel and distributed computing systems have seen rapid growth in the number of processing cores as progress on single-core performance has stagnated. The larger the system, the greater the challenge for application scalability and system stability. To address both challenges in the context of distributed metaheuristic optimization algorithms, we propose a scalable and fault-tolerant peer-to-peer communication algorithm tailored for population-based metaheuristics. In the algorithm, message exchanges are carried out asynchronously by multiple threads in the background, and the algorithm's minimal overhead can be entirely hidden by overlapping communication with computation. Results from controlled benchmarks corroborate the efficiency of the algorithm and also suggest that thread oversubscription can further improve scalability, because communication operations are idle most of the time. The proposed algorithm contributes to the important yet insufficiently explored performance aspects of distributed metaheuristics.
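
The abstract does not give the algorithm itself, so the following is only a minimal illustrative sketch of the general idea it describes: message exchange running in a background thread while the search loop keeps computing, with failed peers skipped rather than waited on. It is not the authors' implementation. The `Peer` class, the `run_demo` function, the in-process queues standing in for a network, and the toy random search are all assumptions made for illustration.

```python
import queue
import random
import threading


class Peer:
    """Hypothetical peer: a toy search loop plus a background communication thread."""

    def __init__(self, name, seed):
        self.name = name
        self.inbox = queue.Queue()    # solutions delivered by other peers
        self.outbox = queue.Queue()   # solutions waiting to be sent
        self.neighbors = []           # other Peer objects (stand-in for network links)
        self.alive = True             # set to False to simulate a crashed peer
        self.best = float("inf")
        self.rng = random.Random(seed)
        self._stop = threading.Event()
        self._comm = threading.Thread(target=self._communicate, daemon=True)

    def _communicate(self):
        # Background thread: forwards outgoing solutions to live neighbors.
        # It is idle most of the time, waiting on the outbox.
        while not self._stop.is_set():
            try:
                value = self.outbox.get(timeout=0.01)
            except queue.Empty:
                continue
            for peer in self.neighbors:
                if peer.alive:  # assumption: failed peers are skipped, never waited on
                    peer.inbox.put(value)

    def run(self, iterations):
        self._comm.start()
        for _ in range(iterations):
            # Computation: toy random search minimizing x^2.
            x = self.rng.uniform(-10.0, 10.0)
            candidate = x * x
            if candidate < self.best:
                self.best = candidate
                self.outbox.put(candidate)  # hand off to the background thread
            # Absorb whatever has arrived, without ever blocking the search.
            while True:
                try:
                    received = self.inbox.get_nowait()
                except queue.Empty:
                    break
                self.best = min(self.best, received)
        self._stop.set()
        self._comm.join()


def run_demo(iterations=2000):
    """Run two live peers and one failed peer; return all three."""
    a, b, c = Peer("a", 1), Peer("b", 2), Peer("c", 3)
    c.alive = False                      # simulated failure
    a.neighbors = [b, c]
    b.neighbors = [a, c]
    workers = [threading.Thread(target=p.run, args=(iterations,)) for p in (a, b)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return a, b, c


if __name__ == "__main__":
    for peer in run_demo():
        print(peer.name, peer.alive, peer.best)
```

In this sketch the search loop never blocks on communication: sending is a queue hand-off and receiving uses `get_nowait`, so the only cost visible to the computation is the queue operations themselves. A real distributed version would replace the in-process queues with network sends and would need an actual failure detector in place of the `alive` flag.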