A lock-free algorithm of tree-based reduction for large scale clustering on GPGPU

2019 
Recently, the art of concurrency and parallelism has been advanced rapidly. However, conventional techniques still suffer of the drawback of lock contention. To name a few, atomic instruction has relatively low scalability as the number of iterations are increasing. This causes a serious slowdown when programmer cope with large-scale data mining processing such as clustering billions of data with numerous iterations. This paper proposes a Lock-free technique of tree-based reduction for large scale clustering on GPGPU. Proposal method is divided into two steps: fine reduction and coarse reduction. In the first reduction step, the clustering program obtain K * N intermediate array where K is the number of clusters and N is the number of blocks. In the following step, new mean value is calculated over N blocks. By doing this, the clustering program can evade using atomic instruction which causes lock contention in coping with large scale clusters. In experiment, the performance of native GPU kernel with atomic instruction, Thrust template libraries and proposal method is compared and evaluated.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    20
    References
    1
    Citations
    NaN
    KQI
    []