Accelerating LSH-based Distributed Search with In-network Computation

2021 
Locality Sensitive Hashing (LSH) is widely adopted to index similar data in high-dimensional space for approximate nearest neighbor search. With the rapid increase of datasets, recent interests in LSH have moved to the implementation of distributed search systems with low response time and high throughput. However, as the scale of the concurrent queries and the volume of available data grow, large amounts of index messages still need to be transmitted to centralized servers for the candidate answer reducing and resorting. Hence, the network remains the bottleneck in distributed search systems.To address this gap, we turn our efforts to the network itself and propose NetSHa. NetSHa exploits the in-network computational capacity provided by programmable switches. Specially, NetSHa designs a sort-reduce approach to drop the potential poor candidate answers and aggregates the good candidate answers on programmable switches, while preserving the search quality. We implement NetSHa on Barefoot Tofino switches and evaluate it using 3 datasets (i.e., Random, Wiki and Image). The experimental results show that NetSHa reduces the packet volume by 10 times at most and improves the search efficiency by 3x at least, in comparison with typical LSH-based distributed search frameworks.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    10
    References
    0
    Citations
    NaN
    KQI
    []