I-LSH: I/O Efficient c-Approximate Nearest Neighbor Search in High-Dimensional Space

2019 
Nearest Neighbor search has been well solved in low-dimensional space, but is challenging in high-dimensional space due to the curse of dimensionality. As a trade-off between efficiency and result accuracy, a variety of c-approximate nearest neighbor (c-ANN) algorithms have been proposed to return a c-approximate NN with confident at least δ. We observe that existing c-ANN search algorithms have some limitations on I/O efficiency when their indexes are resided on the external memory, which is critical for handling large scale high-dimensional data. In this paper, we introduce an incremental search based c-ANN search algorithm, named I-LSH. Unlike the previous LSH methods, which expand the bucket width in an exponential way, I-LSH adopts a more natural search strategy to incrementally access the hash values of the objects. We provide rigorous theoretical analysis to underpin our incremental search strategy. Our comprehensive experiment results show that, compared with state-of-the-art I/O efficient c-ANN techniques, our algorithm can achieve much better I/O efficiency under the same theoretical guarantee.
    • Correction
    • Cite
    • Save
    • Machine Reading By IdeaReader
    8
    References
    0
    Citations
    NaN
    KQI
    []