Point-to-Hyperplane Nearest Neighbor Search Beyond the Unit Hypersphere

2021 
Point-to-Hyperplane Nearest Neighbor Search (P2HNNS) is a fundamental yet challenging problem, and it has plenty of applications in various fields. Existing hyperplane hashing schemes enjoy sub-linear query time and achieve excellent performance on applications such as large-scale active learning with Support Vector Machines (SVMs). However, they only conditionally deal with this problem with a strong assumption that all of the data objects are normalized, located at the unit hypersphere. Those hyperplane hashing schemes may be arbitrarily bad without this assumption. In this paper, we introduce a new asymmetric transformation and develop the first two provable hyperplane hashing schemes, Nearest Hyperplane hashing (NH) and Furthest Hyperplane hashing (FH), for high-dimensional P2HNNS beyond the unit hypersphere. With this asymmetric transformation, we demonstrate that the hash functions of NH and FH are locality-sensitive to the hyperplane queries, and both of them enjoy quality guarantee on query results. Moreover, we propose a data-dependent multi-partition strategy to boost the search performance of FH. NH can perform the hyperplane queries in sub-linear time, while FH enjoys a better practical performance. We evaluate NH and FH over five real-life datasets and show that we are around $3 \sim 100 \times$ faster than the best competitor in four out of five datasets, especially for the recall in $[20%, 80%]$. Code is available at \urlhttps://github.com/HuangQiang/P2HNNS.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    50
    References
    0
    Citations
    NaN
    KQI
    []