Fast Rare Category Detection Using Nearest Centroid Neighborhood

2016 
Rare category detection is an open challenge in data mining. Existing approaches to this problem often suffer from inappropriate investigation scopes, high time complexity, and limited applicability, all of which degrade their performance and reduce their usability. In this paper, we present FRANC, an effective and efficient solution for rare category detection. It adopts an investigation scope based on k-nearest centroid neighbors with an automatically selected k, which helps the algorithm capture the real changes in local density and data distribution caused by the presence of rare categories. With our proposed pruning method, the identification of k-nearest centroid neighbors, which is the most computationally expensive step in FRANC, becomes much faster for each data example. Extensive experimental results on real data sets demonstrate the effectiveness and efficiency of FRANC.
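
To make the notion of k-nearest centroid neighbors concrete, the following is a minimal brute-force sketch of the standard greedy definition: the first neighbor is the point closest to the query, and each later neighbor is the unselected point that brings the centroid of the selected set closest to the query. It is not FRANC itself: the abstract does not specify how k is selected automatically or how the pruning works, so k is passed in by hand and no pruning is attempted. The function name and the toy data are hypothetical.

```python
import math


def k_nearest_centroid_neighbors(points, query, k):
    """Return indices of the k nearest centroid neighbors of `query`.

    Illustrative brute-force version (assumption: greedy centroid
    definition, Euclidean distance). No pruning, k is fixed by the caller.
    """
    dim = len(query)
    remaining = list(range(len(points)))
    selected = []
    running_sum = [0.0] * dim  # coordinate-wise sum of selected neighbors

    for _ in range(min(k, len(points))):
        best_idx, best_dist = None, math.inf
        n = len(selected) + 1  # size of the set if a candidate is added
        for i in remaining:
            # Centroid of the already selected neighbors plus candidate i.
            centroid = [(running_sum[d] + points[i][d]) / n for d in range(dim)]
            dist = math.dist(centroid, query)
            if dist < best_dist:
                best_idx, best_dist = i, dist
        selected.append(best_idx)
        remaining.remove(best_idx)
        running_sum = [running_sum[d] + points[best_idx][d] for d in range(dim)]

    return selected


# Toy data: two points to the right of the query, one to the left, one above.
points = [(1.0, 0.0), (1.2, 0.0), (-2.0, 0.0), (0.0, 3.0)]
print(k_nearest_centroid_neighbors(points, (0.0, 0.0), 2))
```

On this toy data the second neighbor chosen is the point on the opposite side of the query rather than the second-closest point, which illustrates why a centroid-based neighborhood reflects how the data are distributed around an example and not only how close they are.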