Enhancing Outlier Detection by Filtering Out Core Points and Border Points

2021 
Outlier detection is an important task in data mining and has high practical value in applications such as astronomical observation, text detection, and fraud detection. Many popular outlier detection algorithms are available, including distribution-based, distance-based, density-based, and clustering-based approaches. However, traditional outlier detection algorithms face several challenges. First, most distance-based and density-based methods rely on k-nearest neighbors. As a result, even though outliers make up only a small fraction of a dataset, existing approaches must compute the local outlier factor for every data point, which greatly reduces their efficiency. Second, some methods detect only global outliers and fail to detect local outliers. Third, most outlier detection algorithms do not accurately distinguish between boundary points and outliers. To partially address these problems, we observe that outlier detection is complementary to clustering. In density-based clustering, there are three kinds of data points: core points, border points, and outliers. If indicators can be extracted from the data that give outliers much larger deviation values than the other two kinds of points, the outlier detection problem can be solved. Therefore, in this chapter, we propose augmenting classical outlier detection algorithms with boundary indicators. Experiments on both synthetic and real data sets demonstrate the efficacy of the enhanced outlier detection algorithms.
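The three-way split of points used in density-based clustering can be illustrated with a short sketch. It follows the standard DBSCAN-style definitions, not the boundary indicators proposed in the chapter, which the abstract does not specify. The function name `classify_points`, the radius `eps`, the threshold `min_pts`, and the sample points are all assumptions chosen for illustration.

```python
from math import dist

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'outlier' (DBSCAN-style definitions).

    eps and min_pts are illustrative parameters, not values from the chapter.
    """
    n = len(points)
    # Indices of all points within eps of each point (the point itself included).
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_pts points in its eps-neighborhood.
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = []
    for i in range(n):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbors[i]):
            # Not dense itself, but within eps of a core point.
            labels.append("border")
        else:
            labels.append("outlier")
    return labels

# Hypothetical toy data: a small dense group, one fringe point, one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (8.0, 8.0)]
labels = classify_points(points, eps=1.1, min_pts=3)
```

In this toy data the fringe point at (2.0, 1.0) is labeled a border point and only (8.0, 8.0) is labeled an outlier, which is the distinction the abstract says most detectors fail to draw accurately. The pairwise neighbor search here is quadratic and is meant only to show the definitions.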