Expected similarity estimation for large scale anomaly detection

2015 
We propose a new algorithm named EXPected Similarity Estimation (EXPoSE) to approach the problem of anomaly detection (also known as one-class learning or outlier detection) which is based on the similarity between data points and the distribution of non-anomalous data. We formulate the problem as an inner product in a reproducing kernel Hilbert space to which we present approximations that allow its application to very large-scale datasets. More precisely, given a dataset with n instances, our proposed method requires O(n) training time and O(1) to make a prediction while spending only O(1) memory to store the learned model. Despite its abstract derivation our algorithm is simple and parameter free. We show on seven real datasets that our approach can compete with state of the art algorithms for anomaly detection.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    34
    References
    7
    Citations
    NaN
    KQI
    []