Test Sample Accuracy Scales with Training Sample Density in Neural Networks.

2021 
Intuitively, one would expect the accuracy of a trained neural network's prediction on a test sample to correlate with how densely that sample is surrounded by seen training samples in representation space. In this work we provide theory and experiments that support this hypothesis. We propose an error function for piecewise linear neural networks that takes a local region in the network's input space and outputs a smooth empirical training error, which is an average of empirical training errors from other regions weighted by network representation distance. A bound on the expected smooth error for each region scales inversely with training sample density in representation space. Empirically, we verify that this bound is a strong predictor of the inaccuracy of the network's prediction on test samples. For unseen test sets, including those with out-of-distribution samples, ranking test samples by their local region's error bound and discarding samples with the highest bounds raises prediction accuracy by up to 20% in absolute terms on image classification datasets.
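
The sketch below illustrates the general idea described in the abstract, not the paper's exact formulation: estimate a smooth error for a test sample as a representation-distance-weighted average of empirical training errors, then rank test samples by this estimate and discard those with the highest values. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions for illustration; the paper instead derives weights and bounds from the network's piecewise-linear region structure.

```python
import numpy as np

def smooth_error(test_repr, train_reprs, train_errors, sigma=1.0):
    """Distance-weighted average of empirical training errors (illustrative).

    test_repr:    (d,) representation of one test sample
    train_reprs:  (n, d) representations of training samples
    train_errors: (n,) per-sample empirical errors (e.g. 0/1 classification error)
    """
    # Distances in the network's representation space
    dists = np.linalg.norm(train_reprs - test_repr, axis=1)
    # Assumed Gaussian weighting by representation distance
    weights = np.exp(-dists**2 / (2 * sigma**2))
    return float(np.sum(weights * train_errors) / (np.sum(weights) + 1e-12))

def filter_by_smooth_error(test_reprs, train_reprs, train_errors, keep_frac=0.8):
    """Rank test samples by estimated error and keep the lowest-scoring fraction."""
    scores = np.array([smooth_error(t, train_reprs, train_errors) for t in test_reprs])
    keep_idx = np.argsort(scores)[: int(keep_frac * len(scores))]
    return keep_idx, scores
```

Discarding the highest-scoring samples (those far from dense clusters of training representations) mirrors the abstract's reported effect of raising prediction accuracy on the retained subset.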