Sharp Bounds For Generalized Uniformity Testing

Authors:
Ilias Diakonikolas University of Southern California
Daniel M. Kane UCSD
Alistair Stewart University of Southern California

Introduction:

The authors study the problem of generalized uniformity testing of a discrete probability distribution: Given samples from a probability distribution p over an unknown size discrete domain Ω, the authors want to distinguish, with probability at least 2/3, between the case that p is uniform on some subset of Ω versus ε-far, in total variation distance, from any such uniform distribution.

Abstract:

We study the problem of generalized uniformity testing of a discrete probability distribution: Given samples from a probability distribution p over an unknown size discrete domain Ω, we want to distinguish, with probability at least 2/3, between the case that p is uniform on some subset of Ω versus ε-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, within constant factors, and a matching worst-case information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is Θ(1/(ε^(4/3) ||p||3) + 1/(ε^2 ||p||2 )).

You may want to know: