Open Problem: Tight Convergence of SGD in Constant Dimension

2020 
Stochastic Gradient Descent (SGD) is one of the most popular optimization methods in machine learning and has been studied extensively since the early 1950s. However, our understanding of this fundamental algorithm is still lacking in certain aspects. We point out a gap that remains between the known upper and lower bounds for the expected suboptimality of the last SGD iterate whenever the dimension is a constant independent of the number of SGD iterations $T$; in particular, the gap is still unaddressed even in the one-dimensional case. For the latter, we provide evidence that the correct rate is $\Theta(1/\sqrt{T})$ and conjecture that the same holds in any (constant) dimension.
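Below is a minimal sketch (not from the paper) of the quantity in question: it runs SGD on a simple one-dimensional convex Lipschitz objective, here assumed to be $f(x) = |x|$ with noisy subgradients and step size $\eta_t = 1/\sqrt{t}$, and records the suboptimality $f(x_T) - \min f$ of the last iterate. The objective, noise model, and step-size schedule are illustrative assumptions, not the constructions studied in the open problem.

```python
import numpy as np


def sgd_last_iterate_suboptimality(T: int, rng: np.random.Generator) -> float:
    """Run T steps of SGD on f(x) = |x| with unit-variance gradient noise
    and return the suboptimality of the last iterate (min |x| = 0)."""
    x = 1.0  # arbitrary starting point (illustrative assumption)
    for t in range(1, T + 1):
        grad = np.sign(x) + rng.normal()  # stochastic subgradient of |x|
        eta = 1.0 / np.sqrt(t)            # standard 1/sqrt(t) step size
        x -= eta * grad
    return abs(x)                         # f(x_T) - min f


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for T in (100, 1_000, 10_000):
        gaps = [sgd_last_iterate_suboptimality(T, rng) for _ in range(200)]
        # Compare the empirical mean gap to the conjectured Theta(1/sqrt(T)) scaling.
        print(f"T={T:6d}  mean suboptimality ~ {np.mean(gaps):.4f}"
              f"  1/sqrt(T) = {1.0 / np.sqrt(T):.4f}")
```

The printed ratios give a rough empirical sense of whether the last-iterate gap tracks $1/\sqrt{T}$ on this toy instance; they are, of course, no substitute for the worst-case analysis the open problem asks for.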