|Yogesh Dahiya||IIT Kanpur|
|David P. Woodruff||Carnegie Mellon University|
This paper studies randomized data dimensionality reduction. the goal of this work is to provide a comprehensive comparison of such methods to alternative approaches.
Over the last ten years, tremendous speedups for problems in randomized numerical linear algebra such as low rank approximation and regression have been made possible via the technique of randomized data dimensionality reduction, also known as sketching. In theory, such algorithms have led to optimal input sparsity time algorithms for a wide array of problems. While several scattered implementations of such methods exist, the goal of this work is to provide a comprehensive comparison of such methods to alternative approaches. We investigate least squares regression, iteratively reweighted least squares, logistic regression, robust regression with Huber and Bisquare loss functions, leverage score computation, Frobenius norm low rank approximation, and entrywise $\ell_1$-low rank approximation. We give various implementation techniques to speed up several of these algorithms, and the resulting implementations demonstrate the tradeoffs of such techniques in practice.