Authors:
Lijun Zhang | Nanjing University (NJU)
Zhi-Hua Zhou | Nanjing University (NJU)
Introduction:
In this paper, the authors consider the problem of linear regression with heavy-tailed distributions. To address the challenge that both the input and output could be heavy-tailed, the authors propose a truncated minimization problem, and demonstrate that it enjoys an $O(\sqrt{d/n})$ excess risk, where $d$ is the dimensionality and $n$ is the number of samples.
Abstract:
In this paper, we consider the problem of linear regression with heavy-tailed distributions. Different from previous studies that use the squared loss to measure the performance, we choose the absolute loss, which is capable of estimating the conditional median. To address the challenge that both the input and output could be heavy-tailed, we propose a truncated minimization problem, and demonstrate that it enjoys an $O(\sqrt{d/n})$ excess risk, where $d$ is the dimensionality and $n$ is the number of samples. Compared with traditional work on $\ell_1$-regression, the main advantage of our result is that we achieve a high-probability risk bound without exponential moment conditions on the input and output. Furthermore, if the input is bounded, we show that the classical empirical risk minimization is competent for $\ell_1$-regression even when the output is heavy-tailed.
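To make the two estimators mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the truncation function, the solver, or any tuning constants, so everything here is an assumption: the truncation $\psi(z) = \log(1 + z + z^2/2)$ is a Catoni-style choice, the objective is taken to be $\frac{1}{n\alpha}\sum_i \psi(\alpha |y_i - x_i^\top w|)$, `alpha` is a hypothetical tuning parameter, and plain subgradient descent is used purely for illustration. Setting `alpha=None` recovers classical empirical risk minimization with the absolute loss.

```python
import numpy as np


def psi_prime(z):
    """Derivative of the assumed truncation psi(z) = log(1 + z + z**2 / 2), z >= 0."""
    return (1.0 + z) / (1.0 + z + 0.5 * z**2)


def fit_l1(X, y, alpha=None, steps=3000, lr=0.5):
    """Subgradient descent for l1-regression (illustrative solver, an assumption).

    alpha=None : minimize the empirical absolute loss (classical ERM).
    alpha>0    : minimize (1 / (n * alpha)) * sum_i psi(alpha * |y_i - x_i^T w|),
                 which down-weights samples with very large residuals.
    """
    n, d = X.shape
    w = np.zeros(d)
    for t in range(steps):
        r = y - X @ w  # residuals
        # Per-sample weight: 1 for plain ERM, psi'(alpha * |r|) when truncating.
        weights = np.ones(n) if alpha is None else psi_prime(alpha * np.abs(r))
        grad = -(weights * np.sign(r)) @ X / n  # (sub)gradient of the objective
        w -= lr / np.sqrt(t + 1.0) * grad  # decaying step size
    return w


# Synthetic data: bounded input, heavy-tailed output noise (Student-t, 2 degrees
# of freedom, so the noise has infinite variance but a well-defined median of 0).
rng = np.random.default_rng(0)
n, d = 4000, 3
w_true = np.array([1.0, -2.0, 0.5])
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = X @ w_true + rng.standard_t(df=2, size=n)

w_erm = fit_l1(X, y)                # classical ERM with the absolute loss
w_trunc = fit_l1(X, y, alpha=0.05)  # truncated objective, hypothetical alpha

print("ERM error:      ", np.linalg.norm(w_erm - w_true))
print("Truncated error:", np.linalg.norm(w_trunc - w_true))
```

In this synthetic setting the noise is symmetric around zero, so both objectives target the conditional median $x^\top w$. The sketch only illustrates the shape of the two optimization problems; it says nothing about the $O(\sqrt{d/n})$ guarantee, which depends on the specific construction and choice of parameters analyzed in the paper.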