Ordinary least squares

In statistics, ordinary least squares (OLS) is a type of linear least squares method for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable (the values of the variable being predicted) in the given dataset and those predicted by the linear function. Geometrically, this is the sum of the squared distances, parallel to the axis of the dependent variable, between each data point in the set and the corresponding point on the regression surface: the smaller the differences, the better the model fits the data. The resulting estimator can be expressed by a simple formula, especially in the case of simple linear regression, in which there is a single regressor on the right-hand side of the regression equation.

The OLS estimator is consistent when the regressors are exogenous, and optimal in the class of linear unbiased estimators when the errors are homoscedastic and serially uncorrelated. Under these conditions, OLS provides minimum-variance mean-unbiased estimation when the errors have finite variances. Under the additional assumption that the errors are normally distributed, OLS is the maximum likelihood estimator. OLS is used in fields as diverse as economics (econometrics), data science, political science, psychology and engineering (control theory and signal processing).

Suppose the data consist of n observations {y_i, x_i}, i = 1, ..., n. Each observation i includes a scalar response y_i and a column vector x_i of values of p predictors (regressors) x_ij for j = 1, ..., p. In a linear regression model, the response variable y_i is a linear function of the regressors:

    y_i = β_1 x_i1 + β_2 x_i2 + ... + β_p x_ip + ε_i,

or in vector form,

    y_i = x_i^T β + ε_i,

where β is a p×1 vector of unknown parameters; the ε_i are unobserved scalar random variables (errors) which account for influences upon the responses y_i from sources other than the explanatory variables x_i; and x_i is a column vector of the ith observations of all the explanatory variables. This model can also be written in matrix notation as

    y = Xβ + ε,

where y and ε are n×1 vectors of the values of the response variable and the errors for the various observations, and X is an n×p matrix of regressors, also sometimes called the design matrix, whose row i is x_i^T and contains the ith observations of all the explanatory variables. As a rule, a constant term is included in the set of regressors X, say by taking x_i1 = 1 for all i = 1, ..., n. The coefficient β_1 corresponding to this regressor is called the intercept.
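Minimizing the sum of squared residuals in this model leads to the closed-form estimate β̂ = (X^T X)^(-1) X^T y whenever X^T X is invertible. The following Python/NumPy sketch is a minimal illustration of that formula on synthetic data; the simulated numbers, seed, and variable names are illustrative assumptions, not part of the article.

```python
import numpy as np

# Minimal sketch of the OLS estimator beta_hat = (X^T X)^{-1} X^T y
# on synthetic data (purely illustrative).
rng = np.random.default_rng(0)

n, p = 100, 3                       # n observations, p regressors (incl. intercept)
X = np.column_stack([np.ones(n),    # x_i1 = 1 for all i: the intercept column
                     rng.normal(size=(n, p - 1))])
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)   # y = X beta + epsilon

# Solve the normal equations X^T X beta = X^T y directly ...
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# ... or use a least-squares solver, which is preferred numerically.
beta_hat_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)        # close to beta_true
print(beta_hat_lstsq)  # same estimate via the least-squares solver
```

Both computations recover estimates close to the generating coefficients; in practice a QR- or SVD-based solver such as np.linalg.lstsq is used rather than explicitly inverting X^T X.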

[ "Statistics", "Machine learning", "Econometrics", "Regression analysis", "Least squares", "partitioned linear model" ]
Parent Topic
Child Topic
    No Parent Topic