
Leverage (statistics)

In statistics and in particular in regression analysis, leverage is a measure of how far away the independent variable values of an observation are from those of the other observations. High-leverage points are those observations, if any, made at extreme or outlying values of the independent variables, such that the lack of neighboring observations means that the fitted regression model will pass close to that particular observation.

In the linear regression model, the leverage score for the $i$-th observation is defined as $h_{ii}$, the $i$-th diagonal element of the projection matrix $\mathbf{H} = \mathbf{X}\left(\mathbf{X}^{\mathsf{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathsf{T}}$, where $\mathbf{X}$ is the design matrix (whose rows correspond to the observations and whose columns correspond to the independent or explanatory variables).

The leverage score is also known as the observation self-sensitivity or self-influence, because of the equation $h_{ii} = \frac{\partial \widehat{y}_i}{\partial y_i}$, which states that the leverage of the $i$-th observation equals the partial derivative of the fitted $i$-th dependent value $\widehat{y}_i$ with respect to the measured $i$-th dependent value $y_i$. This partial derivative describes the degree to which the $i$-th measured value influences the $i$-th fitted value. Note that this leverage depends on the values of the explanatory ($x$-) variables of all observations, but not on any of the values of the dependent ($y$-) variables.

The equation $h_{ii} = \frac{\partial \widehat{y}_i}{\partial y_i}$ follows directly from the computation of the fitted values as $\widehat{\mathbf{y}} = \mathbf{H}\mathbf{y}$. First, note that $\mathbf{H}$ is an idempotent matrix:
$$\mathbf{H}^2 = \mathbf{X}(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}}\,\mathbf{X}(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}} = \mathbf{X}\,\mathbf{I}\,(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}} = \mathbf{H}.$$
Also, observe that $\mathbf{H}$ is symmetric (i.e., $h_{ij} = h_{ji}$). So equating the $ii$ element of $\mathbf{H}$ to that of $\mathbf{H}^2$, we have
$$h_{ii} = \sum_{j} h_{ij} h_{ji} = h_{ii}^2 + \sum_{j \neq i} h_{ij}^2,$$
which shows that $0 \leq h_{ii} \leq 1$.
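A small numerical illustration may make this concrete. The following is a minimal sketch in Python with NumPy, not taken from the original text: the function name `leverage_scores` and the toy data are illustrative assumptions. It computes the diagonal of the hat matrix directly from a design matrix with an intercept column and one outlying $x$-value, showing that the outlying observation receives the largest leverage and that, for a full-column-rank $\mathbf{X}$, the scores sum to the number of columns of $\mathbf{X}$ (the trace of $\mathbf{H}$).

```python
import numpy as np

def leverage_scores(X):
    """Diagonal of the hat matrix H = X (X^T X)^{-1} X^T."""
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    return np.diag(H)

# Toy example: simple linear regression with an intercept column
# and one observation at an outlying x-value.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 20), [8.0]])
X = np.column_stack([np.ones_like(x), x])  # design matrix (n x 2)

h = leverage_scores(X)
print(h.round(3))                        # last entry is the largest leverage
print(np.isclose(h.sum(), X.shape[1]))   # trace(H) equals the number of columns of X
```

In practice, forming $\mathbf{H}$ explicitly is avoidable; the same diagonal can be obtained more stably from a QR decomposition of $\mathbf{X}$ as the row-wise sums of squares of $\mathbf{Q}$, but the direct formula above mirrors the definition in the text.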

[ "Regression diagnostic", "Bayesian multivariate linear regression", "Cross-sectional regression", "Statistics", "Machine learning" ]
Parent Topic
Child Topic
    No Parent Topic