Regression-kriging

In applied statistics, regression-kriging (RK) is a spatial prediction technique that combines a regression of the dependent variable on auxiliary variables (such as parameters derived from digital elevation modelling, remote sensing/imagery, and thematic maps) with kriging of the regression residuals. It is mathematically equivalent to the interpolation method variously called universal kriging and kriging with external drift, where auxiliary predictors are used directly to solve the kriging weights.

Regression-kriging is an implementation of the best linear unbiased predictor (BLUP) for spatial data, i.e. the best linear interpolator assuming the universal model of spatial variation. Matheron (1969) proposed that the value of a target variable at some location can be modeled as a sum of deterministic and stochastic components:

$$Z(\mathbf{s}) = m(\mathbf{s}) + \varepsilon'(\mathbf{s}) + \varepsilon''$$

where $m(\mathbf{s})$ is the deterministic part, $\varepsilon'(\mathbf{s})$ is the spatially correlated stochastic part and $\varepsilon''$ is pure noise. He termed this the universal model of spatial variation. The deterministic and stochastic components of spatial variation can be modeled separately.
By combining the two approaches, we obtain:

$$\hat{z}(\mathbf{s}_0) = \hat{m}(\mathbf{s}_0) + \hat{e}(\mathbf{s}_0) = \sum_{k=0}^{p} \hat{\beta}_k q_k(\mathbf{s}_0) + \sum_{i=1}^{n} \lambda_i e(\mathbf{s}_i)$$

where $\hat{m}(\mathbf{s}_0)$ is the fitted deterministic part, $\hat{e}(\mathbf{s}_0)$ is the interpolated residual, $\hat{\beta}_k$ are the estimated deterministic model coefficients ($\hat{\beta}_0$ is the estimated intercept), $q_k(\mathbf{s}_0)$ are the values of the predictors at $\mathbf{s}_0$ (with $q_0(\mathbf{s}_0) = 1$), $\lambda_i$ are kriging weights determined by the spatial dependence structure of the residual, and $e(\mathbf{s}_i)$ is the residual at location $\mathbf{s}_i$.

The regression coefficients $\hat{\beta}_k$ can be estimated from the sample by some fitting method, e.g. ordinary least squares (OLS) or, optimally, generalized least squares (GLS):

$$\hat{\boldsymbol{\beta}}_{\mathtt{GLS}} = \left(\mathbf{q}^{T} \mathbf{C}^{-1} \mathbf{q}\right)^{-1} \mathbf{q}^{T} \mathbf{C}^{-1} \mathbf{z}$$

where $\hat{\boldsymbol{\beta}}_{\mathtt{GLS}}$ is the vector of estimated regression coefficients, $\mathbf{C}$ is the covariance matrix of the residuals, $\mathbf{q}$ is the matrix of predictors at the sampling locations and $\mathbf{z}$ is the vector of measured values of the target variable. The GLS estimation of regression coefficients is, in fact, a special case of geographically weighted regression. In this case, the weights are determined objectively to account for the spatial autocorrelation between the residuals.

Once the deterministic part of variation has been estimated (the regression part), the residual can be interpolated with kriging and added to the estimated trend. The estimation of the residuals is an iterative process: first the deterministic part of variation is estimated using OLS, then the covariance function of the residuals is used to obtain the GLS coefficients. Next, these are used to re-compute the residuals, from which an updated covariance function is computed, and so on.
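
The OLS and GLS coefficient estimates described above can be illustrated with a short Python (NumPy) sketch. Everything specific in it is an assumption made for illustration: the function names, the synthetic data, the single `elev` predictor, and the exponential covariance model with fixed `nugget`, `partial_sill` and `range_par` values. In practice the covariance function would be fitted to the variogram of the residuals, and the OLS and GLS steps repeated as described, which this sketch does not attempt.

```python
import numpy as np

def exp_covariance(coords_a, coords_b, nugget, partial_sill, range_par):
    """Exponential covariance between two sets of locations (assumed model)."""
    d = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)
    # The nugget contributes only at zero separation distance.
    return partial_sill * np.exp(-d / range_par) + nugget * (d == 0)

def ols_coefficients(q, z):
    """Ordinary least squares: minimizes ||z - q beta||^2."""
    return np.linalg.lstsq(q, z, rcond=None)[0]

def gls_coefficients(q, z, C):
    """Generalized least squares: (q^T C^-1 q)^-1 q^T C^-1 z."""
    Ci_q = np.linalg.solve(C, q)   # C^-1 q, without forming C^-1 explicitly
    Ci_z = np.linalg.solve(C, z)   # C^-1 z
    return np.linalg.solve(q.T @ Ci_q, q.T @ Ci_z)

# Synthetic example: 30 sample locations and one auxiliary predictor.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(30, 2))
elev = rng.uniform(0, 1, size=30)
q = np.column_stack([np.ones(30), elev])      # first column is the intercept
z = 2.0 + 3.0 * elev + rng.normal(0, 0.1, size=30)

beta_ols = ols_coefficients(q, z)             # first step of the iteration
residuals_ols = z - q @ beta_ols              # would feed the variogram fit

# Covariance parameters are assumed here, not fitted to residuals_ols.
C = exp_covariance(coords, coords, nugget=0.01, partial_sill=0.05, range_par=20.0)
beta_gls = gls_coefficients(q, z, C)
print("OLS:", beta_ols, "GLS:", beta_gls)
```

When `C` is proportional to the identity matrix (no spatial autocorrelation), the GLS estimate reduces to the OLS estimate, which is a convenient sanity check for an implementation of this step.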
Although many geostatisticians recommend the iterative procedure as the proper approach, Kitanidis (1994) showed that using the covariance function derived from the OLS residuals (i.e. a single iteration) is often satisfactory, because it does not differ enough from the function derived after several iterations to have much effect on the final predictions. Minasny and McBratney (2007) report similar results; it seems that using more and higher-quality data is more important than using more sophisticated statistical methods.

In matrix notation, regression-kriging is commonly written as:

$$\hat{z}(\mathbf{s}_0) = \mathbf{q}_0^{T} \hat{\boldsymbol{\beta}}_{\mathtt{GLS}} + \boldsymbol{\lambda}_0^{T} \left(\mathbf{z} - \mathbf{q} \hat{\boldsymbol{\beta}}_{\mathtt{GLS}}\right)$$

where $\hat{z}(\mathbf{s}_0)$ is the predicted value at location $\mathbf{s}_0$, $\mathbf{q}_0$ is the vector of $p+1$ predictors and $\boldsymbol{\lambda}_0$ is the vector of $n$ kriging weights used to interpolate the residuals. The RK model is considered to be the best linear predictor of spatial data. It has a prediction variance that reflects the position of new locations (extrapolation) in both geographical and feature space:

$$\hat{\sigma}_{\mathtt{RK}}^{2}(\mathbf{s}_0) = (C_0 + C_1) - \mathbf{c}_0^{T} \mathbf{C}^{-1} \mathbf{c}_0 + \left(\mathbf{q}_0 - \mathbf{q}^{T} \mathbf{C}^{-1} \mathbf{c}_0\right)^{T} \left(\mathbf{q}^{T} \mathbf{C}^{-1} \mathbf{q}\right)^{-1} \left(\mathbf{q}_0 - \mathbf{q}^{T} \mathbf{C}^{-1} \mathbf{c}_0\right)$$

where $C_0 + C_1$ is the sill variation and $\mathbf{c}_0$ is the vector of covariances of residuals at the unvisited location.
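
The matrix form of the prediction and its variance can be sketched in Python (NumPy) as follows. This is an illustration under stated assumptions, not a reference implementation: the function names, the synthetic data, and the exponential covariance model with fixed `nugget` ($C_0$), `partial_sill` ($C_1$) and `range_par` values are all hypothetical. The kriging weights are taken as $\boldsymbol{\lambda}_0 = \mathbf{C}^{-1}\mathbf{c}_0$, i.e. simple kriging of the residuals.

```python
import numpy as np

def exp_cov(a, b, nugget, partial_sill, range_par):
    """Exponential covariance between location sets a and b (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # The nugget contributes only at zero separation distance.
    return partial_sill * np.exp(-d / range_par) + nugget * (d == 0)

def regression_kriging(coords, q, z, s0, q0, nugget, partial_sill, range_par):
    """Return the RK prediction and prediction variance at location s0."""
    C = exp_cov(coords, coords, nugget, partial_sill, range_par)
    c0 = exp_cov(coords, s0[None, :], nugget, partial_sill, range_par)[:, 0]

    Ci_q = np.linalg.solve(C, q)     # C^-1 q
    lam = np.linalg.solve(C, c0)     # kriging weights lambda_0 = C^-1 c0
    A = q.T @ Ci_q                   # q^T C^-1 q

    # GLS coefficients; Ci_q.T @ z equals q^T C^-1 z because C is symmetric.
    beta = np.linalg.solve(A, Ci_q.T @ z)

    # Prediction: regression part plus kriged residuals.
    z_hat = q0 @ beta + lam @ (z - q @ beta)

    # Variance: sill - c0^T C^-1 c0 + u^T (q^T C^-1 q)^-1 u,
    # with u = q0 - q^T C^-1 c0.
    u = q0 - q.T @ lam
    var = (nugget + partial_sill) - c0 @ lam + u @ np.linalg.solve(A, u)
    return z_hat, var

# Synthetic example: 30 sample locations and one auxiliary predictor.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(30, 2))
elev = rng.uniform(0, 1, size=30)
q = np.column_stack([np.ones(30), elev])   # first column is the intercept
z = 2.0 + 3.0 * elev + rng.normal(0, 0.1, size=30)
params = dict(nugget=0.01, partial_sill=0.05, range_par=20.0)

# Prediction at an unvisited location with predictor value 0.4.
z_new, var_new = regression_kriging(
    coords, q, z, np.array([50.0, 50.0]), np.array([1.0, 0.4]), **params)

# Prediction at an existing sample location.
z_obs, var_obs = regression_kriging(coords, q, z, coords[0], q[0], **params)
print(z_new, var_new, z_obs, var_obs)
```

Because this sketch includes the nugget in $\mathbf{c}_0$ at zero distance, the prediction at a sampled location reproduces the observed value and the variance there is zero up to rounding. Away from the samples, the last term of the variance grows as $\mathbf{q}_0$ moves away from the predictor values covered by the sample, which is how extrapolation in feature space is reflected.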
