Effective rank for multivariate calibration methods

2004 
To determine the proper multivariate calibration model, it is necessary to select the number of respective basis vectors (latent vectors, factors, etc.) when using principal component regression (PCR) or partial least squares (PLS). These values are commonly referred to as the prediction rank of the model. Comparisons between PCR and PLS models for a given data set are often made using the prediction rank to determine the more parsimonious model, ignoring the fact that the values have been obtained using different basis sets. Additionally, this approach cannot be used to determine the prediction rank of models generated by other modeling methods such as ridge regression (RR). This paper presents measures of effective rank for a given model that can be applied to all modeling methods, thereby allowing inter-model comparisons. A definition based on the regression vector norm is presented and compared with two alternative forms from the literature. With a proper definition of effective rank, a better assessment of degrees of freedom for statistical computations is possible. Additionally, the true nature of variable selection for improved parsimony can be properly assessed. Spectroscopic data sets are used as examples with PCR, PLS and RR. Copyright © 2004 John Wiley & Sons, Ltd.
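To make concrete why a rank-like quantity is needed for RR, the sketch below computes one widely used measure: the trace of the hat matrix, written as a sum of filter factors over the singular values of the calibration matrix. This is an illustration under stated assumptions, not necessarily one of the definitions examined in the paper; the norm-based definition in particular is not reproduced here. The function names and the example singular values are hypothetical.

```python
# Illustrative sketch only: effective rank as the sum of filter factors
# (equivalently, the trace of the hat matrix). This is a standard measure,
# not confirmed to be one of the definitions used in the paper.

def ridge_effective_rank(singular_values, lam):
    """Sum of ridge filter factors s_i^2 / (s_i^2 + lam) for lam >= 0."""
    if lam < 0:
        raise ValueError("ridge parameter must be non-negative")
    # Zero singular values contribute nothing and are skipped to avoid 0/0.
    return sum(s**2 / (s**2 + lam) for s in singular_values if s > 0)


def pcr_effective_rank(singular_values, k):
    """PCR keeps the first k components: filter factor 1 if kept, 0 if dropped."""
    nonzero = [s for s in singular_values if s > 0]
    return float(min(k, len(nonzero)))


# Hypothetical singular values of a calibration matrix (assumed, not from the paper).
singular_values = [10.0, 5.0, 1.0, 0.1]

print(ridge_effective_rank(singular_values, 0.0))  # no shrinkage: full rank, 4.0
print(ridge_effective_rank(singular_values, 1.0))  # shrinkage gives a non-integer value
print(pcr_effective_rank(singular_values, 2))      # PCR with two factors
```

On this measure, PCR always yields an integer equal to the number of retained factors, while RR yields a continuous value that decreases as the ridge parameter grows, which puts the two methods on a common scale.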