Reflections on univariate and multivariate analysis of metabolomics data

2014 
Metabolomics experiments usually result in a large quantity of data. Univariate and multivariate analysis techniques are routinely used to extract relevant information from the data with the aim of providing biological knowledge on the problem studied. Despite the fact that statistical tools like the t test, analysis of variance, principal component analysis, and partial least squares discriminant analysis constitute the backbone of the statistical part of the vast majority of metabolomics papers, it seems that many basic but rather fundamental questions are still often asked, like: Why do the results of univariate and multivariate analyses differ? Why apply univariate methods if you have already applied a multivariate method? Why if I do not see something univariately I see something multivariately? In the present paper we address some aspects of univariate and multivariate analysis, with the scope of clarifying in simple terms the main differences between the two approaches. Applications of the t test, analysis of variance, principal component analysis and partial least squares discriminant analysis will be shown on both real and simulated metabolomics data examples to provide an overview on fundamental aspects of univariate and multivariate methods.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    62
    References
    273
    Citations
    NaN
    KQI
    []