
Exploratory factor analysis

In multivariate statistics, exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching goal is to identify the underlying relationships between measured variables. It is commonly used by researchers when developing a scale (a collection of questions used to measure a particular research topic) and serves to identify a set of latent constructs underlying a battery of measured variables. It should be used when the researcher has no a priori hypothesis about factors or patterns of measured variables.
Measured variables are any of several attributes of people that may be observed and measured; examples include the physical height, weight, and pulse rate of a human being. Researchers usually have a large number of measured variables, which are assumed to be related to a smaller number of 'unobserved' factors. Researchers must carefully consider the number of measured variables to include in the analysis, because EFA procedures are more accurate when each factor is represented by multiple measured variables.
EFA is based on the common factor model. In this model, manifest variables are expressed as a function of common factors, unique factors, and errors of measurement. Each unique factor influences only one manifest variable and does not explain correlations between manifest variables. Common factors influence more than one manifest variable, and factor loadings measure the influence of a common factor on a manifest variable. In EFA, the main interest lies in identifying the common factors and the manifest variables related to them. EFA assumes that any indicator (measured variable) may be associated with any factor.
When developing a scale, researchers should use EFA before moving on to confirmatory factor analysis (CFA). EFA is essential for determining the underlying factors or constructs for a set of measured variables, while CFA allows the researcher to test the hypothesis that a relationship exists between the observed variables and their underlying latent factors or constructs. EFA requires the researcher to make a number of important decisions about how to conduct the analysis because there is no one set method.
Fitting procedures are used to estimate the factor loadings and unique variances of the model. Factor loadings are the regression coefficients between items and factors; they measure the influence of a common factor on a measured variable.
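To make the loadings and unique variances described above concrete, the following minimal sketch computes the correlation matrix implied by a common factor model. It rests on several assumptions that are not in the text: the loading values are hypothetical, the measured variables are standardized, the common factors are uncorrelated, and each unique variance lumps together the unique factor and the error of measurement. The function names are illustrative only.

```python
# Hypothetical loadings for six standardized measured variables on two
# uncorrelated common factors (rows: variables, columns: factors).
LOADINGS = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.7],
    [0.0, 0.6],
    [0.0, 0.5],
]


def communalities(loadings):
    """Share of each variable's variance explained by the common factors."""
    return [sum(l * l for l in row) for row in loadings]


def unique_variances(loadings):
    """Variance left over for each variable (unique factor plus error)."""
    return [1.0 - h2 for h2 in communalities(loadings)]


def implied_correlations(loadings):
    """Model-implied correlation matrix: loadings times their transpose,
    with the unique variances added on the diagonal."""
    p = len(loadings)
    psi = unique_variances(loadings)
    matrix = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            common = sum(a * b for a, b in zip(loadings[i], loadings[j]))
            matrix[i][j] = common + (psi[i] if i == j else 0.0)
    return matrix


if __name__ == "__main__":
    for row in implied_correlations(LOADINGS):
        print(" ".join(f"{value:5.2f}" for value in row))
```

In this sketch, two variables are correlated only if they load on the same common factor, which is the sense in which unique factors do not explain correlations between manifest variables.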
There are several factor analysis fitting methods to choose from; however, there is little information on all of their strengths and weaknesses, and many do not even have an exact name that is used consistently. Principal axis factoring (PAF) and maximum likelihood (ML) are two extraction methods that are generally recommended. In general, ML or PAF gives the best results: ML when the data are normally distributed, and PAF when the assumption of normality has been violated. The maximum likelihood method has many advantages: it allows researchers to compute a wide range of indexes of the goodness of fit of the model, test the statistical significance of factor loadings, calculate correlations among factors, and compute confidence intervals for these parameters. ML is the best choice when data are normally distributed because “it allows for the computation of a wide range of indexes of the goodness of fit of the model [and] permits statistical significance testing of factor loadings and correlations among factors and the computation of confidence intervals”.
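As a rough illustration of how principal axis factoring works, the sketch below extracts a single common factor from a correlation matrix. It is a simplified teaching example under stated assumptions, not the procedure of any particular statistics package: only one factor is extracted, the starting communalities are taken as each variable's largest absolute off-diagonal correlation (squared multiple correlations are another common choice), and power iteration stands in for a full eigendecomposition, which presumes the reduced matrix has a dominant positive eigenvalue. All function names are hypothetical.

```python
import math


def dominant_eigenpair(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of a
    symmetric matrix (assumes the dominant eigenvalue is positive)."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in prod))
        vec = [x / norm for x in prod]
    prod = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    value = sum(v * x for v, x in zip(vec, prod))  # Rayleigh quotient
    return value, vec


def paf_one_factor(corr, max_iter=200, tol=1e-8):
    """Iterated principal axis factoring for one common factor.

    Returns (loadings, communalities)."""
    p = len(corr)
    # Starting communalities: largest absolute off-diagonal correlation.
    h2 = [max(abs(corr[i][j]) for j in range(p) if j != i) for i in range(p)]
    loadings = [0.0] * p
    for _ in range(max_iter):
        # Reduced correlation matrix: communalities replace the unit diagonal.
        reduced = [[h2[i] if i == j else corr[i][j] for j in range(p)]
                   for i in range(p)]
        value, vec = dominant_eigenpair(reduced)
        # Loadings are the eigenvector scaled by the root of its eigenvalue.
        loadings = [math.sqrt(max(value, 0.0)) * v for v in vec]
        new_h2 = [l * l for l in loadings]
        change = max(abs(a - b) for a, b in zip(new_h2, h2))
        h2 = new_h2
        if change < tol:
            break
    if sum(loadings) < 0:  # fix the arbitrary sign of the eigenvector
        loadings = [-l for l in loadings]
    return loadings, h2


if __name__ == "__main__":
    true_loadings = [0.8, 0.7, 0.6, 0.5]  # hypothetical generating values
    corr = [[1.0 if i == j else true_loadings[i] * true_loadings[j]
             for j in range(4)] for i in range(4)]
    estimated, communality = paf_one_factor(corr)
    print([round(l, 3) for l in estimated])
```

When the correlation matrix is generated exactly from a single factor, as in the usage example, the iteration should return loadings close to the generating values. With real data, the result depends on the starting communalities and on how well a one-factor model fits.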
