
Confirmatory factor analysis

In statistics, confirmatory factor analysis (CFA) is a special form of factor analysis, most commonly used in social research. It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct (or factor). As such, the objective of confirmatory factor analysis is to test whether the data fit a hypothesized measurement model. This hypothesized model is based on theory and/or previous analytic research. CFA was first developed by Jöreskog and has built upon and replaced older methods of analyzing construct validity, such as the multitrait-multimethod (MTMM) matrix described in Campbell & Fiske (1959).

In confirmatory factor analysis, the researcher first develops a hypothesis about what factors they believe underlie the measures used (e.g., 'Depression' being the factor underlying the Beck Depression Inventory and the Hamilton Rating Scale for Depression) and may impose constraints on the model based on these a priori hypotheses. By imposing these constraints, the researcher forces the model to be consistent with their theory. For example, if it is posited that there are two factors accounting for the covariance in the measures, and that these factors are unrelated to one another, the researcher can create a model in which the correlation between factor A and factor B is constrained to zero. Model fit measures can then be obtained to assess how well the proposed model captures the covariance between all the items or measures in the model. If the constraints the researcher has imposed on the model are inconsistent with the sample data, then the results of statistical tests of model fit will indicate a poor fit, and the model will be rejected. Poor fit may arise because some items measure multiple factors, or because some items within a factor are more related to each other than to others.

For some applications, the requirement of 'zero loadings' (for indicators not supposed to load on a certain factor) has been regarded as too strict. A more recently developed method, 'exploratory structural equation modeling', specifies hypotheses about the relation between observed indicators and their supposed primary latent factors while also allowing loadings on other latent factors to be estimated.

In confirmatory factor analysis, researchers are typically interested in studying the degree to which responses on a p × 1 vector of observable random variables can be used to assign a value to one or more unobserved latent variable(s) ξ. The investigation is largely accomplished by estimating and evaluating the loading of each item used to tap aspects of the unobserved latent variable.
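As an illustration of this kind of constraint, the sketch below builds the model-implied covariance matrix for a hypothetical two-factor model with three indicators per factor. The loading and error-variance values are invented for the example; the zero off-diagonal entry of the factor covariance matrix encodes the a priori constraint that factor A and factor B are uncorrelated, and the zeros in the loading matrix encode the 'zero loadings' requirement discussed above.

```python
import numpy as np

# Hypothetical 6 x 2 loading matrix: the first three indicators load only on
# factor A, the last three only on factor B. Zeros encode the constraint
# that an indicator does not load on the other factor.
Lambda = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.9],
    [0.0, 0.7],
    [0.0, 0.5],
])

# Factor covariance matrix: the off-diagonal 0.0 is the constraint that
# the correlation between factor A and factor B equals zero.
Phi = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
])

# Unique (error) variances of the six indicators on the diagonal.
Theta = np.diag([0.36, 0.51, 0.64, 0.19, 0.51, 0.75])

# Model-implied covariance matrix: Sigma = Lambda Phi Lambda' + Theta.
# Fit assessment compares this matrix against the observed covariance matrix.
Sigma = Lambda @ Phi @ Lambda.T + Theta
print(np.round(Sigma, 3))
```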
That is, Y is the vector of observed responses predicted by the unobserved latent variable ξ. The measurement model is defined as

$$Y = \Lambda \xi + \epsilon,$$

where $Y$ is the p × 1 vector of observed random variables, $\xi$ is the unobserved latent variable (a vector of latent variables in the multidimensional case), and $\Lambda$ is a p × k matrix of loadings, with k equal to the number of latent variables. Because the $Y$ are imperfect measures of $\xi$, the model also includes an error term $\epsilon$.

Estimates in the maximum likelihood (ML) case are generated by iteratively minimizing the fit function

$$F_{ML} = \ln\left|\Lambda \Omega \Lambda' + I - \mathrm{diag}(\Lambda \Omega \Lambda')\right| + \mathrm{tr}\left(R\left(\Lambda \Omega \Lambda' + I - \mathrm{diag}(\Lambda \Omega \Lambda')\right)^{-1}\right) - \ln|R| - p,$$

where $\Lambda \Omega \Lambda' + I - \mathrm{diag}(\Lambda \Omega \Lambda')$ is the variance-covariance matrix implied by the proposed factor analysis model and $R$ is the observed variance-covariance matrix. That is, values are found for the freed model parameters that minimize the difference between the model-implied and observed variance-covariance matrices.

Although numerous algorithms have been used to estimate CFA models, maximum likelihood remains the primary estimation procedure. That said, CFA models are often applied to data conditions that deviate from the normal-theory requirements for valid ML estimation. For example, social scientists often estimate CFA models with non-normal data and indicators scaled using discrete ordered categories. Accordingly, alternative algorithms have been developed to handle the diverse data conditions applied researchers encounter. These alternative estimators fall into two general types: (1) robust estimators and (2) limited-information estimators.
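As a minimal sketch of this estimation procedure, the code below implements the fit function above for a one-factor model on standardized indicators and minimizes it numerically. The data are simulated and the true loading values are invented for illustration; a real analysis would use observed data and typically a dedicated SEM package.

```python
import numpy as np
from scipy.optimize import minimize

# Simulate n observations of p standardized indicators driven by one latent
# variable xi, so each indicator has unit variance by construction.
rng = np.random.default_rng(0)
p, n = 4, 500
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
xi = rng.standard_normal(n)
Y = np.outer(xi, true_loadings) + rng.standard_normal((n, p)) * np.sqrt(1.0 - true_loadings**2)

# Observed variance-covariance matrix R (a correlation matrix here, since
# the indicators are standardized).
R = np.corrcoef(Y, rowvar=False)

def f_ml(lam):
    """F_ML = ln|Sigma| + tr(R Sigma^-1) - ln|R| - p for a one-factor model."""
    Lam = lam.reshape(p, 1)
    implied = Lam @ Lam.T  # Lambda Omega Lambda', with Omega = 1 fixed for identification
    # Model-implied covariance: Lambda Omega Lambda' + I - diag(Lambda Omega Lambda')
    Sigma = implied + np.eye(p) - np.diag(np.diag(implied))
    return (np.linalg.slogdet(Sigma)[1]
            + np.trace(R @ np.linalg.inv(Sigma))
            - np.linalg.slogdet(R)[1]
            - p)

# Iteratively minimize the fit function over the freed loading parameters.
res = minimize(f_ml, x0=np.full(p, 0.5))
print(np.round(res.x, 3))  # estimated loadings, close to true_loadings
```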

[ "Clinical psychology", "Social psychology", "Statistics", "Machine learning", "Developmental psychology", "Gifted Rating Scales", "factorial invariance", "factorial validity", "Measurement invariance", "Drinking Refusal Self-Efficacy Questionnaire" ]
Parent Topic
Child Topic
    No Parent Topic