Factor Analysis and Exploratory Factor Analysis (EFA)
Factor Analysis (FA)
- Factor analysis is used to identify coherent subsets within a single set of variables that are relatively independent of one another.
- Variables that correlate are combined into factors reflecting underlying processes.
- Example: Personality measures and motivation scales combining into an "independence factor".
- Used in psychology for objective test development (personality, intelligence).
- Process:
- Start with many items.
- Administer to participants.
- Derive factors.
- Add/delete items based on factor analysis results.
- Repeat until the test has items forming factors that represent the area being measured.
- Test validity by predicting behavior based on factor scores.
- Goals:
- Summarize correlation patterns.
- Reduce variables to fewer factors.
- Provide operational definitions for underlying processes.
- Test theories about underlying processes.
- Factor analysis reduces numerous variables to a few factors.
- Mathematically, factor analysis creates linear combinations of observed variables, each being a factor.
- Factors summarize correlations and can reproduce the observed correlation matrix.
- The number of factors is fewer than the observed variables, achieving parsimony.
- Factor scores are often more reliable than individual observed variable scores.
- Steps:
- Select and measure variables.
- Prepare the correlation matrix.
- Extract factors.
- Determine the number of factors.
- Rotate factors for interpretability.
- Interpret results.
- Interpretability is key; a good factor analysis is logical.
- A factor is easily interpreted when several variables correlate highly with it and not with other factors.
- Verify factor structure by establishing construct validity, showing that latent variable scores covary with other variables or change with experimental conditions as theory predicts.
- Problems:
- No readily available criteria to test the solution, unlike regression.
- Infinite number of rotations, all explaining the same variance but defining factors differently. The choice depends on interpretability.
- Researchers may differ on the best solution due to different decisions, causing replication issues.
- Factor analysis is sometimes used to salvage poorly designed research; its ability to create apparent order out of chaos contributes to its somewhat tarnished reputation as a scientific tool.
- Types:
- Exploratory: describes and summarizes data by grouping correlated variables. Used early in research to consolidate variables and generate hypotheses.
- Confirmatory: tests theories about latent processes. Used in advanced stages with variables specifically chosen to reveal underlying processes. Often performed using structural equation modeling.
- Basic Terms:
- Observed correlation matrix: the correlation matrix of observed variables.
- Reproduced correlation matrix: correlation matrix implied by the factor solution.
- Residual correlation matrix: the difference between the observed and reproduced correlation matrices. In a good factor analysis, the correlations in the residual matrix should be small (a numeric sketch appears at the end of this section).
- Rotation: makes the solution more interpretable without altering mathematical properties.
- Orthogonal: factors are uncorrelated, producing a loading matrix (correlations between variables and factors). Factor interpretation relies on the loading matrix.
- Oblique: factors are correlated, producing:
- Factor correlation matrix: correlations among factors.
- Structure matrix: correlations between factors and variables.
- Pattern matrix: unique relationships between each factor and variable (used to ascertain factor meaning).
- Factor-score coefficients matrix: coefficients used to predict factor scores from observed variable scores.
- In factor analysis, only shared variance is analyzed; attempts are made to eliminate error and unique variance.
- Factors are thought to "cause" variables.
- Exploratory factor analysis is associated with theory development, while confirmatory factor analysis is associated with theory testing.
- Exploratory factor analysis question: "What underlying processes could have produced correlations among these variables?"
- Confirmatory factor analysis question: "Are the correlations among variables consistent with a hypothesized factor structure?"
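The observed/reproduced/residual distinction is easy to make concrete. Below is a minimal numpy sketch with a made-up orthogonal two-factor loading matrix and made-up observed correlations (all values are hypothetical, purely for illustration): the reproduced correlations are the loadings multiplied by their transpose, and the residual matrix is the observed minus the reproduced.

```python
import numpy as np

# Hypothetical orthogonal loadings: 4 observed variables on 2 factors.
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])

# Hypothetical observed correlation matrix for the same 4 variables.
R = np.array([[1.00, 0.60, 0.20, 0.25],
              [0.60, 1.00, 0.25, 0.30],
              [0.20, 0.25, 1.00, 0.55],
              [0.25, 0.30, 0.55, 1.00]])

# Reproduced correlation matrix implied by the factor solution: L L'.
# (Its diagonal holds the communalities rather than 1s.)
R_hat = L @ L.T

# Residual correlation matrix: inspect the off-diagonal values;
# in a good factor analysis they should all be small.
residual = R - R_hat
off_diagonal = residual[~np.eye(len(R), dtype=bool)]
print(f"largest absolute residual: {np.abs(off_diagonal).max():.3f}")
```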
Exploratory Factor Analysis (EFA)
- Exploratory factor analysis is an interdependence technique to define the underlying structure among variables.
- It identifies sets of interrelated variables (factors) assumed to represent dimensions within the data.
- Dimensions may guide composite measure creation or represent meaningful concepts.
- Example: Store atmosphere, defined by interrelated sensory components.
- The purpose of exploratory factor analysis is to condense information from original variables into composite dimensions (factors) with minimal information loss.
- Four key issues:
- Specifying the unit of analysis.
- Achieving data summarization/reduction.
- Variable selection.
- Using factor analysis results with multivariate techniques.
Specifying the Unit of Analysis
- Exploratory factor analysis identifies relationship structures among variables or respondents.
- R factor analysis: Applied to a correlation matrix of variables, aiming to identify latent dimensions.
- Q factor analysis: Applied to a correlation matrix of respondents, aiming to condense individuals into distinct groups. Less common due to computational difficulties; cluster analysis is often preferred.
- The researcher must choose the unit of analysis: variables or respondents.
Achieving Data Summarization Versus Data Reduction
- Exploratory factor analysis outcomes: data summarization and data reduction.
- Data summarization: Derives underlying dimensions describing data in smaller concepts.
- Data reduction: Derives empirical values (factor scores) for each dimension, substituting the original values.
Data Summarization
- Fundamental concept: definition of structure.
- Structure allows viewing variables at levels of generalization, from individual variables to grouped variables expressing a concept.
- Exploratory factor analysis differs from dependence techniques; all variables are simultaneously considered.
- Exploratory factor analysis employs variates (linear composite of variables) to maximize explanation of the variable set, not to predict dependent variables.
- Data summarization goal: defining a small number of factors adequately representing the original set of variables.
- Structure is defined by interrelatedness among variables, specifying fewer dimensions (factors).
Data Reduction
- Exploratory factor analysis achieves data reduction by:
- Identifying representative variables from a larger set for subsequent multivariate analyses.
- Creating a new set of variables (factor scores or summated scales) to replace the original set.
- The purpose is to retain the original variables' nature while simplifying subsequent analysis.
- Exploratory factor analysis provides an empirical basis for assessing variable structure and creating composite measures.
- Data summarization focuses on identifying dimensions.
- Data reduction uses factor loadings either to identify variables or to estimate factor scores, which replace original variables in subsequent analyses.
Variable Selection
- Researchers should consider conceptual underpinnings when using exploratory factor analysis for data reduction or summarization.
- Potential dimensions identified are based on variables submitted to exploratory factor analysis.
- Exploratory factor analysis will always produce factors, creating a risk of "garbage in, garbage out."
- Factor quality reflects the conceptual basis of included variables.
- Data summarization relies on a conceptual basis for variables analyzed.
- Even for data reduction, factor analysis is most efficient when conceptually defined dimensions are represented by the factors.
Using Exploratory Factor Analysis with Other Multivariate Techniques
- Factor analysis provides a clear understanding of which variables act in concert.
- Variables within a factor have similar profiles across groups in MANOVA or multiple discriminant analysis.
- Highly correlated variables affect stepwise procedures in multiple regression and multiple discriminant analysis.
- Knowledge of variable structure gives the researcher a better understanding of variable entry reasoning.
- Factor analysis creates new variables incorporating the original variables' nature, reducing problems associated with large numbers or high intercorrelations.
Designing an Exploratory Factor Analysis
- Exploratory factor analysis design involves:
- Calculating input data (correlation matrix).
- Designing the study (number of variables, measurement properties, types of variables).
- Determining sample size.
Correlations Among Variables or Respondents
- Exploratory factor analysis uses a correlation matrix as basic input.
- R-type exploratory factor analysis uses a traditional correlation matrix (correlations among variables).
- Q-type exploratory factor analysis derives correlations between individual respondents, identifying similar individuals.
- Focus will be on R-type exploratory factor analysis (grouping variables).
Variable Selection and Measurement Issues
- The primary requirement is that a correlation value can be calculated among all variables.
- Metric variables are easily measured by several types of correlations.
- Nonmetric variables are more problematic; avoid them if possible.
- If nonmetric variables are included, define dummy variables (coded 0-1).
- If all variables are dummy variables, specialized forms of factor analysis are more appropriate.
Number of Variables to Be Included in an Exploratory Factor Analysis
- Minimize the number of variables while maintaining a reasonable number per factor.
- Include several variables (5+) to represent each proposed factor.
- Exploratory factor analysis identifies patterns among variable groups; it's less useful with single-variable factors.
- Key variables should closely reflect hypothesized underlying factors.
Sample Size
- The minimum sample size for factor analysis is 50 observations, preferably 100 or larger.
- The general rule is to have at least 5 times as many observations as variables.
- An acceptable sample size would have a 10:1 ratio.
- Some suggest a minimum of 20 cases per variable.
- As the number of variables increases, the probability of spurious correlations increases.
- With 30 variables, 435 correlations are computed.
- At an alpha of 0.05, roughly 22 of those correlations (5% of 435) could be significant by chance alone (see the quick calculation after this list).
- Maximize the cases-per-variable ratio to minimize overfitting the data, which yields sample-specific factors.
- To achieve this, select the most parsimonious set of variables and obtain an adequate sample size.
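The spurious-correlation arithmetic above is easy to verify; a quick calculation in Python (standard library only):

```python
from math import comb

n_variables = 30
n_correlations = comb(n_variables, 2)   # unique variable pairs: 435
alpha = 0.05

# Expected number of "significant" correlations from chance alone.
expected_by_chance = alpha * n_correlations
print(n_correlations)                   # 435
print(expected_by_chance)               # 21.75 -> roughly 22
```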
Conceptual and Statistical Issues
Conceptual Issues
- Basic assumption: structure exists in the selected variables.
- Correlated variables and defined factors do not guarantee relevance.
- Researchers ensure observed patterns are conceptually valid.
- Mixing dependent and independent variables and then using derived factors to support dependence relationships is inappropriate.
- The sample should be homogeneous regarding factor structure.
- Applying exploratory factor analysis to a sample of males and females for sex-differing items is inappropriate.
- When differing groups are expected, separate factor analyses should be performed and results compared.
Statistical Issues
- Statistical departures from normality, homoscedasticity, and linearity diminish observed correlations.
- Normality is necessary if a statistical test is applied to the significance of factors, although tests are rarely used.
- Some degree of multicollinearity is desirable to identify interrelated variable sets.
Overall Measures of Intercorrelation
- It is essential to ensure the data matrix has sufficient correlations to justify the application of exploratory factor analysis.
- If all the correlations are low or equal, exploratory factor analysis is inappropriate.
- Visual inspection: if no substantial number of correlations with r > .30, factor analysis is probably inappropriate.
- Partial correlations: high partial correlations indicate the absence of true underlying factors (variables correlate in isolated pairs rather than through shared dimensions) and argue against factor analysis.
- Exception: two highly correlated variables with high loadings may also have high partial correlations.
- SPSS Factor Analysis provides the anti-image correlation matrix, which is the negative value of the partial correlation.
- Bartlett test of sphericity: provides a statistical test that the correlation matrix has significant correlations among at least some of the variables (a code sketch follows this list).
- Measure of sampling adequacy (MSA): ranges from 0 to 1, reaching 1 when each variable is perfectly predicted without error by the other variables. Guidelines:
- .80+ = meritorious
- .70+ = middling
- .60+ = mediocre
- .50+ = miserable
- < .50 = unacceptable
- The measure of sampling adequacy increases as:
- the sample size increases
- the average correlations increase
- the number of variables increases, or
- the number of factors decreases
- Always have an overall measure of sampling adequacy value of > .50 before proceeding with exploratory factor analysis.
- Variable-specific measure of sampling adequacy values can identify variables for deletion.
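These diagnostics are not SPSS-specific. Here is a sketch using the third-party Python package factor_analyzer (assumptions: the package is installed, and survey_items.csv is a hypothetical file of metric items):

```python
import pandas as pd
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity,
    calculate_kmo,
)

df = pd.read_csv("survey_items.csv")  # hypothetical data file

# Bartlett test of sphericity: a significant chi-square indicates the
# correlation matrix has significant correlations among some variables.
chi_square, p_value = calculate_bartlett_sphericity(df)

# KMO measure of sampling adequacy: per-variable values plus the overall value.
msa_per_variable, msa_overall = calculate_kmo(df)

print(f"Bartlett chi-square = {chi_square:.1f}, p = {p_value:.4f}")
print(f"Overall MSA = {msa_overall:.2f}")           # proceed only if > .50
print("Variables with MSA < .50:",
      list(df.columns[msa_per_variable < 0.50]))    # candidates for deletion
```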
Variable-Specific Measures of Intercorrelation
- In addition to visual examination, the measure of sampling adequacy guidelines are extended to individual variables.
- Examine the measure of sampling adequacy values for each variable and exclude those falling in the unacceptable range.
- First, delete the variable with the lowest measure of sampling adequacy and then recalculate the exploratory factor analysis.
- Repeat until all variables have an acceptable measure of sampling adequacy value.
- Then, the overall measure of sampling adequacy can be evaluated.
Deriving Factors and Assessing Overall Fit
- After variables are specified and the correlation matrix is prepared, the researcher is ready to apply exploratory factor analysis.
- Decisions must be made concerning:
- the method of extracting the factors (common factor analysis versus components analysis), and
- the number of factors selected to represent the underlying structure in the data.
Selecting the Factor Extraction Method
- Researchers can choose between two similar methods for defining (extracting) the factors: common or components.
- This decision must consider the objectives of the factor analysis and knowledge about variable relationships.
Partitioning the Variance of a Variable
- To select between methods, understand the variance for a variable and how it is partitioned.
- Variance represents the total amount of dispersion of values for a single variable about its mean.
- When a variable is correlated with another variable, the two share variance.
- In exploratory factor analysis, variables are grouped by their correlations, so it's important to understand shared versus unexplained variance.
- The total variance of any variable can be divided (partitioned) into three types of variance:
- Common variance: the variance in a variable that is shared with all other variables in the analysis (a variable's communality).
- Specific variance: variance associated with only that specific variable.
- Error variance: variance that cannot be explained by correlations and is due to unreliability in measuring the data.
- Total variance of any variable = common + specific + error variance.
Common Factor Analysis Versus Component Analysis
- Based on:
- the objectives of the exploratory factor analysis, and
- the amount of prior knowledge about the variance in the variables.
- Component analysis: used when the objective is to summarize most of the original information (variance) in a minimum number of factors for prediction purposes.
- Common factor analysis: used primarily to identify underlying factors or dimensions that reflect what the variables share in common.
- Component analysis (principal components analysis) is used when:
- Data reduction is a primary concern.
- Specific and error variance represent a small proportion of the total variance.
- Common factor analysis is used when:
- The objective is to identify the latent dimensions or constructs represented in the original variables.
- Little knowledge exists about specific and error variance, and the researcher wishes to eliminate it.
- Common factor analysis is viewed as more theoretically based due to its restrictive assumptions.
- However, common factor analysis has several problems:
- factor indeterminacy, which means that for any individual respondent, several different factor scores can be calculated from a single factor model result.
- calculation of the estimated communalities used to represent the shared variance.
- Research demonstrates similar results are obtained for both component and common factor analysis when the number of variables exceeds 30 or the communalities exceed 0.60 for most variables.
- By examining the unrotated factor matrix, the researcher can explore the potential for data reduction and obtain a preliminary estimate of the number of factors to extract.
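A sketch of both extraction approaches side by side, again assuming the factor_analyzer package and a hypothetical survey_items.csv (the choice of three factors here is arbitrary):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("survey_items.csv")   # hypothetical data file

# Component-style extraction: analyzes the total variance.
components = FactorAnalyzer(n_factors=3, method="principal", rotation=None)
components.fit(df)

# Common factor extraction (minimum residual): analyzes only the shared
# variance, via estimated communalities.
common = FactorAnalyzer(n_factors=3, method="minres", rotation=None)
common.fit(df)

# With 30+ variables, or communalities above .60 for most variables,
# the two unrotated loading matrices should be very similar.
print(components.loadings_.round(2))
print(common.loadings_.round(2))
```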
Criteria for the Number of Factors to Extract
- Both factor analysis methods are interested in the best linear combination of variables.
- The first factor is the single best summary of linear relationships.
- The second factor is the second-best linear combination, orthogonal to the first factor, derived from remaining variance.
- The goal is to retain a small number of factors that still adequately represent the entire set of variables.
- The key question is: How many factors to extract or retain?
- How many factors should be in the structure? How many factors can be reasonably supported?
Latent Root Criterion
- The rationale for the latent root criterion is that any individual factor should account for the variance of at least a single variable if it is to be retained for interpretation.
- Factors having latent roots or eigenvalues > 1 are considered significant; all factors with latent roots < 1 are considered insignificant.
- Using the eigenvalue for establishing a cutoff is most reliable when the number of variables is between 20 and 50.
- If the number of variables is <20, the tendency is for this method to extract a conservative number of factors (too few).
- If > 50 variables are involved, it is not uncommon for too many factors to be extracted.
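A minimal numpy sketch of the latent root criterion (the simulated X below merely stands in for a real cases-by-variables data matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))       # placeholder for real survey data

R = np.corrcoef(X, rowvar=False)     # correlation matrix of the variables
eigenvalues = np.linalg.eigvalsh(R)[::-1]   # latent roots, largest first

# Retain factors whose eigenvalue exceeds 1, i.e., factors accounting
# for at least one variable's worth of variance.
n_retained = int((eigenvalues > 1).sum())
print("latent roots:", eigenvalues.round(2))
print("factors retained:", n_retained)
```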
A Priori Criterion
- The a priori criterion can be applied when the number of factors to be extracted is known before performing the exploratory factor analysis.
- The analysis is instructed to stop when the desired number of factors has been extracted.
- This approach is useful when testing a theory or hypothesis about the number of factors to be extracted.
- It also can be justified in attempting to replicate other research and extract the same number of factors that was previously found.
Percentage of Variance Criterion
- This approach is based on achieving a specified cumulative percentage of total variance extracted by successive factors.
- The purpose is to ensure practical significance for the derived factors by ensuring that they explain at least a specified amount of variance.
- In the natural sciences the factoring procedure usually should not be stopped until the extracted factors account for at least 95% of the variance.
- In the social sciences, it is not uncommon to consider a solution that accounts for 60% of the total variance (and in some instances even less) as satisfactory.
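Because each eigenvalue divided by the number of variables is that factor's share of total variance, this criterion reduces to a cumulative sum. A sketch with hypothetical latent roots from a 10-variable extraction:

```python
import numpy as np

# Hypothetical latent roots from a 10-variable extraction (they sum to 10,
# the trace of the correlation matrix).
eigenvalues = np.array([3.2, 2.1, 1.3, 0.9, 0.7, 0.5, 0.5, 0.4, 0.2, 0.2])

cumulative_pct = np.cumsum(eigenvalues) / eigenvalues.sum() * 100
print(cumulative_pct.round(1))
# First three factors explain 66% of total variance: satisfactory by the
# social-science 60% rule of thumb, far short of a natural-science 95% bar.
```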
Scree Test Criterion
- The scree test is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure.
- Derived by plotting the latent roots against the number of factors in their order of extraction, and the shape of the resulting curve is used to evaluate the cut-off point.
- The point at which the curve first begins to straighten out is considered to indicate the maximum number of factors to extract.
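A scree plot takes a few lines with matplotlib; the same hypothetical latent roots as above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical latent roots, plotted in order of extraction.
eigenvalues = np.array([3.2, 2.1, 1.3, 0.9, 0.7, 0.5, 0.5, 0.4, 0.2, 0.2])
factor_numbers = np.arange(1, len(eigenvalues) + 1)

plt.plot(factor_numbers, eigenvalues, "o-")
plt.axhline(1.0, linestyle="--")   # latent root cutoff, for comparison
plt.xlabel("Factor (order of extraction)")
plt.ylabel("Eigenvalue (latent root)")
plt.title("Scree plot: retain factors before the curve straightens out")
plt.show()
```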
Heterogeneity of the Respondents
- If the sample is heterogeneous with regard to at least one subset of the variables, then the first factors will represent those variables that are more homogeneous across the entire sample.
- When the objective is to identify factors that discriminate among the subgroups of a sample, the researcher should extract additional factors beyond those indicated by the methods above.
Interpreting a Factor Matrix
- The task of interpreting a factor-loading matrix to identify the structure among the variables involves a five-step process to objectively work through loadings:
- Examine the factor matrix of loadings
- Identify significant loadings
- Assess the communalities of the variables
- Respecify the model
- Label the factors
Examine the Factor Matrix of Loadings
- The factor-loading matrix contains the factor loading of each variable on each factor.
- They may be either rotated or unrotated loadings.
- The factors are arranged in the output as columns, with each column of numbers representing the loadings of a single factor.
- If an oblique rotation has been used, two matrices of factor loadings are provided:
- Factor pattern matrix, which has loadings that represent the unique contribution of each variable to the factor
- Factor structure matrix, which has simple correlations between variables and factors, but these loadings contain both the unique variance between variables and factors and the correlation among factors.
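With factor_analyzer, an oblique rotation exposes all three matrices; the attribute names below follow that package's documentation for promax rotations (structure_ and phi_ are only populated for oblique solutions), and survey_items.csv is again a hypothetical file:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("survey_items.csv")          # hypothetical data file

fa = FactorAnalyzer(n_factors=3, rotation="promax")   # an oblique rotation
fa.fit(df)

pattern = fa.loadings_     # pattern matrix: unique variable-factor weights,
                           # used to ascertain factor meaning
structure = fa.structure_  # structure matrix: variable-factor correlations,
                           # which also reflect correlations among factors
phi = fa.phi_              # factor correlation matrix

print(pd.DataFrame(pattern, index=df.columns).round(2))
```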
Identify Significant Loadings
- The interpretation should start with the first variable on the first factor and move horizontally from left to right, looking for the highest loading for that variable on any factor.
- When the highest loading is identified, it should be marked if it is significant.
- This procedure should continue for each variable until all variables have been reviewed for their highest loading on a factor.
- Evaluate the factor matrix by underlining all significant loadings for a variable across all the factors.
- When a variable is found to have more than one significant loading, it is termed a cross-loading.
- The objective is to minimize the number of significant loadings on each row of the factor matrix.
- To do this, you may need to try different rotation methods until one eliminates the cross-loadings and thus defines a simple structure.
- If a variable persists in having cross-loadings, then the most likely solution is to delete the variable from the analysis.
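A sketch of the marking procedure with a hypothetical rotated loading matrix (the ±.40 cutoff is one common rule of thumb; the appropriate threshold also depends on sample size):

```python
import pandas as pd

# Hypothetical rotated loadings: 5 variables on 2 factors.
loadings = pd.DataFrame(
    [[0.72, 0.10],
     [0.65, 0.21],
     [0.05, 0.81],
     [0.12, 0.58],
     [0.45, 0.48]],                 # v5 loads on both factors
    index=["v1", "v2", "v3", "v4", "v5"],
    columns=["F1", "F2"],
)

significant = loadings.abs() >= 0.40   # mark significant loadings

for variable, row in significant.iterrows():
    if row.sum() == 0:
        print(f"{variable}: no significant loading -> candidate for deletion")
    elif row.sum() > 1:
        print(f"{variable}: cross-loading -> re-rotate or consider deletion")
```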
Assess the Communalities of the Variables
- Identify any variables that are not adequately accounted for by the factor solution.
- Identify any variable(s) lacking at least one significant loading.
- Examine each variable's communality, representing the amount of variance accounted for by the factor solution for each variable.
- View the communalities to assess whether the variables meet acceptable levels of explanation.
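For orthogonal factors, a variable's communality is simply the row sum of its squared loadings. A numpy sketch with the same hypothetical loadings as above (the .50 floor reflects the guideline that at least half of each variable's variance should be explained):

```python
import numpy as np

# Hypothetical rotated loadings: 5 variables on 2 orthogonal factors.
loadings = np.array([[0.72, 0.10],
                     [0.65, 0.21],
                     [0.05, 0.81],
                     [0.12, 0.58],
                     [0.45, 0.48]])

# Communality: variance in each variable accounted for by the solution.
communalities = (loadings ** 2).sum(axis=1)
print(communalities.round(2))                 # [0.53 0.47 0.66 0.35 0.43]
print("low communality (< .50) at rows:",
      np.where(communalities < 0.50)[0])      # candidates for respecification
```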
Respecify the Model
- Respecification is warranted when: a variable has no significant loadings; a variable's communality is deemed too low even though it has a significant loading; or a variable has a cross-loading.
- The researcher can apply any combination of the following remedies (from least to most extreme):
- Ignore those problematic variables and interpret the solution as is, which is appropriate if the objective is solely data reduction, but you must still note that the variables in question are poorly represented in the factor solution.
- Evaluate each of those variables for possible deletion, depending on the variable's overall contribution to the research as well as its communality index.
- Employ an alternative rotation method, particularly an oblique method if only orthogonal methods had been used.
- Decrease/increase the number of factors retained to see whether a smaller/larger factor structure will represent those problematic variables.
- Modify the type of factor model used (component versus common factor) to assess whether varying the type of variance considered affects the factor structure.
Label the Factors
- When an acceptable factor solution has been obtained in which all variables have a significant loading on a factor, the researcher then attempts to assign some meaning to the pattern of factor loadings.
- Variables with higher loadings are considered more important and have greater influence on the name or label selected to represent a factor.
- The final result will be a name or label that represents each of the derived factors as accurately as possible.
Validation of Exploratory Factor Analysis
- The final stage of undertaking an exploratory factor analysis is assessing the degree of generalizability of the results to the population and the potential influence of individual cases or respondents on the overall results.
Use of a Confirmatory Perspective
- The most direct method of validating the results is to move to a confirmatory analysis and assess the replicability of the results, either with a split sample in the original data set or with a separate sample.
Assessing Factor Structure Stability
- Another aspect of generalizability is the stability of the factor model results.
- Factor stability is primarily dependent on the sample size and on the number of cases per variable.
- Comparing the factor matrices estimated from split (or separate) samples provides an assessment of the robustness of the solution across samples.
Steps in Construct Development and Scale Development for Confirmatory Factor Analysis (CFA)
1. Defining individual constructs
2. Developing the overall measurement model
3. Designing a study to produce empirical results
4. Assessing measurement model validity
Defining individual constructs
The process begins by listing the constructs that will comprise the measurement model. Where a previously validated scale exists for a construct of interest, it can be applied again; when no suitable scale is available, the researcher may have to develop a new one. The process of designing a new construct measure involves a number of steps through which the researcher translates the theoretical definition of the construct into a set of specific measured variables. As such, it is essential that the researcher consider not only the operational requirements (e.g., number of items, dimensionality) but also establish the construct validity of the newly designed scale. Although designing a new construct measure may provide a greater degree of specificity, the researcher must also weigh the time and effort required by the scale development and validation process.
Developing the overall measurement model
In this stage, the need is to consider how all of the individual constructs will come together to form an overall measurement model. There are several key issues to this consideration:
Unidimensionality
Unidimensional measures mean that a set of measured variables (indicators) can be explained by only one underlying construct. Unidimensionality becomes critically important when more than two constructs are involved. In such a situation, each measured variable is hypothesized to relate to only a single construct. All cross-loadings are hypothesized to be zero when unidimensional constructs exist.
Congeneric Measurement Model
A measurement model is constrained by the model's hypotheses; the constraints refer specifically to the set of fixed parameter estimates. One common constraint is a measurement model hypothesized to consist of several unidimensional constructs with all cross-loadings constrained to zero. When, in addition, the measurement model hypothesizes no covariance between or within construct error variances (all fixed at zero), the measurement model is said to be congeneric.
Items per Construct
Good practice dictates a minimum of three items per factor, preferably four, both to provide minimum coverage of the construct's theoretical domain and to provide adequate identification for the construct. Assessing the construct validity of single-item measures is problematic; when single items are included, they typically do not represent latent constructs.
Reflective Versus Formative Constructs
The issue of causality affects measurement theory. Typically, in Psychology we study latent factors thought to cause measured variables. However, there are times when the causality may be reversed. The contrasting direction of causality leads to different measurement approaches: reflective versus formative measurement models.
- Reflective measurement theory is based on the idea that latent constructs cause the measured variables and that the error results in an inability to fully explain these measured variables. Hence, the arrows are drawn from latent constructs to the measured variables.
- Formative measurement theory is modelled based on the assumption that the measured variables cause the construct. The error in formative measurement models, therefore, is an inability of the measured variables to fully explain the construct. A key assumption is that formative constructs are not considered latent. Instead, they are viewed as indices where each indicator is a cause of the construct.
Designing a study to produce empirical results
The third stage involves designing a study that will produce confirmatory results. Initial data analysis procedures should first be performed to identify any problems in the data, including issues such as data input errors. After conducting these preliminary analyses, the researcher must make some key decisions on designing the CFA model.
Measurement scales in CFA
CFA models typically contain reflective indicators measured on an ordinal or better measurement scale (i.e., interval or ratio). Indicators with ordinal responses of at least four response categories can be treated as interval, or at least as if the variables are continuous.
CFA and sampling
CFA in most cases will require the use of multiple samples. Testing measurement theory generally requires multiple studies and/or samples. An initial sample can be examined with exploratory factor analysis and the results used for further purification.
Specifying the model
CFA, not exploratory factor analysis, should be used to test the measurement model. Exploratory factor analysis provides insight into the structure of the items and may be helpful in proposing the measurement model, but it does not test a theory.
Issues in identification
Once the measurement model is specified, the researcher must revisit the issues relating to identification of the overall model. Overidentification is the desired state for CFA models in general. During the estimation process, the most likely cause of the software producing meaningless results is a problem with statistical identification.
Meeting the Order and Rank Conditions
The order and rank conditions are the required mathematical properties for identification.
- Order condition = the requirement that the degrees of freedom for a model be > 0.
- Rank condition = the requirement that each parameter be estimated by a unique relationship (equation). In CFA models, breaches of the rank condition are most likely in the presence of cross-loading items and/or correlated error terms; hence you should avoid cross-loadings and correlated errors in your CFA models.
Three-indicator rule
Given the difficulty of establishing the rank condition, most researchers use more general guidelines, which include the three-indicator rule.
- The three-indicator rule is met when all factors in a congeneric model have at least three significant indicators.
- A two-indicator rule also states that a congeneric factor model with two significant items per factor will be identified as long as each factor also has a significant relationship with some other factor.
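The order condition is simple arithmetic. A sketch for a one-factor congeneric model with the factor variance fixed at 1 (so only loadings and error variances are free parameters) shows why three indicators is the floor:

```python
def one_factor_df(n_indicators: int) -> int:
    """Degrees of freedom for a one-factor congeneric model,
    with the factor variance fixed at 1 for identification."""
    moments = n_indicators * (n_indicators + 1) // 2  # unique (co)variances
    free_parameters = 2 * n_indicators                # loadings + error vars
    return moments - free_parameters

for p in (2, 3, 4):
    print(p, "indicators -> df =", one_factor_df(p))
# 2 -> -1 (underidentified on its own), 3 -> 0 (just-identified),
# 4 -> 2 (overidentified: the desired state)
```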
Several options are possible when Heywood cases (offending estimates such as negative error variances or standardized loadings above 1) arise.
- ensure construct validity, which may involve the elimination of an offending item
- try to add more items if possible, or assume tau-equivalence (all loadings in that construct are equal)
- a "last resort" solution, which is to fix the offending estimate to a very small value, such as .005
Assessing measurement model validity
Assessing fit
The sample data are represented by a covariance matrix of measured items, and the theory is represented by the proposed measurement model. Fit compares the two covariance matrices, and the usual goodness-of-fit guidelines apply. The result is that confirmatory factor analysis enables us to test, or confirm, whether a theoretical measurement model is valid.
Path estimates
One of the most fundamental assessments of construct validity involves the measurement relationships between items and constructs (i.e., the path estimates linking constructs to indicator variables). When testing a measurement model, the researcher should expect to find relatively high loadings.
CFA models also typically display the squared multiple correlation for each measured variable, which represents the extent to which a measured variable's variance is explained by a latent factor. From a measurement perspective, it represents how well an item measures a construct. Squared multiple correlations are sometimes referred to as item reliability, communality, or variance extracted.
Standardised loadings of at least .5 and ideally .7 or higher confirm that the indicators are strongly related to their associated constructs and are one indication of construct validity.
Construct validity
Validity is the extent to which a measure accurately represents the concept it is intended to measure. CFA eliminates the need to summate scales because latent construct scores are computed for each participant. This process allows relationships between constructs to be automatically corrected for the error variance that exists in the construct measures.
Convergent validity
The items that are indicators of a specific construct should converge or share a high proportion of variance in common, known as convergent validity.
Methods to estimate the relative amount of convergent validity among item measures in CFA include:
- Factor Loadings.
- Average Variance Extracted.
- Reliability is also an indicator of convergent validity.
Discriminant validity
Discriminant validity is the extent to which a construct is truly distinct from other constructs. Thus, high discriminant validity provides evidence that a construct is unique and captures phenomena other measures do not. CFA provides two ways of assessing discriminant validity.
- The correlation between any two constructs can be specified (fixed) as equal to one, and the fit of that constrained model compared with the fit of the unconstrained model; a significantly better fit for the unconstrained model supports discriminant validity.
- A more rigorous test is to compare the average variance-extracted values for any two constructs with the square of the correlation estimate between those two constructs. The variance-extracted estimates should be greater than the squared correlation estimate.
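The second comparison is often called the Fornell-Larcker criterion. A sketch with hypothetical standardized loadings for two constructs and a hypothetical inter-construct correlation:

```python
import numpy as np

loadings_a = np.array([0.78, 0.72, 0.69])   # hypothetical, construct A
loadings_b = np.array([0.81, 0.75, 0.70])   # hypothetical, construct B
corr_ab = 0.55                              # hypothetical A-B correlation

# Average variance extracted: mean of the squared standardized loadings.
ave_a = np.mean(loadings_a ** 2)
ave_b = np.mean(loadings_b ** 2)

# Both AVEs should exceed the squared inter-construct correlation.
squared_corr = corr_ab ** 2
print(f"AVE(A) = {ave_a:.2f}, AVE(B) = {ave_b:.2f}, r^2 = {squared_corr:.2f}")
print("discriminant validity supported:",
      bool(ave_a > squared_corr and ave_b > squared_corr))
```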
Model diagnostics
CFA's ultimate goal is to obtain an answer as to whether a given measurement model is valid. But the process of testing using CFA provides additional diagnostic information that may suggest modifications for either addressing unresolved problems or improving the model’s test of measurement theory.
Multiple diagnostic cues are provided when using CFA:
- Standardised residuals: the raw residuals divided by the standard error of the residual.
- Modification indices: calculated for every possible relationship that is not estimated in the model; each index approximates how much the model's fit would improve if that parameter were freed.
- Specification searches: an empirical trial-and-error approach that uses model diagnostics to suggest changes in the model.