Reliability

  • Reliability of a measure is its consistency; reliable measures support replicable studies  

  • Replicability is made more likely by using clear operational definitions and appropriate sampling techniques  

 

Sampling procedures  

  • Simple random sampling = every individual in the population has an equal chance of being included (selecting one person does not change the chance of selecting another) 

  • Systematic sampling = individuals drawn from the population using a fixed interval frame (example every 20th person) 

  • Stratified sampling = used for heterogeneous populations (large degree of variation in traits) in order to represent differences proportionally 

  • Cluster sampling = used if you don’t have a sampling frame, or if the population is widely dispersed but can be subdivided into clusters 

  • Convenience sampling = recruiting participants (example friends and family) from wherever possible, not randomly or according to a system 

  • Volunteer sampling = relying on volunteers to come forward following an advertisement; can be extended using snowball sampling (participants help researchers find new subjects) 

  • Selection bias examples  

  1. Those at home at a particular time (when the survey occurs) 

  2. Those who respond to the survey (conscientious) 

  3. Those who volunteer to take part in a study (motivated) 
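A minimal sketch of the first three sampling procedures using Python's standard library; the population of 100 IDs, the urban/rural strata, and the sample sizes are all made-up illustrations:

```python
import random

# Hypothetical population of 100 participant IDs
population = list(range(1, 101))

# Simple random sampling: every individual has an equal chance of inclusion
simple = random.sample(population, 10)

# Systematic sampling: fixed interval frame (every 20th person),
# starting from a random offset within the first interval
interval = 20
start = random.randrange(interval)
systematic = population[start::interval]

# Stratified sampling: sample each stratum in proportion to its size
# (the urban/rural split is invented for illustration)
strata = {"urban": population[:70], "rural": population[70:]}
fraction = 0.1
stratified = [person
              for group in strata.values()
              for person in random.sample(group, round(len(group) * fraction))]

print(len(simple), len(systematic), len(stratified))  # 10 5 10
```

Note how stratified sampling keeps the 70/30 urban/rural proportion in the sample (7 and 3 of 10), which is exactly the "represent differences proportionally" idea above.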

 

Validity  

  • Validity = refers to the nature of the construct measured  

  1. A measure is neither valid nor invalid in itself; validity concerns the interpretations and uses of a measure's scores 

  2. Validity needs to be determined in relation to purpose 

  3. There is no single statistic for validity 

  4. A process of building up evidence about what we can and cannot infer from scores  

  • Principle of situational specificity = measures need to be continually checked for their validity as situations of use differ and change 

  • Reliability v validity  

  1. Reliability is a quantitative property of test responses but validity is a property of the interpretation of test scores  

  2. If a measure is not reliable it cannot be valid, but a measure can be reliable and not valid (reliability is a necessary but not sufficient criterion for validity) 

  • Concept of validity categories  

  1. Content validity = a measure based on evaluation of the subjects, topics, and content covered by items in the test (evidence based on test content) 

  2. Criterion-related validity = a measure of validity obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures (evidence based on relations with other variables) 

  3. Construct validity = a measure arrived at by comprehensively analyzing how scores relate to scores on other tests and measures, and how scores on the test can be understood within some theoretical framework for the construct the test was designed to measure (evidence based on internal structure) 

  4. While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something. 

  • Face validity = judgment concerning how relevant the test items appear to be  

  1. example is there a logical connection between the items on questionnaire and the construct 

  2. If a test appears to measure what it claims to measure 'on the face of it', it has high face validity 

  3. More relevant in earlier stages  

  • Content validity = a judgement of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample  

  1. Does the content adequately reflect the construct being assessed? To what extent does it cover the diverse features and characteristics of the construct? 

  2. Requires expert knowledge or in-depth research (example: provide a theoretical rationale and literature review / ensure the sample of items is sufficiently representative with an appropriate difficulty range / describe the measures / have items evaluated by independent experts) 

 

Factor analysis 

  • Ceiling effects = occur when a considerable percentage of participants obtain the best or maximum possible score 

  • Floor effects = occur when the opposite happens, i.e. a considerable percentage of participants obtain the worst or minimum possible score 

  • Factor analysis = examines patterns/relationships for large number of variables, determines whether the info can be condensed or summarized in a smaller set of factors or components  

  1. Alt def = defines set of underlying dimensions within a larger group of items  

  2. Data reduction method  

  • Factor analysis = an interdependence statistical technique  

  1. Groups items on a test together according to their patterns of correlation  

  2. Designed to determine if there is a general factor behind these correlations 

  3. Example: is there a construct that can be attributed to the relationships between results on various IQ tests? 

  4. Factor analysis allows items in a single test to be grouped together  

  5. Can be used to form coherent subsets or subscales  

  • Is it appropriate to do factor analysis  

  1. Type of factor analysis = exploratory or confirmatory 

  2. Suitability of data = distribution; interval or ratio level of measurement 

  3. Sample size = case-to-variable ratio 

  4. Items sufficiently correlated = correlation matrix / Bartlett's test of sphericity – significance (tests the null hypothesis that there are no relationships between the variables) / measure of sampling adequacy: overall and per item (Kaiser-Meyer-Olkin level) 

  • Steps for assessing if something is suitable for factor analysis 

  1. Examine the correlation matrix = there must be many correlations > 0.3 (check why items do not correlate well; items with very strong correlations might measure the same thing; items with low correlations may not tap the construct of interest) 

  2. Bartlett's test significance = tests whether the correlation matrix sufficiently departs from an identity matrix (no correlations) 

  3. KMO (Kaiser-Meyer-Olkin) = the closer to 1, the more appropriate the data are for factor analysis (typically > 0.7) 

  4. Individual measures of sampling adequacy (MSA) need to be above 0.7 = measure the strength of an item's correlations with the other matrix items 

Validity (summary)

  • Validity = refers to the construct measured  

  • Content validity = evidence based on test content  

  1. A judgement of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample  

  2. Does the content reflect the construct being assessed? How diverse are its features and characteristics? 

  3. Requires a degree of expert knowledge of the construct or in-depth research  

  4. Provide a theoretical rationale and literature review, ensure the item sample is representative, describe the scales, have items evaluated by experts  

  • Criterion-related validity = evidence based on relations with other variables  

  • Construct validity = evidence based on internal structure
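The Bartlett and KMO checks described under factor analysis can be computed directly from the item correlation matrix. A sketch with NumPy on simulated data (the two-cluster item structure and all numbers are invented): Bartlett's statistic is χ² = −(n − 1 − (2p + 5)/6)·ln|R|, and KMO compares squared correlations with squared partial correlations obtained from the inverse of R:

```python
import numpy as np

# Simulated data: 200 respondents x 6 items forming two correlated clusters
rng = np.random.default_rng(42)
f1 = rng.normal(size=(200, 1))
f2 = rng.normal(size=(200, 1))
X = np.hstack([f1 + rng.normal(scale=0.5, size=(200, 3)),
               f2 + rng.normal(scale=0.5, size=(200, 3))])

R = np.corrcoef(X, rowvar=False)          # 6x6 item correlation matrix
n, p = X.shape

# Bartlett's test of sphericity: does R depart from an identity matrix?
chi_sq = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) / 2                      # compare chi_sq to chi-square with df degrees

# Overall KMO: squared correlations vs squared partial correlations,
# where partial correlations come from the inverse of R (anti-image)
inv_R = np.linalg.inv(R)
partial = -inv_R / np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
mask = ~np.eye(p, dtype=bool)             # off-diagonal entries only
r_sq = (R[mask] ** 2).sum()
kmo = r_sq / (r_sq + (partial[mask] ** 2).sum())

print(f"Bartlett chi2 = {chi_sq:.1f} (df = {df:.0f}), KMO = {kmo:.2f}")
```

With a clear two-factor structure like this, Bartlett's χ² is large (significant departure from identity) and the KMO comfortably exceeds the 0.7 rule of thumb mentioned above, so factor analysis would be deemed appropriate.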