Reliability

  • Reliability of a measure is its consistency; reliable measures support replicable studies  

  • Replicability is made more likely by using clear operational definitions and appropriate sampling techniques  

 

Sampling procedures  

  • Simple random sampling = every individual in the population has an equal chance of being included (selecting one person does not change the chance of selecting another) 

  • Systematic sampling = individuals drawn from the population using a fixed interval frame (example every 20th person) 

  • Stratified sampling = used for heterogeneous populations (large degree of variation in traits) in order to represent differences proportionally 

  • Cluster sampling = used if you don’t have a sampling frame, or if the population is widely dispersed but can be subdivided into clusters 

  • Convenience sampling = recruiting participants (example friends and family) from wherever possible, not randomly or according to a system 

  • Volunteer sampling = relying on volunteers to come forward following an advertisement; can be extended using snowball sampling (participants help researchers find new subjects) 

  • Selection bias examples  

  1. Those at home at a particular time (when the survey occurs) 

  2. Those who respond to the survey (conscientious) 

  3. Those who volunteer to take part in a study (motivated) 
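A minimal sketch of the first three sampling procedures using Python's standard library; the population of 100 IDs, the urban/rural strata, and the sample sizes are all made-up illustrations:

```python
import random

# Hypothetical population of 100 participant IDs
population = list(range(1, 101))

# Simple random sampling: every individual has an equal chance of inclusion
simple = random.sample(population, 10)

# Systematic sampling: fixed interval frame (every 20th person),
# starting from a random offset within the first interval
interval = 20
start = random.randrange(interval)
systematic = population[start::interval]

# Stratified sampling: sample each stratum in proportion to its size
# (the urban/rural split is invented for illustration)
strata = {"urban": population[:70], "rural": population[70:]}
fraction = 0.1
stratified = [person
              for group in strata.values()
              for person in random.sample(group, round(len(group) * fraction))]

print(len(simple), len(systematic), len(stratified))  # 10 5 10
```

Note how stratified sampling keeps the 70/30 urban/rural proportion in the sample (7 and 3 of 10), which is exactly the "represent differences proportionally" idea above.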

 

Validity  

  • Validity = refers to the nature of the construct measured  

  1. A measure is neither valid nor invalid in itself; validity concerns the interpretations and uses of a measure's scores 

  2. Validity needs to be determined in relation to purpose 

  3. There is no single statistic for validity 

  4. A process of building up evidence about what we can and cannot infer from scores  

  • Principle of situational specificity = measures need to be continually checked for their validity as situations of use differ and change 

  • Reliability v validity  

  1. Reliability is a quantitative property of test responses but validity is a property of the interpretation of test scores  

  2. If a measure is not reliable it cannot be valid, but a measure can be reliable and not valid (reliability is a necessary but not sufficient criterion for validity) 

  • Concept of validity categories  

  1. Content validity = a measure based on evaluation of the subjects, topics, and content covered by items in the test (evidence based on test content) 

  2. Criterion-related validity = a measure of validity obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures (evidence based on relations with other variables) 

  3. Construct validity = a measure arrived at by comprehensively analyzing how scores relate to scores on other tests and measures, and how scores on the test can be understood within some theoretical framework for the construct the test was designed to measure (evidence based on internal structure) 

  4. While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something. 

  • Face validity = judgment concerning how relevant the test items appear to be  

  1. example is there a logical connection between the items on questionnaire and the construct 

  2. If a test appears to measure what it claims to measure 'on the face of it', it has high face validity 

  3. More relevant in earlier stages  

  • Content validity = a judgement of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample  

  1. Does the content adequately reflect the construct being assessed? To what extent does it cover the diverse features and characteristics of the construct? 

  2. Requires expert knowledge or in-depth research (example: provide a theoretical rationale and literature review / ensure the sample of items is sufficiently representative with an appropriate difficulty range / describe the measures / have items evaluated by independent experts) 

 

Factor analysis 

  • Ceiling effects = occur when a considerable percentage of participants obtain the best or maximum possible score 

  • Floor effects = occur when the opposite happens, i.e. a considerable percentage of participants obtain the worst or minimum possible score 

  • Factor analysis = examines patterns/relationships for large number of variables, determines whether the info can be condensed or summarized in a smaller set of factors or components  

  1. Alt def = defines set of underlying dimensions within a larger group of items  

  2. Data reduction method  

  • Factor analysis = an interdependence statistical technique  

  1. Groups items on a test together according to their patterns of correlation  

  2. Designed to determine if there is a general factor behind these correlations 

  3. Example: is there a construct that can be attributed to the relationships between results on various IQ tests? 

  4. Factor analysis allows items in a single test to be grouped together  

  5. Can be used to form coherent subsets or subscales  

  • Is it appropriate to do factor analysis  

  1. Type of factor analysis = exploratory or confirmatory 

  2. Suitability of data = distribution; interval or ratio level of measurement 

  3. Sample size = case-to-variable ratio 

  4. Items sufficiently correlated = correlation matrix / Bartlett's test of sphericity – significance (tests the null hypothesis that there are no relationships between the variables) / measure of sampling adequacy: overall and per item (Kaiser-Meyer-Olkin level) 

  • Steps for assessing if something is suitable for factor analysis 

  1. Examine the correlation matrix = there must be many correlations > 0.3 (check why items do not correlate well; items with very strong correlations might measure the same thing; items with low correlations may not tap the construct of interest) 

  2. Bartlett's test significance = tests whether the correlation matrix sufficiently departs from an identity matrix (no correlations) 

  3. KMO (Kaiser-Meyer-Olkin) = the closer to 1, the more appropriate the data are for factor analysis (typically > 0.7) 

  4. Individual measures of sampling adequacy (MSA) need to be above 0.7 = measure the strength of an item's correlations with the other matrix items 

Validity (summary)

  • Validity = refers to the construct measured  

  • Content validity = evidence based on test content  

  1. A judgement of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample  

  2. Does the content reflect the construct being assessed? How diverse are its features and characteristics? 

  3. Requires a degree of expert knowledge of the construct or in-depth research  

  4. Provide a theoretical rationale and literature review, ensure the item sample is representative, describe the scales, have items evaluated by experts  

  • Criterion-related validity = evidence based on relations with other variables  

  • Construct validity = evidence based on internal structure
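The Bartlett and KMO checks described under factor analysis can be computed directly from the item correlation matrix. A sketch with NumPy on simulated data (the two-cluster item structure and all numbers are invented): Bartlett's statistic is χ² = −(n − 1 − (2p + 5)/6)·ln|R|, and KMO compares squared correlations with squared partial correlations obtained from the inverse of R:

```python
import numpy as np

# Simulated data: 200 respondents x 6 items forming two correlated clusters
rng = np.random.default_rng(42)
f1 = rng.normal(size=(200, 1))
f2 = rng.normal(size=(200, 1))
X = np.hstack([f1 + rng.normal(scale=0.5, size=(200, 3)),
               f2 + rng.normal(scale=0.5, size=(200, 3))])

R = np.corrcoef(X, rowvar=False)          # 6x6 item correlation matrix
n, p = X.shape

# Bartlett's test of sphericity: does R depart from an identity matrix?
chi_sq = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) / 2                      # compare chi_sq to chi-square with df degrees

# Overall KMO: squared correlations vs squared partial correlations,
# where partial correlations come from the inverse of R (anti-image)
inv_R = np.linalg.inv(R)
partial = -inv_R / np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
mask = ~np.eye(p, dtype=bool)             # off-diagonal entries only
r_sq = (R[mask] ** 2).sum()
kmo = r_sq / (r_sq + (partial[mask] ** 2).sum())

print(f"Bartlett chi2 = {chi_sq:.1f} (df = {df:.0f}), KMO = {kmo:.2f}")
```

With a clear two-factor structure like this, Bartlett's χ² is large (significant departure from identity) and the KMO comfortably exceeds the 0.7 rule of thumb mentioned above, so factor analysis would be deemed appropriate.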