Psychometrics
The science that underlies psychological and educational measurement.
Reliability
The degree to which test scores are dependable and consistent.
Validity
Most important quality in testing; valid tests measure what they claim to measure.
Test
Any standardized measurement procedure that leads to a numerical score.
Cognitive or performance test
Assess intelligence, academic skills, neuropsychological functioning, and speech and language development.
Measures of maximal performance
Examinee is required to give their best performance, typically involving tasks with right or wrong answers.
Measures of typical performance
Scales for measurement of personality traits and psychological problems, typically structured questionnaires.
Diagnostic test
Measures of both maximal and typical performance are considered diagnostic tests when they are used for clinical purposes.
Subtests
Sections of a test, each with its own set of items and its own numerical score.
Composite/Index Score
Performance totaled across multiple subtests.
Subscales
Item groupings within rating scales and personality questionnaires; each grouping generates its own score.
Examinee/Client/Student
The individual taking the test.
Examiner/Evaluator/Clinician/Practitioner
The individual administering the test.
Diagnosis
Applying a formal clinical diagnostic label.
Screening
Determining who needs a more thorough evaluation.
Identification
Other types of classification, such as special education, gifted, or high risk for suicide.
Progress Monitoring
Determining if an examinee's skills or traits are changing over time.
Diagnostic tests
Assessment tools that gather information; decisions should not be made solely on test scores.
Sample
Group of people for whom we have direct data (they took a test or participated in a study or in test development)
Population
Much larger group: all the people whom the sample is designed to represent
Descriptive statistics
Aim to describe a sample
Inferential Statistics
Tells us how confident we can be in making inferences about a population based on a sample
Univariate Statistics
When descriptive statistics are only about a single variable (classroom tests)
Sample Size
n or N
Measure of Central Tendency
Describe a sample using a single value that represents the entire sample; that value serves as a standard against which to judge any particular person's performance
Mean
Most common type of average, totaling all the scores in the sample and dividing by the number of scores
Median
Middle value in a set of test scores ordered from lowest to highest, 50% above and 50% below it
Mode
Most frequent score in a sample
Measures of dispersion
Quantify how variable a set of scores is
Range
Simplest, the difference between the highest and lowest scores in the set
Variance
Index of how far scores fall from the mean, calculated from deviation scores (each score minus the mean): each deviation score is squared, the squares are totaled, and that total is divided by the number of scores in the set
Standard Deviation (SD)
The square root of the variance; tells us how far away from the mean it is typical for a randomly picked score to fall
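The central tendency and dispersion cards above can be worked through on a small data set. This is a minimal sketch in Python; the scores are invented for illustration, and the variance divides by the number of scores, as the Variance card describes (some textbooks divide by n - 1 when estimating a population value from a sample).

```python
import statistics

# Hypothetical test scores, invented purely for illustration.
scores = [70, 80, 80, 90, 100]

n = len(scores)
mean = sum(scores) / n                    # total of scores / number of scores
median = statistics.median(scores)        # middle value of the ordered scores
mode = statistics.mode(scores)            # most frequent score
score_range = max(scores) - min(scores)   # highest minus lowest

# Variance: square each deviation score, total the squares, divide by n.
deviations = [x - mean for x in scores]
variance = sum(d ** 2 for d in deviations) / n

# Standard deviation: square root of the variance.
sd = variance ** 0.5

print(mean, median, mode, score_range, variance, round(sd, 2))
```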
Frequency Distribution
A method of organizing data that shows how often each score or category occurs in a dataset
Histogram
Graph that shows how frequently scores occur at each level of a continuous variable
Normal Distribution
Most scores cluster around the middle, with fewer at the extremes; the mean, median, and mode are equal and located at the peak of the curve
Empirical Rule (68-95-99.7 Rule)
In a normal distribution: 68% of scores fall within ±1 SD, 95% fall within ±2 SD, and 99.7% fall within ±3 SD
Z-Scores
Shows how many standard deviations a score is from the mean; positive z = above average, negative z = below average
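As a quick illustration of the z-score definition, the sketch below converts raw scores to z-scores. The function name and the scale (M = 100, SD = 15) are assumptions chosen for the example.

```python
def z_score(score, mean, sd):
    """Number of standard deviations that `score` lies from `mean`."""
    return (score - mean) / sd

# Assumed scale for illustration: mean 100, standard deviation 15.
z_above = z_score(130, 100, 15)   # positive: above average
z_below = z_score(85, 100, 15)    # negative: below average

print(z_above, z_below)
```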
Univariate Descriptive Statistics
Describes one variable, including central tendency, variability, frequency distribution
Bivariate Descriptive Statistics
Describes the relationship between two variables; foundational for psychometrics
Correlation Coefficient (r)
A statistic (Pearson r) showing the linear relationship between two variables
Regression Line
Also called the line of best fit; minimizes squared vertical distances to all points
r² (Coefficient of Determination)
Proportion of variance in outcome explained by predictor
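The three bivariate statistics above (Pearson r, the regression line, and r²) can be computed from the same sums of squares. This is a minimal sketch with invented paired scores; the variable names are illustrative only.

```python
# Hypothetical paired scores (predictor x, outcome y), invented for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of cross-products and squared deviations.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

# Pearson correlation coefficient.
r = sxy / (sxx * syy) ** 0.5

# Least-squares regression line: predicted y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Coefficient of determination: proportion of variance in y explained by x.
r_squared = r ** 2

print(round(r, 3), round(slope, 2), round(intercept, 2), round(r_squared, 2))
```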
Statistical Significance (p-values)
Probability of obtaining a sample r at least as large as the one observed if the population r = 0; p < .05 is the conventional threshold for statistical significance
Multiple Regression
Used when multiple predictors are used to predict one outcome
Effect Size for Group Differences: Cohen's d
Standardized mean difference; measures how far apart two group means are in SD units
Degree of Match Score
Indicates a client's compatibility with different jobs based on a personality test.
Interpretation of r = .60
Indicates a strong positive linear relationship between degree-of-match score and job satisfaction.
Variance Explained (r²)
The proportion of variance in one variable that can be explained by another variable.
r² = .36
Means 36% of the variance in job satisfaction is explained by the degree-of-match score.
Predictive Use of r = .60
Clients with higher match scores are more likely to be satisfied in jobs that align with those scores.
Statistical Significance (p < .01)
Indicates that a correlation this strong would occur less than 1% of the time by chance if there were no relationship in the population.
Practical Interpretation
The test is valid for predicting job satisfaction, but other factors also influence outcomes.
Cohen's d
A measure of effect size that indicates the standardized difference between two means.
Interpretation of d = 0.39
Indicates a small-to-moderate effect size, with girls scoring higher than boys.
p-value = .08
Indicates an 8% chance of observing a difference at least as large as the one in the sample if there were no difference in the population; not statistically significant at the .05 level.
Effect Size vs. Statistical Significance
Do not equate 'not significant' with 'no effect'; the effect size describes how large the observed difference is, whatever the p-value.
Empirical Rule
Describes how data is distributed in a normal distribution: ~68%, ~95%, ~99.7% within certain standard deviations.
Example of Empirical Rule
If M = 100, SD = 15, then ~68% of scores are between 85 and 115.
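The percentages in the empirical rule can be checked against the normal curve itself. The sketch below uses Python's standard-library `statistics.NormalDist` with the M = 100, SD = 15 scale from the card; the helper name `proportion_within` is an assumption for this example.

```python
from statistics import NormalDist

# Normal distribution with the mean and SD from the example card.
dist = NormalDist(mu=100, sigma=15)

def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    print(f"within {k} SD: {low:.0f} to {high:.0f}, proportion {proportion_within(k):.4f}")
```

The exact proportions are slightly different from the rounded 68-95-99.7 figures, which is why the rule is stated with approximate percentages.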
Reported Correlation
A correlation r indicates the relationship between test scores and another variable.
Statistical Significance of Correlation
The p-value indicates how likely a correlation at least as strong as the observed one would be if the population correlation were zero.
Pooled Standard Deviation
A weighted average of the two groups' standard deviations, used as the denominator of Cohen's d when the groups' standard deviations differ.
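One common way to pool two standard deviations weights each group's variance by its degrees of freedom. The sketch below uses that formula and then computes Cohen's d from it; the function names and the group statistics are hypothetical, and other pooling conventions exist.

```python
def pooled_sd(sd1, n1, sd2, n2):
    """Pooled SD: each group's variance weighted by its degrees of freedom (n - 1)."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return pooled_variance ** 0.5

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference: (mean1 - mean2) in pooled-SD units."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)

# Hypothetical group statistics, invented for illustration.
sd_pooled = pooled_sd(14, 30, 16, 30)
d = cohens_d(105, 14, 30, 100, 16, 30)

print(round(sd_pooled, 2), round(d, 2))
```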
Cohen's d Interpretation Guidelines
0.2 = small, 0.5 = medium, 0.8 = large.
Practical Importance of d
Consider the context and field norms when interpreting effect sizes.
Validity Evidence
Indicates how well a test measures what it claims to measure.
Conclusion on Test Validity
Validity evidence suggests the test is useful for its applied purpose.
Sample Size (n)
The number of observations or subjects in a study, influencing statistical significance.
Statistical Power
The probability that a statistical test will correctly reject a false null hypothesis.
Confidence Intervals
A range of values that is likely to contain the population parameter with a certain level of confidence.
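As an illustration, a confidence interval for a mean can be sketched with the large-sample normal approximation: the sample mean plus or minus a critical z value times the standard error. The function name and the input values are assumptions for this example; small samples would normally call for a t-based interval instead.

```python
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    # Critical z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the mean.
    margin = z * sd / n ** 0.5
    return mean - margin, mean + margin

# Hypothetical sample: mean 100, SD 15, n = 36.
low, high = mean_confidence_interval(100, 15, 36)
print(round(low, 1), round(high, 1))
```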
Reporting Findings
Include means, standard deviations, effect sizes, and significance levels in a concise format.
Group Differences
Comparisons between different groups, often analyzed using effect sizes like Cohen's d.
Effect Size Reporting
Always report both the p-value and effect size for a comprehensive understanding of results.
Cautions in Interpretation
Be cautious in interpreting results, especially with small sample sizes or non-significant findings.
What is the purpose of multiple regression?
To investigate the relationships between multiple predictors and a particular outcome.
What does the multiple correlation coefficient (R) indicate?
It indicates the strength of the relationship between the predictors and the outcome.
What does the squared multiple correlation coefficient (R²) represent?
It represents the proportion of variability in the outcome explained by the set of predictors.
What happens to R² as more predictors are added?
R² continues to grow until it approaches 1.0, indicating that 100% of the variability in the outcome is accounted for.
What are standardized regression coefficients also known as?
Beta weights.
What do beta weights indicate?
The strength of the relationship between each predictor and the outcome when controlling for other predictors.
In the context of reading comprehension, what are two predictors mentioned?
Oral reading speed and listening comprehension skills.
What does it mean if adding a second predictor does not significantly change R²?
It may indicate that the second predictor is not worth measuring.
What is incremental validity?
The ability of a measure to add unique information beyond what is provided by other measures.
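With exactly two predictors, beta weights, R², and incremental validity can all be computed from three correlations. The sketch below uses the standard two-predictor formulas; the correlation values are invented, and the labels (oral reading speed, listening comprehension) simply echo the reading example in these cards.

```python
def two_predictor_regression(r_y1, r_y2, r_12):
    """Beta weights and R-squared for two standardized predictors.

    r_y1, r_y2: correlation of each predictor with the outcome.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared

# Hypothetical correlations, invented for illustration:
# predictor 1 = oral reading speed, predictor 2 = listening comprehension.
r_y1, r_y2, r_12 = 0.60, 0.50, 0.40

beta1, beta2, r_squared = two_predictor_regression(r_y1, r_y2, r_12)

# Incremental validity of predictor 2: the gain in R-squared over predictor 1 alone.
incremental = r_squared - r_y1 ** 2

print(round(beta1, 3), round(beta2, 3), round(r_squared, 3), round(incremental, 3))
```

If the gain in R² were close to zero, the second predictor would add little unique information beyond the first.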
Why is multiple regression analysis useful for practitioners?
It helps identify which predictors are important and which measures may be superfluous.
What is an example of a situation where multiple regression could be applied?
Predicting children's reading comprehension skills using oral reading speed and listening comprehension.
What is the significance of controlling for other predictors in regression analysis?
It allows for a clearer understanding of the unique contribution of each predictor.
What does a high β (beta weight) value for a predictor indicate?
A strong relationship between that predictor and the outcome.
What is a potential limitation of using multiple regression?
In practice, only a few predictors are typically used at the same time.
What is the relationship between predictors and outcomes in multiple regression?
Predictors are used to explain variability in the outcome.
How can multiple regression help in test development?
It can identify which tests or measures are necessary and which can be omitted.
What is the role of listening comprehension in predicting reading comprehension?
It serves as an additional predictor that may enhance the prediction of reading comprehension.
What does it mean if a predictor is considered superfluous?
It does not significantly contribute to predicting the outcome.
What is the main focus of Chapter 5 mentioned in the text?
Further discussion on incremental validity.
What type of analysis is used to determine the importance of predictors?
Multiple regression analysis.
What is the primary focus of psychometric research regarding test scores?
Examining group differences in test scores.
What should a new measure of depression symptoms yield for individuals with a clinical diagnosis?
Higher scores than those in nondiagnosed individuals.
What might researchers analyze to understand the impact of test scores on treatment for minority groups?
Ethnic group differences in test scores.
What type of statistics are commonly used to analyze group differences?
Inferential statistics.
What does a p-value indicate in the context of group differences?
The likelihood of a group difference occurring by chance, assuming no actual difference in the population.
How does sample size affect the significance of p-values for group differences?
Even small group differences can be statistically significant in large samples.
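The effect of sample size on significance can be sketched by holding the standardized difference fixed and changing only the group size. The example below uses a large-sample normal approximation with equal group sizes rather than an exact t-test; the function name and the values are assumptions for illustration.

```python
from statistics import NormalDist

def approx_p_value(d, n_per_group):
    """Approximate two-tailed p-value for a standardized mean difference d.

    Assumes two groups of equal size and a normal approximation:
    the standard error of d is roughly sqrt(2 / n_per_group).
    """
    z = d / (2 / n_per_group) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same small difference (d = 0.1), two very different sample sizes.
p_small_sample = approx_p_value(0.1, 50)
p_large_sample = approx_p_value(0.1, 5000)

print(p_small_sample > 0.05, p_large_sample < 0.05)
```

The difference between the groups is identical in both cases; only the sample size changes whether it reaches statistical significance.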
What is Cohen's d?
An effect-size statistic that measures the standardized mean difference between two groups.
How is Cohen's d calculated?
By dividing the difference between the two group means by the pooled standard deviation of the groups.