Nominal
Nonparametric. Named categories with no implied order. One choice is not more important than the other. “name, gender, race, political affiliation”
Ordinal
Nonparametric. Adds hierarchy to data categories, but the size of the gap between answer choices is unknown. “how are you feeling?”
Interval
Parametric. Able to rank answer choices based on equidistant scaling, but has no true zero. “thermostat”
Ratio
Parametric. Allows for math to be applied to data. Has a true zero. “Height or weight, pulse rate”
Reliability
Refers to the consistency of test or measure
Validity
Refers to the accuracy of the test or measure. Must have high reliability as well.
Inter-rater reliability
How consistently raters evaluate or judge an IV. Can be impacted by the number of judges, and judges can influence each other. Uses Pearson Product-Moment Correlation (Pearson’s r)
Test stability. “Test-retest reliability”
Whether the measure is stable over time. Have singers sing multiple times. Can’t retest too soon or too long after the last test. Uses Pearson Product-Moment Correlation (Pearson’s r)
Internal consistency “Split-half reliability”
How well the items on the measure work together to produce similar scores. Singer scores consistently across multiple music genres. Taylor vs Beyonce singing “Dangerously in Love” and “Style”. Uses Cronbach’s Alpha
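A minimal sketch of the two statistics named in the reliability cards above: Pearson’s r (used for inter-rater and test-retest reliability) and Cronbach’s alpha (used for internal consistency). The function names and the scores are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cronbach_alpha(items):
    """items: one list of scores per item, all from the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical data: the same five singers scored on two occasions
first_session = [70, 82, 65, 90, 77]
second_session = [72, 80, 68, 91, 75]
print("test-retest r:", round(pearson_r(first_session, second_session), 3))

# Hypothetical data: three items answered by four respondents
print("alpha:", round(cronbach_alpha([[3, 4, 2, 5], [3, 5, 2, 4], [2, 4, 3, 5]]), 3))
```

Values of r or alpha closer to 1 indicate more consistent measurement.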
Increase inter-rater reliability by…
Ensure proper training among raters
Increase test stability and internal consistency by…
Have a sufficient number of questions, ensure questions are easily understood, increase sample size (only include those for whom the measure is intended)
Face validity
See whether “on its face” the items seem like a good translation of the construct
Content validity
Addresses how well test questions match the content or subject area they are intended to assess. Judged by experts in a field.
Predictive validity
How well a certain measure can predict future behavior or performance. SAT —> college GPA
Convergent validity
Degree to which a construct measure “converges” with other measures that should be measuring the same thing. Comparing measures from mental health diagnostics
Discriminant validity
Degree to which a construct measure “diverges” from other measures that purport to be measuring something different. Want to be low.
Statistical significance testing
Communicates PROBABILITY by telling us how likely the current result would be if the study’s null hypothesis were true
Answers the question: “Do we think something happened?”
Quantify whether a result is likely due to chance or due to some factor of interest
Random sampling error
Natural deviations that occur when randomly sampling from the population. Unavoidable but can be measured
Bias
Flawed sampling procedures where researchers do not use a representative sample. Can’t be measured, but can be controlled
Sampling distribution
The distribution of a sample statistic if all possible samples were drawn from a given population
Central Limit Theorem
The mean of the sampling distribution will equal the mean of the population
If the sample size is of sufficient size, the sampling distribution tends to be normal regardless of the shape of the original population distribution
As the sample size increases, the standard deviation of the sample distribution (standard error) decreases
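The three Central Limit Theorem statements above can be checked with a small simulation. This is a sketch under assumed settings: the skewed (exponential) population, the sample sizes, and the number of draws are all arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)
# Hypothetical, strongly skewed population (exponential, mean about 1)
population = [random.expovariate(1.0) for _ in range(50_000)]

def sample_means(n, draws=2000):
    """Draw `draws` random samples of size n and return each sample's mean."""
    return [mean(random.sample(population, n)) for _ in range(draws)]

print("population mean:", round(mean(population), 3))
for n in (5, 30, 100):
    means = sample_means(n)
    # mean of the sample means stays near the population mean;
    # their standard deviation (the standard error) shrinks as n grows
    print(n, round(mean(means), 3), round(stdev(means), 3))
```

The standard deviation printed in the last column is the simulated standard error, which should fall roughly in proportion to 1 divided by the square root of n.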
Power
the ability of the test to correctly reject a null hypothesis when it is false; also known as 1-β
Type I Error
when we reject the null hypothesis when it is true; also known as alpha (𝛼) or Level of Significance. Say there is a difference when there actually is not
Type II Error
when we fail to reject a null hypothesis when it is false; also known as beta (β). Say there is no difference when there actually is
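Power, Type I error, and Type II error can be estimated by simulation. The sketch below assumes a two-sided z-test of "mean = 0" with a known standard deviation of 1; the function name, sample size, and effect of 0.5 are hypothetical choices, not values from the course material.

```python
import random
from statistics import NormalDist, mean

def rejection_rate(true_mean, n=30, alpha=0.05, sims=2000, seed=1):
    """Share of simulated studies that reject H0: mean = 0 (known SD = 1)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(sims):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = mean(sample) * n ** 0.5  # (sample mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / sims

type_1_rate = rejection_rate(true_mean=0.0)  # null is true: rejections are Type I errors
power = rejection_rate(true_mean=0.5)        # null is false: rejections are correct
print("Type I error rate:", type_1_rate)
print("Power (1 - beta):", power, "Type II error rate (beta):", round(1 - power, 3))
```

When the null is true the rejection rate should land near alpha; when it is false, the rejection rate is the power and the remainder is beta.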
Null hypothesis
No difference of significance due to treatment (only due to randomness)
P-value less than or equal to alpha
≤ 0.05 or 0.01. Statistically significant; reject the null hypothesis
P-value greater than alpha
> 0.05 or 0.01. Not statistically significant; suggests the difference in the treatment group may be due to random sampling error, so fail to reject the null hypothesis.
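One way to see where a p-value comes from is a permutation test, which asks how often a difference at least as large as the observed one appears when group labels are shuffled at random (that is, when the null hypothesis is true). This is a sketch of that one technique, not necessarily the test used in any given study; the function names and the scores are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

def decide(p_value, alpha=0.05):
    """Apply the decision rule: p <= alpha rejects the null."""
    return "reject the null" if p_value <= alpha else "fail to reject the null"

# Hypothetical scores for a treatment group and a control group
treatment = [14, 15, 13, 16, 15, 17]
control = [12, 13, 11, 14, 12, 13]
p = permutation_p_value(treatment, control)
print(round(p, 4), decide(p))
```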
Statistically Significant vs. Clinically Relevant
By having too big a sample, we are making it very easy to find a statistically significant difference between our groups (i.e., even a very small effect becomes detectable). This statistical significance is not necessarily clinically or practically useful
So, statistical significance alone provides limited insight. As best as possible, researchers should interpret statistical significance relative to confidence intervals and effect sizes to understand the full context of the result better.
Effect Sizes
Communicates STRENGTH by telling us the magnitude of the experimental effect, or relationship (i.e., correlation), or odds between variables
Answers the question: “How big (or small) was the effect or relationship?”
Cohen’s d or Hedges’ g (Standardized Mean Difference)
Both estimate the magnitude of the standardized difference between two group means. Hedges’ g, however, is more appropriate for small sample sizes because it provides a bias correction
Scores are in standard deviation units and are not bounded (values beyond -1 or 1 are possible), with 0 indicating no effect
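A minimal sketch of both standardized mean differences, assuming the usual pooled-standard-deviation form of Cohen’s d and the common approximate small-sample correction for Hedges’ g. The function names and the scores are hypothetical.

```python
from statistics import mean, variance

def cohens_d(a, b):
    """Difference in group means divided by the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5

def hedges_g(a, b):
    """Cohen's d multiplied by the approximate correction 1 - 3 / (4N - 9)."""
    n = len(a) + len(b)
    correction = 1 - 3 / (4 * n - 9)
    return cohens_d(a, b) * correction

# Hypothetical scores for two small groups
group_1 = [2, 4, 6]
group_2 = [1, 3, 5]
print("d =", cohens_d(group_1, group_2))
print("g =", round(hedges_g(group_1, group_2), 3))
```

With small groups the correction pulls g noticeably below d; as the combined sample size grows, the two values converge.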
Pearson’s r or Point-Biserial (Correlation/Association)
Pearson’s r measures the strength of a linear relationship between two continuous variables.
Scores range from -1 to 1 with 0 indicating no effect.
Odds Ratio (Proportion)
Scores range from 0 to infinity, with 1 indicating no effect
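A short sketch of how an odds ratio is computed from a 2x2 table. The cell labels and counts are hypothetical: a and b are the exposed group with and without the outcome, c and d the unexposed group with and without the outcome.

```python
def odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    return (a / b) / (c / d)

# Hypothetical counts: 30 of 40 exposed and 10 of 40 unexposed had the outcome
print(odds_ratio(30, 10, 10, 30))

# Equal odds in both groups gives the no-effect value of 1
print(odds_ratio(10, 10, 10, 10))
```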
Confidence Interval
Communicates PRECISION by providing a range of plausible values for the population
Refers to the range of expected values of your estimate (usually your effect size) if you re-ran your experiment with a different sample
Researchers commonly use confidence levels of either 95% or 99%. CIs however are calculated around the effect size and based upon three factors: sample size, response distribution, and population size.
Typically written as the confidence percentage followed by the estimated lower and upper limits of the parameter. For example: an odds ratio of 7.5 might have a confidence interval associated with it such as “95% CI [5.32, 10.45].”
If the CI does not include the no-effect value (0 for a standardized mean difference, 0 for a correlation, 1 for an odds ratio), can assume there is significance
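As an illustration of a confidence interval around an effect size, the sketch below computes an approximate CI for an odds ratio using the standard log-odds (Woolf) method. The function name and the 2x2 counts are hypothetical, and other CI methods exist.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def odds_ratio_ci(a, b, c, d, confidence=0.95):
    """Odds ratio with an approximate CI computed on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 30 of 40 exposed and 10 of 40 unexposed had the outcome
estimate, low, high = odds_ratio_ci(30, 10, 10, 30)
print(f"OR = {estimate:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
print("excludes 1:", not (low <= 1 <= high))
```

Because 1 is the no-effect value for an odds ratio, an interval that excludes 1 points to a statistically significant association at the chosen confidence level.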
Grounded theory
What theory uses interviews and focus groups to develop a theory based in fieldwork?
Phenomenology
What is the meaning, structure, and essence of the lived experience for people/group?
Ethnography
What is the culture of this group? Researcher inserts themselves into the population
Confirmability
Findings are shaped by subjects, not researcher bias, motivation, or interest. Matches with internal validity
Transferability
Findings have applicability in other contexts. Matches with external validity
Dependability
Findings are consistent and able to be repeated. Matches with construct validity
Credibility
Results are truthful and trustworthy. Has no match
Interviews
Focus on subject’s POV to understand their experiences
Focus Groups
Involves more than one, usually at least four interviewees
Individuals discuss experiences as members of a group to encourage more discussion
Min: 4 Max: 10 Goal: 6-8
Henry Beecher “Journal of Medicine”
Forced research community to look at ethical research
Tuskegee experiment
Broke public’s trust of human research
National Research Act
Established commission to identify basic ethical principles necessary for human subject research and the IRB
Respect for persons
Informed consent; privacy
Beneficence
Risk/benefit assessment; minimize risks
Justice
Selection of subjects (protect susceptible populations)
IRB
Approves experiments on human research subjects
Exempt review
Lowest possible level of risk; anonymity is retained
Expedited review
Minimal level of risk (risk is not any greater than it would be in activities of daily life); anonymity is not retained
Full board/committee review
More than minimal level of risk; anonymity is not retained
Informed Consent
Protects human subjects and provides them with all information necessary to decide whether or not to participate in a study
Needed for every experiment