Introduction to Statistics - Flashcards
Independent and Dependent Variables
- In the sports psychology example, students were randomly assigned to a new program group or a no-program group.
- Independent variable: the manipulated variable that defines the groups. Answer from the slide: a) the program type (new or none).
- The dependent variable: the outcome measured to assess the effect of the manipulation. Answer from the slide: c) playing ability.
- A potential confounder in the scenario: students were later reported to have begun eating more fruits and vegetables, which could also influence playing ability. This shows why confounding variables must be identified in experimental design, and why randomization and control groups are used to isolate the effect of the independent variable.
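Random assignment, as in the program vs no-program example, can be sketched in a few lines of Python. This is only an illustration: the roster, the group names, and the helper `randomly_assign` are hypothetical and do not come from the slides.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups of (near) equal size."""
    rng = random.Random(seed)      # seeded generator so the split is reproducible
    shuffled = list(participants)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"program": shuffled[:half], "no_program": shuffled[half:]}


# Hypothetical roster of 20 students (not data from the slides).
students = [f"student_{i}" for i in range(1, 21)]
groups = randomly_assign(students, seed=42)

print(len(groups["program"]), len(groups["no_program"]))
```

Because chance alone decides who lands in which group, pre-existing differences between students tend to balance out across groups. A change that arises after assignment, such as the diet change described above, still has to be monitored separately.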
Measurement Scales and Likert Scales
- A Likert scale (Strongly Disagree to Strongly Agree) measures attitudes or level of agreement.
- Scale of measurement: ordinal (distinct ordered categories; intervals between categories are not guaranteed to be equal).
- Evidence from the slides:
- “Strongly Disagree; Disagree; Neutral; Agree; Strongly Agree” with values 1–5, described as ordinal.
- Distinctions to remember:
- Ordinal: ordered categories, but not equal intervals (e.g., Likert).
- Nominal: categories with no inherent order (e.g., gender, political party).
- Interval: numeric scale with equal intervals but no true zero (e.g., temperature in Celsius or Fahrenheit; in the slides, SAT subtest scores are treated as interval).
- Ratio: numeric scale with equal intervals and a true zero (e.g., time, counts, heights).
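One practical consequence of treating Likert responses as ordinal is that order-based summaries (median, mode) are safer than the mean. The sketch below assumes the 1 to 5 coding described above; the list of responses is made up for illustration.

```python
from statistics import median_low, mode

# Coding from the notes: 1 = Strongly Disagree ... 5 = Strongly Agree.
LIKERT_LABELS = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly Agree",
}

# Hypothetical responses from ten people (not data from the slides).
responses = [4, 5, 3, 4, 2, 4, 5, 1, 4, 3]

# median_low always returns a value that actually occurs in the data,
# so the result maps back to a real category label.
typical = median_low(responses)
most_common = mode(responses)

print(LIKERT_LABELS[typical], LIKERT_LABELS[most_common])
```

Both summaries rely only on the order or the frequency of the categories, so neither assumes that the gap between "Agree" and "Strongly Agree" equals the gap between "Neutral" and "Agree".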
Parameter vs Statistic; Population vs Sample; Sampling Error
Population mean is a parameter, denoted by $\mu$.
Sample mean is a statistic, denoted by $\bar{x}$ (some textbooks write $M$).
When you compute the average GRE score for the entire cohort (530), that is a population parameter if the cohort is the whole population of interest; when you compute the average from a sample (e.g., 20 students) and report 525, that is a statistic.
Concept: sampling error arises because a sample may not perfectly represent the population.
- Example from slides: Population average GRE score $\mu = 530$; sample average score $\bar{x} = 525$. The difference, $\bar{x} - \mu = 525 - 530 = -5$ (a 5-point gap), represents sampling error.
Notation used in estimation:
- Population parameter: $\mu$ (or other parameters, such as $\sigma$ for the population standard deviation)
- Sample statistic: $\bar{x}$ and $s$ (sample mean and sample standard deviation)
Why it matters: inferential statistics use samples to make inferences about population parameters.
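Sampling error can be made concrete with a short simulation. The first call uses the GRE figures from the notes (530 and 525); the simulated population below it is entirely hypothetical, and the names `sampling_error`, `population`, and `sample` are chosen here for illustration.

```python
import random
from statistics import mean


def sampling_error(sample_stat, population_param):
    """Discrepancy between a sample statistic and the parameter it estimates."""
    return sample_stat - population_param


# GRE example from the notes: sample mean 525, population mean 530.
print(sampling_error(525, 530))

# Hypothetical population of 500 scores (made up, not from the slides).
rng = random.Random(0)
population = [rng.randint(400, 660) for _ in range(500)]
mu = mean(population)                # parameter: computed from everyone

sample = rng.sample(population, 20)  # 20 students drawn without replacement
x_bar = mean(sample)                 # statistic: computed from the sample only

print(f"mu = {mu:.1f}, x_bar = {x_bar:.1f}, error = {sampling_error(x_bar, mu):.1f}")
```

Rerunning with a different seed gives a different sample mean and therefore a different error, which is the point: the statistic varies from sample to sample while the parameter stays fixed.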
Continuous vs Discrete Variables; Examples
- Continuous variable: numeric values that can take any value within a range (including decimals).
- Example: The number of minutes it takes to solve a math problem. Since time can be measured with fractions (e.g., 12.37 minutes), it is continuous.
- Also noted: minutes can be broken into seconds, milliseconds, etc.
- Discrete variable: numeric values that typically take whole-number values (counts).
- Example: The number of times a person orders via food delivery apps per week (1, 2, 3, …).
- In one slide, the following was used to illustrate a continuous variable: minutes to solve a math problem; in another slide, the number of times a person orders (a discrete count).
Scales of Measurement with Examples
- Number of lever presses per minute
- Scale: ratio (true zero exists: 0 presses means none; equal intervals; meaningful ratios).
- Order of finish in a marathon
- Scale: ordinal (order matters; intervals between places are not necessarily equal).
- Political party affiliation
- Scale: nominal (categories with no inherent order).
Practice Problem Set — Quick Answers and Rationale
- Practice Problem 1 (Texting habits; population, sample, method)
- Population: undergraduate students in Canada
- Sample: 100 students from York University who participated
- Research method: Non-experimental (observational) because there is no manipulation of variables; the researcher observes texting behavior as it occurs.
- Practice Problem 2 (Scales of measurement)
- Number of lever presses per minute → Ratio
- Order of finish in a marathon → Ordinal
- SAT math subtest scores (200 to 800) → Interval
- Political party affiliation → Nominal
- Practice Problem 3 (IQ by degree type)
- Population: All US college students seeking liberal arts or professional degrees
- Sample: 1,200 liberal arts majors and 1,000 professional majors (2,200 total)
- Dependent variable: Intelligence as measured by IQ score
- Independent variable: Degree type (liberal arts vs professional)
- Scale for dependent variable: Interval
- Scale for independent variable: Nominal
- Practice Problem 4 (Noise and memory recall)
- Study type: Experimental (random assignment to conditions: background noise vs silence)
- Independent variable: Presence of background noise (yes/no)
- Dependent variable: Memory recall (number of words correctly recalled)
- Research question: Does background noise during study impair memory recall?
- Research hypothesis: Participants in the background-noise condition will recall fewer words correctly than those in the silent condition.
Practice Problem Details and Solutions (Expanded)
- Practice Problem 1 – Population, Sample, Method
- Population: undergraduate students in Canada.
- Sample: 100 York University students who participated.
- Method: Non-experimental (observational). Rationale: Researchers observe texting habits without manipulating any variable.
- Practice Problem 2 – Scale identification (as listed above)
- Each variable is classified into one of the four scales: nominal, ordinal, interval, or ratio.
- Practice Problem 3 – IQ study details
- Population: All US college students seeking liberal arts or professional degrees.
- Sample: 2,200 students (1,200 liberal arts majors; 1,000 professional majors).
- Dependent variable: IQ score (interval scale).
- Independent variable: Degree type (liberal arts vs professional; nominal).
- Practice Problem 4 – Experimental design and hypotheses
- Experimental study: Yes (participants were randomly assigned to background noise vs silence).
- Independent variable: Presence of noise.
- Dependent variable: Memory recall (words recalled).
- Research question and hypothesis provided in slides; paraphrased above.
Additional Notes: Interpretation and Real-World Relevance
- Confounding variables: In the program vs no-program example, diet changes (fruits/vegetables) could confound the effect on playing ability if not controlled.
- Random assignment: Strengthens causal inference by helping to equalize confounders across groups.
- Sample vs population in practice: Use sample statistics (e.g., $\bar{x}$) to estimate population parameters (e.g., $\mu$), keeping sampling error in mind.
- Scale selection matters for analysis: The choice between parametric and nonparametric tests depends on the scale (e.g., ratio/interval vs ordinal/nominal) and on distribution assumptions.
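How the scale of measurement steers the analysis can be sketched with a common textbook rule of thumb: mode for nominal data, median for ordinal data, mean for interval or ratio data. The `Scale` enum, the `central_tendency` helper, and all data values below are hypothetical; this is a simplified convention, not a rule taken from the slides.

```python
from enum import Enum
from statistics import mean, median_low, mode


class Scale(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


def central_tendency(values, scale):
    """Pick a summary that uses only the properties the scale guarantees."""
    if scale is Scale.NOMINAL:
        return mode(values)        # categories only: report the most frequent one
    if scale is Scale.ORDINAL:
        return median_low(values)  # order only: report a middle value from the data
    return mean(values)            # equal intervals: arithmetic mean is meaningful


# Hypothetical data for the three practice-problem variables.
party = ["A", "B", "A", "C"]          # political party affiliation (nominal)
finish_order = [1, 2, 3, 4, 5]        # order of finish in a marathon (ordinal)
presses_per_minute = [12, 15, 9, 20]  # lever presses per minute (ratio)

print(central_tendency(party, Scale.NOMINAL))
print(central_tendency(finish_order, Scale.ORDINAL))
print(central_tendency(presses_per_minute, Scale.RATIO))
```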
Quick References to Key Formulas and Definitions
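- Population mean: $\mu$
- Sample mean: $\bar{x}$
- Sampling error: $\bar{x} - \mu$ (the discrepancy between a sample statistic and the population parameter it estimates)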
- Likert scale values: 1, 2, 3, 4, 5 (ordinal)
- Distinctions among scales:
- Nominal: categories with no intrinsic order
- Ordinal: ordered categories, unequal intervals possible
- Interval: equal intervals, no true zero
- Ratio: equal intervals, true zero
- Common examples from slides:
- Average hours spent studying per day: hours (sample statistic)
- GRE cohort: hypothetical population mean $\mu = 530$; example sample mean $\bar{x} = 525$
- IQ scores: lowest possible score mentioned as 40 (used to illustrate scale)
Practical Takeaways
- Always identify the independent and dependent variables in any study.
- Check for potential confounding variables that could bias conclusions.
- Determine the scale of measurement before choosing appropriate statistical analyses.
- Distinguish between population parameters and sample statistics, and recognize sampling error as an inherent part of using samples to infer about populations.