Statistics Lecture Review

Flashcards covering key statistical terms, concepts, and methodologies based on the provided lecture notes.

39 Terms

1
Statistic

A numerical summary of a sample.

2
Non-response bias

Occurs when a large share of the people chosen for the sample do not respond and the non-respondents differ systematically from those who do, distorting the survey results.

3
Undercoverage bias

Occurs when some members of the population are left out of, or inadequately represented in, the sample, typically because the sampling frame or method excludes them.

4
Cluster sample

A sampling method where the population is divided into groups (clusters), some clusters are randomly selected, and all individuals within the chosen clusters are surveyed.
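
A minimal sketch of cluster sampling, assuming a hypothetical population of four classrooms (the names, sizes, and seed are invented for illustration): whole clusters are chosen at random, and everyone in a chosen cluster is surveyed.

```python
import random

# Hypothetical population: each classroom is one cluster.
clusters = {
    "class_A": ["A1", "A2", "A3"],
    "class_B": ["B1", "B2", "B3"],
    "class_C": ["C1", "C2", "C3"],
    "class_D": ["D1", "D2", "D3"],
}

rng = random.Random(1)  # seeded only so the sketch is reproducible

# Step 1: randomly select whole clusters, not individuals.
chosen_clusters = rng.sample(sorted(clusters), k=2)

# Step 2: survey every individual in each chosen cluster.
cluster_sample = [person for name in chosen_clusters for person in clusters[name]]

print(chosen_clusters, cluster_sample)
```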

5
Residual

The difference between an observed value and the value predicted by a regression model (Observed Y - Predicted Y).
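
As an illustration, the sketch below fits a least-squares line to a small made-up data set (hours studied and exam score are assumptions, not values from the lecture) and computes each residual as observed minus predicted.

```python
# Hypothetical data: hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(x, y)            # intercept 46, slope 6 for this data
predicted = [intercept + slope * xi for xi in x]

# Residual = observed y - predicted y
residuals = [yi - pi for yi, pi in zip(y, predicted)]
print(residuals)
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the calculation.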

6
Outlier

A data point that deviates significantly from other observations.

7
Influential point

A data point whose removal causes a substantial change in the regression model (e.g., slope or intercept).
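
One way to see influence is to fit the line with and without the suspect point and compare slopes. The data below are invented for illustration; the last point has an unusual x-value and does not follow the pattern of the others.

```python
def fit_slope(x, y):
    """Return the least-squares slope for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data: the first five points lie exactly on y = 2x.
x = [1, 2, 3, 4, 5, 15]
y = [2, 4, 6, 8, 10, 5]

slope_with = fit_slope(x, y)               # all six points
slope_without = fit_slope(x[:-1], y[:-1])  # suspect point removed

print(round(slope_with, 3), round(slope_without, 3))
```

Here the slope drops from 2 to roughly 0.08 when the last point is included, so that point would be called influential.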

8
Q3 (Third Quartile)

The value below which 75% of the data falls; the 75th percentile.

9
Treatment (in experiment)

A specific condition applied to the experimental units being studied.

10
Correlation coefficient

A measure of the strength and direction of a linear relationship between two quantitative variables, denoted by 'r'.

11
Coefficient of determination (R-squared)

The proportion of the variance in the dependent variable that can be predicted from the independent variable(s). In simple linear regression (one explanatory variable), it equals the square of the correlation coefficient (r^2).
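
For a regression with a single explanatory variable, the two routes to R-squared can be checked against each other. The sketch below uses made-up data and computes r directly, then computes R-squared as 1 - SSE/SST from the fitted line.

```python
import math

# Hypothetical data: one explanatory variable (x), one response (y).
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)   # total sum of squares (SST)

# Correlation coefficient r
r = sxy / math.sqrt(sxx * syy)

# R-squared from the fitted least-squares line: 1 - SSE / SST
slope = sxy / sxx
intercept = mean_y - slope * mean_x
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / syy

print(round(r, 4), round(r_squared, 4))
```

With several explanatory variables, R-squared is still 1 - SSE/SST, but it is no longer the square of any single pairwise correlation.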

12
Slope (in regression)

In a regression equation, it represents the estimated average change in the response variable for every one-unit increase in the explanatory variable.

13
Randomization (in experiment)

The process of assigning subjects to different treatment groups purely by chance, to reduce bias and make the groups comparable on average.
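
A minimal sketch of random assignment, assuming ten hypothetical subject IDs and two groups of equal size: shuffle the list, then split it.

```python
import random

subjects = [f"S{i}" for i in range(1, 11)]  # hypothetical subject IDs S1..S10

rng = random.Random(42)  # seeded only so the sketch is reproducible
shuffled = subjects[:]   # copy so the original order is preserved
rng.shuffle(shuffled)

half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]

print(treatment_group, control_group)
```

Because chance alone decides who lands in each group, no characteristic of the subjects (known or unknown) can systematically favor one group.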

14
Right-skewed distribution

A distribution where the tail on the right side is longer than the left side, often resulting in the mean being greater than the median.
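
A quick numerical illustration with invented incomes (in thousands): one very large value stretches the right tail and pulls the mean above the median.

```python
import statistics

# Hypothetical incomes in thousands; 200 creates a long right tail.
incomes = [30, 32, 35, 36, 40, 42, 45, 200]

mean_income = statistics.mean(incomes)      # 57.5
median_income = statistics.median(incomes)  # 38.0

print(mean_income, median_income)
```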

15
Block design

An experimental design where subjects are divided into groups (blocks) based on a shared characteristic, and treatments are randomly assigned within each block.
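
A sketch of a randomized block design, assuming two hypothetical blocks defined by age group and two treatments: the random assignment is carried out separately inside each block.

```python
import random

# Hypothetical blocks: subjects grouped by a shared characteristic (age group).
blocks = {
    "under_40": ["A1", "A2", "A3", "A4"],
    "40_plus": ["B1", "B2", "B3", "B4"],
}

rng = random.Random(0)  # seeded only so the sketch is reproducible
assignment = {}

for block_name, members in blocks.items():
    order = members[:]
    rng.shuffle(order)            # randomize within this block only
    half = len(order) // 2
    for subject in order[:half]:
        assignment[subject] = "treatment"
    for subject in order[half:]:
        assignment[subject] = "control"

print(assignment)
```

Each block contributes equally to both groups, so the blocking characteristic cannot be mixed up with the treatment effect.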

16
IQR (Interquartile Range)

A measure of statistical dispersion, calculated as the difference between the third quartile (Q3) and the first quartile (Q1), representing the middle 50% of data.

17
Double-blind study

An experiment where neither the participants nor the researchers administering the treatments know who is receiving the actual treatment and who is receiving a placebo.

18
Categorical data

Data that can be divided into groups or categories, rather than measured numerically (e.g., color, gender).

19
First quartile (Q1)

The value below which 25% of the data falls; the 25th percentile.

20
Sensitive question

A survey question that may elicit dishonest responses due to personal or social reasons, leading to response bias.

21
Z-score

A measure of how many standard deviations an observation lies above (positive) or below (negative) the mean: z = (value - mean) / standard deviation.
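
A short sketch of the calculation on made-up data. It assumes the data are the whole population and so uses the population standard deviation (`statistics.pstdev`); with a sample, `statistics.stdev` would be the usual choice.

```python
import statistics

def z_score(value, data):
    """Standard deviations between value and the mean of data."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # assumption: data is the full population
    return (value - mean) / sd

# Hypothetical data with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, data))  # 2.0: two standard deviations above the mean
print(z_score(5, data))  # 0.0: exactly at the mean
```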

22
Residual plot

A scatterplot of the residuals against the explanatory variable, used to check the appropriateness of a linear model.

23
High leverage point

A data point that has an unusual x-value (explanatory variable value) compared to the rest of the data.

24
Symmetric distribution

A distribution whose left and right sides are approximately mirror images around the center, so the mean and median are approximately equal.

25
Range

The difference between the maximum and minimum values in a dataset.

26
Left-skewed distribution

A distribution where the tail on the left side is longer than the right side, often resulting in the mean being smaller than the median.

27
Stratified sample

A sampling method where the population is divided into homogeneous subgroups (strata), and then a simple random sample is drawn from each subgroup.
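
A minimal sketch of stratified sampling, assuming a hypothetical population split into two strata by class year and a fixed sample size of two per stratum (both are illustrative choices): a simple random sample is drawn inside every stratum.

```python
import random

# Hypothetical strata: every member of the population belongs to exactly one.
strata = {
    "freshmen": ["F1", "F2", "F3", "F4", "F5"],
    "seniors": ["S1", "S2", "S3", "S4", "S5"],
}

rng = random.Random(7)  # seeded only so the sketch is reproducible
per_stratum = 2         # assumed sample size drawn from each stratum

# Draw a simple random sample from every stratum.
stratified_sample = {
    name: rng.sample(members, k=per_stratum) for name, members in strata.items()
}

print(stratified_sample)
```

Unlike a cluster sample, every stratum is represented, and only some individuals within each one are selected.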

28
Observational study

A study where researchers observe and measure characteristics of subjects without attempting to influence or manipulate any variables.

29
Experiment

A study where researchers actively impose some treatment on subjects in order to observe their responses.

30
Statistically significant

A result that would be unlikely to occur by random chance alone if there were no real effect, suggesting a real effect or relationship.

31
Explanatory variable

A variable that is thought to explain or cause changes in another variable (the response variable); also known as the independent variable.

32
Confounding variable

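A variable, often unmeasured or not accounted for, that is related to both the explanatory and response variables, so that its effect cannot be separated from that of the explanatory variable and can produce a spurious association.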

33
Response variable

The variable that measures an outcome of interest; also known as the dependent variable.

34
Predictive power

How well a model or variable can forecast future outcomes, often assessed by R-squared or correlation.

35
Five-number summary

A set of five values that describe the distribution of data: minimum, first quartile (Q1), median, third quartile (Q3), and maximum.
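
The five values can be computed as in the sketch below, on made-up data. Textbooks and software use slightly different quartile conventions; this sketch assumes the "inclusive" method of Python's `statistics.quantiles`, which may differ from the convention used in the lecture.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # Assumption: 'inclusive' quartile convention; other conventions
    # can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

# Hypothetical data
data = [1, 3, 5, 7, 9, 11, 13]

print(five_number_summary(data))  # (1, 4.0, 7.0, 10.0, 13)
```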

36
Lower fence for outliers

A threshold used to identify potential outliers, typically calculated as Q1 - (1.5 * IQR).

37
Upper fence for outliers

A threshold used to identify potential outliers, typically calculated as Q3 + (1.5 * IQR).
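
The lower and upper fences can be applied together as in the sketch below. The data are invented, and the quartiles again assume the "inclusive" method of `statistics.quantiles`; a different quartile convention would shift the fences slightly.

```python
import statistics

def outlier_fences(data):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    # Assumption: 'inclusive' quartile convention.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data with one suspiciously large value.
data = [1, 3, 5, 7, 9, 11, 13, 40]

lower, upper = outlier_fences(data)  # Q1 = 4.5, Q3 = 11.5, IQR = 7
outliers = [v for v in data if v < lower or v > upper]

print(lower, upper, outliers)  # -6.0 22.0 [40]
```

Values outside the fences are flagged as potential outliers; whether to keep or drop them is a separate judgment.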

38
Experimental unit

The smallest unit to which a treatment is applied in an experiment.

39
Blinding (in experiment)

The practice of keeping subjects and/or researchers unaware of the treatment assignments, to prevent bias from expectations.