Statistics Lecture Notes Flashcards

Description and Tags

Vocabulary flashcards for Statistics Lecture Notes

55 Terms

1

Density Curve

A smooth curve that models the overall shape of a population's distribution. It lies on or above the horizontal axis and has a total area of exactly 1 beneath it.

2

Uniform Density Curve

A flat (rectangular) density curve with constant height. The area over any interval equals width times height, and the total area is 1.

3

Area under the Density Curve

The proportion, or percent, of all observations that fall within a range.

4

Empirical Rule

For data that follow an approximately normal distribution, about 68% of the data lie within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.
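
The 68%, 95%, and 99.7% figures can be checked against an exact normal model. The following is a minimal sketch using Python's standard-library `NormalDist`; the helper name `proportion_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal model: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within = {k: round(proportion_within(k), 4) for k in (1, 2, 3)}
print(within)  # {1: 0.6827, 2: 0.9545, 3: 0.9973}
```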

5

Shape (SoCs)

Describes the overall shape of a distribution (approx. normal, skewed, symmetric, unimodal, bimodal, uniform).

6

Outliers (SoCs)

Data values that are far away from the rest of the data, identified using a modified boxplot or the 1.5 x IQR rule: any value below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) is an outlier.
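
As a rough illustration of the 1.5 x IQR rule, the sketch below flags values outside the fences. The function names and the sample data are invented for this example, and quartiles are computed with `statistics.quantiles` (default "exclusive" method); other quartile conventions can give slightly different fences.

```python
from statistics import quantiles

def outlier_fences(data):
    """Return the (lower, upper) fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _median, q3 = quantiles(data, n=4)  # the three quartile cut points
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data):
    """Return the values that fall outside the fences."""
    low, high = outlier_fences(data)
    return [x for x in data if x < low or x > high]

scores = [2, 4, 5, 5, 6, 7, 8, 9, 30]  # made-up data
print(outlier_fences(scores))  # (-1.5, 14.5)
print(find_outliers(scores))   # [30]
```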

7

Center (SoCs)

A 'typical' value for the data; can be the median (Q2) or the mean (x-bar for a sample, μ for a population).

8

Spread (SoCs)

Tells how much the data varies; measured by range, IQR (Q3-Q1), standard deviation (S or σ), or variance (S^2 or σ^2).

9

Boxplot

Displays the 5-number summary of a dataset (min, Q1, median, Q3, max).

10

pth Percentile

The data value at or below which p% of the individual data values fall.
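
To make the "less than or equal to" idea concrete, here is a small sketch that finds the percentile of a given value. The function name and data are hypothetical.

```python
def percentile_of(value, data):
    """Percent of data values that are less than or equal to `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

test_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # made-up data
print(percentile_of(80, test_scores))  # 60.0, so 80 is at the 60th percentile
```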

11

Z-score

A standardized score that tells how many standard deviations a data value is from the mean, and in which direction: z = (value - mean) / standard deviation.
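
A z-score is the distance from the mean divided by the standard deviation. The sketch below uses invented numbers (a mean of 70 and a standard deviation of 10).

```python
def z_score(value, mean, sd):
    """Number of standard deviations `value` lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(85, 70, 10))  # 1.5  (1.5 standard deviations above the mean)
print(z_score(60, 70, 10))  # -1.0 (1 standard deviation below the mean)
```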

12

LSRL (Least Squares Regression Line)

The line y-hat = a + bx that minimizes the sum of the squared residuals. It predicts the response variable (y-hat) from the explanatory variable x.

13

Residual

Gives how far away the actual y-value is from the predicted y-value (residual = actual - predicted).

14

Extrapolation

When the model is used to make predictions for x-values outside the range of x-values in the dataset; such predictions are often unreliable.

15

Coefficient of determination (r^2)

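The proportion of the variability in y that is accounted for by the LSRL. Interpretation: "About (r^2 as a percent)% of the variability in y is accounted for by the LSRL."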

16

y-intercept (a)

The predicted y-value when x=0.

17

S (standard deviation of the residuals)

The actual y-value is typically about S away from the value predicted by the LSRL.
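
The regression quantities on the preceding cards (the LSRL, residuals, r^2, and S) can be computed by hand for a small dataset. This is a sketch under assumptions: the data are made up, the helper `fit_lsrl` is a hypothetical name, and S is computed with n - 2 in the denominator, the usual convention for simple linear regression.

```python
from math import sqrt
from statistics import mean

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # y-intercept: predicted y when x = 0
    return a, b

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values

a, b = fit_lsrl(xs, ys)
predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # actual - predicted

sse = sum(e ** 2 for e in residuals)         # sum of squared residuals
sst = sum((y - mean(ys)) ** 2 for y in ys)   # total variability in y
r_squared = 1 - sse / sst                    # coefficient of determination
s = sqrt(sse / (len(xs) - 2))                # standard deviation of the residuals

print(round(a, 2), round(b, 2))          # 2.2 0.6
print(round(r_squared, 2), round(s, 3))  # 0.6 0.894
```

Here about 60% of the variability in y is accounted for by the line, and actual y-values are typically about 0.89 away from their predicted values.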

18

Linear (Relationships Between Two Numerical variables)

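Assessed with the scatterplot or residual plot: a linear relationship shows a roughly straight-line pattern in the scatterplot and no leftover pattern (random scatter) in the residual plot.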

19

Strength (Relationships Between Two Numerical variables)

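Assessed with the scatterplot or the correlation coefficient r, which ranges from -1 to 1. Values of r near -1 or 1 indicate a strong linear relationship; values near 0 indicate a weak one.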

20

Direction (Relationships Between Two Numerical variables)

Positive means as x increases, y tends to increase. Negative means as x increases, y tends to decrease. The sign of r also indicates direction.

21

Center (Combining Random Variables)

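The mean (and other measures of center). Means combine directly: the mean of X + Y is μ_X + μ_Y, and the mean of X - Y is μ_X - μ_Y.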

22

Spread (Combining Random Variables)

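The standard deviation (and other measures of spread). Standard deviations do not add; for independent X and Y, the variances add for both X + Y and X - Y, so the standard deviation is the square root of σ_X^2 + σ_Y^2.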
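
When two independent random variables are combined, the means add (or subtract) while the variances always add. The sketch below assumes independence and uses invented means and standard deviations; `combine_independent` is a hypothetical helper name.

```python
from math import sqrt

def combine_independent(mean_x, sd_x, mean_y, sd_y, operation="sum"):
    """Mean and standard deviation of X + Y or X - Y for independent X and Y."""
    if operation == "sum":
        mean = mean_x + mean_y
    elif operation == "difference":
        mean = mean_x - mean_y
    else:
        raise ValueError("operation must be 'sum' or 'difference'")
    # Variances add for both sums and differences; standard deviations do not.
    sd = sqrt(sd_x ** 2 + sd_y ** 2)
    return mean, sd

print(combine_independent(10, 3, 4, 4, "sum"))         # (14, 5.0)
print(combine_independent(10, 3, 4, 4, "difference"))  # (6, 5.0)
```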

23

Observational Study

The researcher does not impose a treatment; thus, no causation can be claimed.

24

Experimental Study

The researcher imposes a treatment on experimental units.

25

Explanatory variable

The variable that helps predict or explain the response; in an experiment, the treatment (the 'cause').

26

Response variable

The outcome being measured or the 'effect'.

27

Confounding variable(s)

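Other factors that are related to the explanatory variable and also affect the response, so their effects cannot be separated from the explanatory variable's effect; they need to be controlled.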

28

Completely randomized design

Randomly assign each experimental unit (EU) a treatment.
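
One way to carry out the random assignment is to shuffle the experimental units and deal them out to the treatments. This is only a sketch: the function name, unit labels, and treatment names are invented, and dealing units in turn (to keep group sizes as equal as possible) is a design choice, not a requirement.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Randomly assign every unit to one treatment, with near-equal group sizes."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)      # random order, like drawing names from a hat
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units to the treatments in turn.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

groups = completely_randomized(range(1, 13), ["A", "B", "C"], seed=1)
for treatment, assigned in groups.items():
    print(treatment, sorted(assigned))
```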

29

Randomized block design

Separate EUs into blocks of similar units, then randomly assign each EU a treatment within each block. When similar EUs are matched into blocks of size 2, this is the paired form of a matched pairs design.

30

Matched pairs design

Each EU receives every treatment, but the order of treatment is randomized.

31

Population

A whole group of individuals you want to know about.

32

Sample

A subset of individuals you collect data from.

33

Parameters

Numbers that describe a population (for example, μ, σ, p).

34

Statistics

Numbers that describe a sample (for example, x-bar, s, p-hat).

35

Selection Bias/undercoverage

Bias from the way the sample is chosen, leading to an unrepresentative sample. Undercoverage occurs when part of the population is left out or is less likely to be chosen.

36

Response Bias

Bias from the way data are collected from the sample, leading to inaccurate responses.

37

Wording Bias

The wording of the survey leads to bias.

38

Measurement Bias

The tool used to collect data leads to bias.

39

Nonresponse Bias/Voluntary response bias

Nonresponse: not every individual chosen for the sample provides data. Voluntary response: individuals decide for themselves whether to be part of the sample. Either may lead to bias.

40

Simple Random Sample (SRS)

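A sample chosen so that every group of n individuals has an equal chance of being selected. Methods: names from a hat, a random digit table, or randInt().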

41

Stratified Random Sample

Divide population into strata, then randomly pick some from each stratum.

42

Cluster Random Sample

Divide population into clusters, then randomly pick whole clusters.

43

Systematic Random Sample

Start at a random place, then pick every kth individual.
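
The four random sampling methods (simple, stratified, cluster, systematic) differ only in how the random choice is made. The following sketch contrasts them on a made-up population of 100 numbered individuals; all function names and the grouping into two halves are assumptions for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Every group of n individuals is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Randomly pick some individuals from every stratum."""
    return [x for members in strata.values()
            for x in rng.sample(members, n_per_stratum)]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep everyone in them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [x for name in chosen for x in clusters[name]]

def systematic_sample(population, k, rng):
    """Start at a random place among the first k, then take every kth individual."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(42)            # seeded so the example is reproducible
population = list(range(1, 101))   # individuals numbered 1 to 100
groups = {"first_half": population[:50], "second_half": population[50:]}

srs = simple_random_sample(population, 10, rng)
stratified = stratified_sample(groups, 5, rng)   # 5 from each group
cluster = cluster_sample(groups, 1, rng)         # one whole group
systematic = systematic_sample(population, 10, rng)
print(len(srs), len(stratified), len(cluster), len(systematic))  # 10 10 50 10
```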

44

Convenience Sample

Pick individuals that are easy to collect data from.

45

Plan the simulation

Describe how to use a chance device to imitate one repetition of the process.

46

Simulations

As a special promotion for its 20-ounce bottle of soda, a soft-drink company printed a message on the inside of each bottle cap.

47

Permutation

The number of ordered arrangements of r items chosen from n: n! / (n-r)!.

48

Combination

The number of unordered selections of r items chosen from n: n! / ((n-r)! r!).
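
The permutation and combination formulas can be checked directly with factorials. This sketch compares hand-written versions (hypothetical names) with Python's built-in `math.perm` and `math.comb`, using n = 5 and r = 2 as an invented example.

```python
from math import comb, factorial, perm

def permutations_count(n, r):
    """Ordered arrangements of r items from n: n! / (n-r)!"""
    return factorial(n) // factorial(n - r)

def combinations_count(n, r):
    """Unordered selections of r items from n: n! / ((n-r)! * r!)"""
    return factorial(n) // (factorial(n - r) * factorial(r))

print(permutations_count(5, 2), perm(5, 2))  # 20 20
print(combinations_count(5, 2), comb(5, 2))  # 10 10
```

Each combination of 2 items can be ordered in 2! = 2 ways, which is why there are twice as many permutations as combinations here.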

49

Random variable

A variable that describes the outcome of a chance process.

50

Discrete RV

Takes on a countable set of values with gaps between them.

51

Continuous RV

Takes on values from an interval.

52

Mean (Random variables)

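The expected value, that is, the long-run average of X over many repetitions of the chance process. For a discrete X, it is the sum of each value times its probability.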

53

Standard deviation (Random variables)

Gives the typical distance that the values of X are from the mean.
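
For a discrete random variable, the mean is the probability-weighted sum of the values, and the standard deviation is the square root of the probability-weighted squared deviations. The sketch below uses the number of heads in two fair coin tosses as an invented example; the function names are hypothetical.

```python
from math import sqrt

def rv_mean(values, probs):
    """Expected value: sum of each value times its probability."""
    return sum(x * p for x, p in zip(values, probs))

def rv_sd(values, probs):
    """Standard deviation: square root of the probability-weighted squared deviations."""
    mu = rv_mean(values, probs)
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return sqrt(variance)

# X = number of heads in two fair coin tosses
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]

print(rv_mean(values, probs))         # 1.0
print(round(rv_sd(values, probs), 4)) # 0.7071
```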

54

Binomial Random Variable

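Counts the number of successes in a fixed number of trials, n, where each trial has exactly two possible outcomes (success or failure), each trial's outcome does not affect the next trial (independence), and the probability of success, p, is the same on every trial.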
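
For a binomial random variable with n trials and success probability p, the probability of exactly k successes is C(n, k) p^k (1 - p)^(n - k), with mean np and standard deviation sqrt(np(1 - p)). A minimal sketch, with hypothetical function names and a fair-coin example:

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*(1-p))."""
    return n * p, sqrt(n * p * (1 - p))

# Example: number of heads in 4 tosses of a fair coin
print(binomial_pmf(2, 4, 0.5))   # 0.375
print(binomial_mean_sd(4, 0.5))  # (2.0, 1.0)
```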

55

Geometric Random Variables

The number of independent trials, each with the same probability of success p, up to and including the first success.
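
For a geometric random variable with success probability p, the first success occurs on trial k with probability (1 - p)^(k - 1) p, and the mean number of trials is 1/p. A minimal sketch, with hypothetical function names and a fair-coin example:

```python
def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k - 1 failures, then a success)."""
    return (1 - p) ** (k - 1) * p

def geometric_mean(p):
    """Expected number of trials up to and including the first success."""
    return 1 / p

# Example: tossing a fair coin until the first head
print(geometric_pmf(3, 0.5))  # 0.125 (tail, tail, head)
print(geometric_mean(0.5))    # 2.0
```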