Q1 Stat Cumulative

0.0(0)
studied byStudied by 0 people
0.0(0)
full-widthCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/120

flashcard set

Earn XP

Description and Tags

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

121 Terms

1
New cards

Response Variable(y)

Dependent variable, the outcome of a study

2
New cards

Explanatory Variable(x)

Independent variable, attempts to explain observed outcomes.

3
New cards

Direction, Form, Strength, Outliers, and Context

Requirements for describing scatterplots

4
New cards

Direction

Positive or negative association of a graph

5
New cards

Form

Is the line of best fit linear or some other pattern

6
New cards

Strength

The strength of correlation between explanatory and response variables(r)

7
New cards

Outliers

Points which don’t follow the pattern

8
New cards

Context

The context of the explanatory and response vcariables

9
New cards

Correlation(r value)

Measures the strength and direction of the linear relationship between two quantitative variables. Does not have units, and is heavily influenced by outliers.

10
New cards

Positive r value

There is a positive correlation between the two variables, the slope of the graph is positive.

11
New cards

Regression Lines

The “trendlines“ which describe how y changes with respect to x. Has an equation of the form y^ = a + bx

12
New cards

y-hat (y^)

The predicted y value or value of the response variable given a explanatory variable

13
New cards

Extrapolation

Using the regression line far outside the range of x used to create it. Often makes inacurate predictions.

14
New cards

Residual

y-hat minus y, the difference between an observed value and the predicted value.

note: a residual of 0 means the point is on the regression line.

15
New cards

LSRL(Least-squares regression line)

Minimizes sum of squared residuals, always goes through (x̄, ȳ). Equation is found through y^ = a + bx. minimizes the sum of the residues squared. 

16
New cards

Standard Deviation of Residuals(S)

Approximate size of average residue when using the LSRL. 

17
New cards

There is a [strong/moderate/weak] [positive/negative] [shape] relationship between [x-context] and [y-context].  There [do/do not] appear to be any outliers.

Interpreting scatterplots

18
New cards

There is a [strength], [direction] linear relationship between [variable1] & [variable2].

Interpreting correlations(r)

19
New cards

As [explanatory variable] increases by one [unit], the predicted [response variable] [increases/decreases] by [b] [units]. 

Slope of a regression line(b)

20
New cards

When [x-context] is zero, the predicted [y-context] is [a] [units]. 

Interpreting y-intercept of a regression line(a)

21
New cards

[y-hat] is the predicted value of [response variable y] when [explanatory variable] is [input amount]. 

Interpreting predicted value (y-hat)

22
New cards

The actual/true [y-context] is [higher/lower] than the predicted [y-context] by [residual].

Interpreting residuals

23
New cards

When using the LSRL, the predicted number of [y-context] is typically [s units] off from the actual number. 

Interpreting standard deviation of the residuals

24
New cards

About [r2]% of the variability in [y-context] is accounted for by the least-squares regression line. 

Interpreting coefficient of determination (r2)

25
New cards

Coefficient of determination (r2)

Shows how well the regression line accounts for variance within your collection of points.

26
New cards

stat, calc, 8

Finding regression line, r, and r² on calculator

27
New cards

stat, calc 8, use vars, Y-vars, function, y1

Storing and graphing the regression line

28
New cards

stat, calc, 8, to find linear regression line, then plot residual plot using stat plot, scatterplot, Xlist: L1, Ylist: 2nd, stat, RESID, then Graph.

Graphing a Residual Plot

29
New cards

Categorical Variable

Names, labels

ex: marital status, eye color, race

30
New cards

Quantitative Variable

Makes sense to average

ex: height, test scores

31
New cards

Frequency

How many times something occurred

32
New cards

Marginal Distribution

Subset out of the whole sample.

33
New cards

Conditional Distribution

Parts out of a group/subgroup

34
New cards

Individuals

Subjects described by data set

35
New cards

Variable

Characteristic of individual

36
New cards

Association

One variable influences the other. The two are associated, but do not use correlated instead of associated.

37
New cards

Stacked/Segmented Bar Graph

Displays all components of a whole on top of each other as part of a whole bar.

<p>Displays all components of a whole on top of each other as part of a whole bar. </p><p></p>
38
New cards

Mosaic Plot

Segmented bar graph but widens the bars to be proportional to their respective sample sizes.

<p>Segmented bar graph but widens the bars to be proportional to their respective sample sizes.</p><p></p>
39
New cards

Dot Plot

Each data value is shown as a dot above its respective value on a number line.

<p>Each data value is shown as a dot above its respective value on a number line.</p>
40
New cards

Stem and Leaf Plot

Vertical Bar Graph but denotes the points as numbers with EVENLY SIZED BUCKETS. 

Requires a key for the buckets, can just say “5-9” for one bucket

<p>Vertical Bar Graph but denotes the points as numbers with <strong><u>EVENLY SIZED BUCKETS.&nbsp;</u></strong></p><p>Requires a key for the buckets, can just say&nbsp;“5-9” for one bucket</p>
41
New cards

Bar Graph

Denotes the amount of people who answered each response by a vertical y axis.

<p>Denotes the amount of people who answered each response by a vertical y axis.</p>
42
New cards

Histogram

Divides points by frequency and graphs similar to a bar graph, but with conjoined bars and buckets.

MUST INCLUDE A KEY FOR THE BUCKET SIZES. Can be 5 - 9

<p>Divides points by frequency and graphs similar to a bar graph, but with conjoined bars and buckets.</p><p><strong><u>MUST INCLUDE A KEY FOR THE BUCKET SIZES.</u></strong> Can be 5 - 9</p>
43
New cards

SOCCS

Shape, Outliers, Center, Context, Spread

44
New cards

Shape

Noteworthy features of graph. Unimodal(one hump), Bimodal(two humps), multimodal(many humps), Skew, or irregular(doesn’t fit any description).

45
New cards

Skew

Distribution of individuals on one side or the other of a graph. Usually the lower, more severe portion of the graph determines the skew.

ex: if the graph is a histogram with most people answering 50-60, but a fair few answered 60-70 and 70-80, we would say the graph is skewed right.

46
New cards

Outliers

Points that are statistical anomalies, that are greater than Q3 + 1.5(IQR) or less than Q1 - 1.5(IQR). 

47
New cards

Center

Can be median or mean depending on the skew of the graph, but describes the center.

48
New cards

Context

Context behind the data. Say “The (variable) typically varies from the mean by about…(sx with units)“ or “on average, a (variable) is about (sx with units) away from mean.“

49
New cards

Spread

Describes the distribution of the graph. Can use range, standard deviation, or IQR.

50
New cards

Range

Max - Min

51
New cards

IQR

Q3 - Q1

52
New cards

Standard Deviation

Formula given on equation sheet but essentially the average difference between an individual and the mean.

denoted by sx on equation sheets and in writing

53
New cards

Median

The middle term of a distribution.

ex: in a sample of 11 people, the 6th person’s (after ordering from least to greatest) response would be the median. If the group was reduced to 10, the median would be the average of the 5th and 6th people’s answers.

54
New cards

Q3

The median of the upper half of the graph, excluding the median if it is just the middle term. If the median is the average of the two closest terms, take the greater term and include it in the median calculation.

55
New cards

Q1

The median of the bottom half of the graph, excluding the median if it is just the middle term. If the median is the average of the two closest terms, take the lower term and include it in the median calculation.

56
New cards

IQR

Q3 - Q1

57
New cards

Mean

Average of all responses from the sample. denoted by x with a bar on top in the equation sheet.

<p>Average of all responses from the sample. denoted by x with a bar on top in the equation sheet.</p>
58
New cards

5 number summary

Min, Q1, Median, Q3, Max

59
New cards

Box Graph

Uses the 5 number summary to graph the distribution. Uses the Q1, median, and Q3 values to draw the box in the middle, with the median denoting the line in the middle of the graph. Min and Max describe the outer “stops“ of the graph.

60
New cards

Outliers in Box Graphs

Outliers are denoted either by an asterisk if wished, with the max/min being reduced to the next greatest/lowest point. If not denoted, nothing is needed.

61
New cards

Getting 1 Var Stats on graphing calculator

STAT —> CALC —> 1-Var Stats

62
New cards

How to create a list on graphing calculators

STAT —> EDIT —> input values into a list, likely L1

63
New cards

How to graph various graphs with graphing calculator

STAT —> WINDOW(set appropriate range) —> STAT PLOT(turn appropriate list on) —> GRAPH

64
New cards

Effect of adding to set values to mean, std dev, and shape

std dev and shape unchanged, mean is increased by the constant added to every value.

65
New cards

Effect of multiplying set values by a constant on mean, std dev, and shape

The mean and std dev are multiplied by the proportionate amount but the shape is unchanged.

66
New cards

Cumulative Frequency Line Graph

Shows the relative frequency as the values of x increase, the frequency increases to 100%. Can show percentile very easily as it shows what % of the total population has a value less or equal to every given value.

67
New cards

z-score

Shows the value of an individual member of the population as a function of the mean and std dev. Found by dividing the difference between the point and the mean by the std dev.

68
New cards

Density Curve

Curve that traces the top of a histogram and has an area under the curve of 1. Is an approximate shape of the histogram, and is always above the x axis.

mean < median = skewed left, mean > median = skewed right.

69
New cards

Required notation given mean “µ“, std dev “σ”, where x = height of adult american males(m)

x ~ N(µ, σ), x = height of american male(m)

70
New cards

Notation showing proportion of the whole of x population with a value of less than c

P(x<c)

71
New cards

< vs ≤

Interchangeable in this unit.

72
New cards

Normal distribution

A symmetric unimodal bell-shaped curve, which abides by the empirical rule.

73
New cards

Points of inflection on a normal curve

At ± 1 std dev

74
New cards

Empirical Rule(68, 95, 99.7)

Rule which states that 68% of a population lies between ±1 std dev, 95% between ± 2 std devs, and 99.7% between ± 3 std devs.

75
New cards

Table A

Shows the proportion less than a given z score and vice versa.

76
New cards

How to calculate the percentile given a max

2nd → Distr → normalcdf → set max, min, std dev, paste and calculate

77
New cards

How to calculate z-scores given percentile

2nd → Distr → Inv norm → Set max, min, mean, std dev, paste and calculate.

78
New cards

How to create a normal probability plot to prove whether a distribution is normal

Insert data into list → 2nd → stat plot → 6th graph option → plot. A normal distribution will have a linear pattern, while skewed graphs will be warped.

79
New cards
<p>Using the normal probability plot to determine leftward skew</p>

Using the normal probability plot to determine leftward skew

A normal probability plot which is skewed left will have a flatter slope and the points of z-score < 0 clumped together as a greater proportion of the population would be clustered on the left.

<p>A normal probability plot which is skewed left will have a flatter slope and the points of z-score &lt; 0  clumped together as a greater proportion of the population would be clustered on the left.</p>
80
New cards
<p>Using the normal probability plot to determine rightward skew</p>

Using the normal probability plot to determine rightward skew

A normal probability plot which is skewed left will have a steeper slope and the points of z-score > 0 clumped together as a greater proportion of the population would be clustered on the right.

<p>A normal probability plot which is skewed left will have a steeper slope and the points of z-score &gt; 0  clumped together as a greater proportion of the population would be clustered on the right.</p>
81
New cards

Convenience Sample

A sample which is convenient for the surveyors

ex: choosing the closest 5 students as a sample

82
New cards

Simple Random Sample(SRS)

Simple random selection such that each individual has a equal chance of being selected.

ex: label each person with a number from 1-100, write numbers on 100 slips of paper and randomly choose a certain amount of slips corresponding to the size of your groups.

83
New cards

Population

The group that a statistical study can be extrapolated to

84
New cards

Census

Data from every individual in the population

85
New cards

Sample

Subset of population from which data is actually collected

86
New cards

Bias

Consistent underestimation or overestimation of the value you are studying.

87
New cards

Voluntary response bias

Bias derived from a sample volunteering themselves for a study/experiment

88
New cards

Random Sampling/assignment

Using random chance to assign a sample

89
New cards

Strata

Groups of similar individuals, for random selections for sampling. “some of all“, choosing samples randomly from subcategories of the population.
ex: splitting a school into grades in preparation of randomly choosing 50 people from each grade. The grades are:

90
New cards

Stratified random sample

Sampling method where the population is divided into several strata of similar individuals prior to simple random sampling to take some individuals from each strata.

91
New cards

Clusters

”All of some” groups of individuals within a population

92
New cards

Cluster sample

Simple random sample of clusters, then averaged to an overall sample
ex: think “all of some“. if only a few high schools in all of Massachusetts are selected for a experiment but all students within those schools are selected its a cluster sample.

93
New cards

Inference

Drawing conclusions about a population based on the results derived from a sample

94
New cards

Undercoverage

Response bias resulting from some members of population being unable to participate.

95
New cards

Nonresponse

When a subcategory of the population refuses to be part of the sample

96
New cards

Response bias

Systematic pattern of inaccurate responses leading to incorrect results

ex: unemployed people in sample claiming to be employed and lying in a experiment.

97
New cards

Question wording

Confusing wording that leads to profoundly different results in surveys or orientation of separate questions that leads to profoundly different results to the survey question

98
New cards

Systematic sample

Selecting a sample by every nth individual

ex: within a population, assign each person a number, then take every 9th person starting from a random initial person. These people make up your sample.

99
New cards

Random Number Generator(RNG) Method

Assigning each member of the sample a number then using a random number generator to assign separate unique numbers to different treatments.

100
New cards

Hat Method

Assigning each member of your sample a number, writing them on equal sized slips of paper and placing them in a hat, then manually drawing the slips of paper out to make up your separate samples/treatments