Statistics & Data Analysis Unit 3

29 Terms

1

Displaying Data…

  • Organizes data

  • Helps show similarities and differences

2

Categorical Variables

  • Bar charts

  • Pie charts

  • Two way tables

3

Quantitative Variables

  • Dot plots

  • Stem plots

  • Box plots

  • Histograms

4

Bar Graph

  • Vertical or horizontal bars

  • Can show frequency (#) or relative frequency (%)

  • Must have spaces between bars
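
As a rough illustration of frequency versus relative frequency, the sketch below counts categories and converts the counts to percentages. The survey responses are made up for the example.

```python
from collections import Counter

# Hypothetical survey responses (favorite color); not real data.
responses = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

# Frequency: how many times each category appears (bar height as a count).
frequency = Counter(responses)

# Relative frequency: each count as a percentage of all responses.
relative_frequency = {
    category: 100 * count / len(responses)
    for category, count in frequency.items()
}

print(dict(frequency))        # {'red': 4, 'blue': 3, 'green': 1}
print(relative_frequency)     # {'red': 50.0, 'blue': 37.5, 'green': 12.5}
```

Either dictionary could supply the bar heights; only the y-axis label changes (# or %).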

5

What do you need in a bar graph?

  • Title

  • Y-axis label

  • X-axis label

  • Spaces in between bars

6

Pie Charts

  • Always in Relative Frequency (%)

  • Compares parts of a whole

  • Title

  • Key

7

Two Way Tables

Used to compare observations for two different categorical variables

  • Column variable (Top to bottom)

  • Row variable (Left to Right)

8

Marginal distribution

The row totals or column totals of a two way table, converted to % of the grand total
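
A minimal sketch of a marginal distribution, assuming a small made-up two way table of counts. The table values and the function name `marginal_distributions` are illustrative only.

```python
# Hypothetical two way table of counts: rows = grade level, columns = preferred pet.
table = {
    "Freshman": {"Dog": 12, "Cat": 8},
    "Senior": {"Dog": 18, "Cat": 12},
}

def marginal_distributions(table):
    """Return (row %, column %) where each total is a percent of the grand total."""
    grand_total = sum(sum(row.values()) for row in table.values())

    # Row totals as percentages of the grand total.
    row_pct = {
        name: 100 * sum(row.values()) / grand_total
        for name, row in table.items()
    }

    # Column totals, accumulated across rows, then converted to percentages.
    col_totals = {}
    for row in table.values():
        for name, count in row.items():
            col_totals[name] = col_totals.get(name, 0) + count
    col_pct = {name: 100 * total / grand_total for name, total in col_totals.items()}

    return row_pct, col_pct

row_pct, col_pct = marginal_distributions(table)
print(row_pct)  # {'Freshman': 40.0, 'Senior': 60.0}
print(col_pct)  # {'Dog': 60.0, 'Cat': 40.0}
```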

9

Stemplots

  • Leaf is the final digit of the number

  • Stems are always on the left side of the line, leaves are on the right side

  • Stems are ordered least (top) to greatest (bottom)

  • All stems must be written

  • Spacing is important

  • No commas between each leaf

  • Always have a key
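
The stemplot rules above can be sketched in code. This is only an illustration: it assumes non-negative whole numbers, uses the tens as the stem and the ones digit as the leaf, and the function name `stemplot` is made up for the example.

```python
def stemplot(values):
    """Return the lines of a stemplot for non-negative whole numbers."""
    ordered = sorted(values)
    # Write every stem from smallest to largest, even stems with no leaves.
    stems = range(ordered[0] // 10, ordered[-1] // 10 + 1)
    lines = []
    for stem in stems:
        # Leaves are the final digits, in order, with no commas between them.
        leaves = "".join(str(v % 10) for v in ordered if v // 10 == stem)
        lines.append(f"{stem} | {leaves}")
    # The key shows how to read one stem and leaf pair.
    lines.append(f"Key: {stems[0]} | {ordered[0] % 10} = {ordered[0]}")
    return lines

# Hypothetical data set.
for line in stemplot([12, 15, 21, 21, 40]):
    print(line)
```

Stem 3 is printed with no leaves, which is how the gap in the data stays visible.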

10

Back to back stemplots

Compares two sets of data

11

Splitting Stems

  • Used when the data is clustered on one or just a few stems

  • Split each stem into 2 stems (leaves 0-4 and 5-9) or into 5 stems (leaves 0-1, 2-3, 4-5, 6-7, 8-9)

12

Histograms

  • Observations are placed into bins/bars of equal widths

  • Count how many observations are in each bin & draw bar to that height

  • Similar to a bar chart, but there are no spaces between the bars

  • Bin width can be changed but you need a minimum of 5 bars

  • Label the x-axis, y-axis, and title

  • Can do % or # on y-axis
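
A small sketch of the binning step, using made-up test scores. It assumes each bin includes its left edge but not its right edge (a score of 60 goes in the 60-70 bin); that convention and the function name `histogram_counts` are assumptions, not something the cards specify.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many observations fall in each equal-width bin."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)  # which bin this value lands in
        if 0 <= index < n_bins:            # ignore values outside the bins
            counts[index] += 1
    return counts

# Hypothetical test scores.
scores = [52, 55, 61, 67, 68, 70, 73, 74, 78, 85, 91, 99]

# Five bins of width 10: 50-60, 60-70, 70-80, 80-90, 90-100.
counts = histogram_counts(scores, start=50, width=10, n_bins=5)
print(counts)  # [2, 3, 4, 1, 2]

# The same bars as relative frequencies (%) for the y-axis.
percents = [100 * c / len(scores) for c in counts]
```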

13

Describing Distributions

You must include the

  • Shape

  • Center

  • Spread

14

SHAPE- symmetric or skewed

Symmetric- left and right sides are approximate mirror images

Skewed Left- majority of the data is on the right side, but trails out to the left

Skewed Right- majority of the data is on the left side, but trails out to the right

15

SHAPE- modes

Unimodal- one major peak (mode) that the data clusters around

Bimodal- two major modes

Multimodal- more than two major modes

Uniform- data is flat (no clear mode; all bars are about the same height)

16

SHAPE- unusual features

Outliers- observations away from the main distribution

Gaps- spaces between clumps of data

17

CENTER- mean or median

Mean- used with symmetric graphs (no outliers)

Median- skewed left/right (has outliers)

18

SPREAD

Standard deviation- goes with mean

IQR- goes with median

  • Always list the range of the data (min,max)

19

Center

Mean: Average

  • only used for symmetrical, no outliers

Median: 

  • Put #’s in order from least to greatest

  • Find middle number

  • If there are two middle numbers, then average the two #’s
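
The median steps above, sketched as a short function (the name `median` and the sample numbers are just for illustration):

```python
def median(values):
    """Find the middle value, averaging the two middle values if needed."""
    ordered = sorted(values)      # put the numbers in order, least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd count: one middle number
    # Even count: average the two middle numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))      # 5
print(median([4, 1, 3, 2]))   # 2.5
```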

20

Quartiles

  • The medians of the lower half (Q1) and the upper half (Q3) of the ordered data

  • First (Q1) and third (Q3) quartile

21

Spread

Spread: the spread that goes with the median is the IQR & Range

Range: (min,max) or Range= max-min=#

IQR: 

  • The difference between Q3 and Q1

  • IQR= Q3-Q1

  • Represents the middle 50% of the data

22

5 number summary

  • Minimum

  • Q1

  • Median

  • Q3

  • Maximum

  • Outliers?
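
A sketch of the 5 number summary, with Q1 and Q3 found as the medians of the lower and upper halves. It assumes the common classroom convention that the middle value is left out of both halves when the count is odd; other textbooks and software use slightly different quartile rules. The function names and data are made up for the example.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum). Needs at least 2 values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # lower half (middle value excluded if count is odd)
    upper = ordered[-half:]    # upper half
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Hypothetical data set.
minimum, q1, med, q3, maximum = five_number_summary([1, 3, 4, 6, 7, 9, 12])
print(minimum, q1, med, q3, maximum)  # 1 3 6 9 12
print("IQR =", q3 - q1)               # IQR = 6
```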

23

Finding outliers: IQR Test

  1. Find 1.5(IQR)

  2. Use that #: subtract it from Q1 to get A and add it to Q3 to get B

  • A= Q1-(1.5 times IQR)

  • B= Q3+(1.5 times IQR)

  3. (A,B) is the acceptable range

  4. Any # outside (A,B) is an outlier
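
The IQR test can be sketched as follows. Q1 and Q3 are passed in as already-computed values; the function names and the sample data are assumptions for illustration.

```python
def outlier_fences(q1, q3):
    """Return (A, B), the acceptable range from the 1.5 x IQR rule."""
    step = 1.5 * (q3 - q1)          # step 1: 1.5 times the IQR
    return q1 - step, q3 + step     # step 2: A = Q1 - step, B = Q3 + step

def find_outliers(values, q1, q3):
    """Return every value that falls outside (A, B)."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]

# Hypothetical data with Q1 = 3 and Q3 = 9 (medians of the lower and upper halves).
data = [1, 3, 4, 6, 7, 9, 30]
print(outlier_fences(3, 9))        # (-6.0, 18.0)
print(find_outliers(data, 3, 9))   # [30]
```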

24

Spread: Standard Deviation

  • SD goes with the mean

  • Describes how spread out the data is (Variability)

  • Symbol: s

  • The higher the # is, the more spread out the data

25

Properties of Standard Deviation

  • s=0, when all data points are the same number

  • s is never negative (s is 0 or positive)

  • Standard deviation is affected by outliers
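
A short sketch that checks these properties numerically. It assumes s is the sample standard deviation (dividing by n - 1), which is the usual meaning of the symbol s; the function name `sample_sd` and the data are made up.

```python
import math

def sample_sd(values):
    """Sample standard deviation s, dividing by n - 1 (assumed convention)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

print(sample_sd([5, 5, 5, 5]))    # 0.0  (all data points are the same)
print(sample_sd([1, 2, 3]))       # 1.0
print(sample_sd([1, 2, 3, 20]))   # much larger: the outlier 20 inflates s
```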

26

Mean vs. Median in distributions

Symmetric

  • mean ≈ median (about equal)

Skewed Left

  • mean < median

Skewed Right

  • mean > median

Rules:

  • The mean is affected by outliers

  • The median is resistant to outliers
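
To see the pattern with numbers, here is a small sketch using Python's built-in `statistics` module. The three data sets are made up to be symmetric, skewed right, and skewed left.

```python
from statistics import mean, median

# Hypothetical data sets, one for each shape.
symmetric = [1, 2, 3, 4, 5]                      # mean 3, median 3
skewed_right = [1, 2, 2, 3, 3, 3, 4, 20]         # mean 4.75, median 3
skewed_left = [1, 17, 18, 18, 19, 19, 19, 20]    # mean 16.375, median 18.5

for name, data in [
    ("symmetric", symmetric),
    ("skewed right", skewed_right),
    ("skewed left", skewed_left),
]:
    print(name, "mean =", mean(data), "median =", median(data))
```

In each skewed set the mean is pulled toward the long tail while the median stays near the bulk of the data.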

27

Comparing Distributions

  • When comparing 2 distributions, you must still state the shape, center, and spread of both distributions.

  • You must also compare the 3 things in each distribution

  • Use comparison words: higher, lower, similar, same, wider, etc.

  • Cannot compare mean to median

GUIDE:

The shape of D1 is ____ which is similar to/different than the shape of D2 which is ____. Both are unimodal.

The center of D1 is the mean/median of ____ which is higher/lower than the mean/median of D2 which is _____. 

The IQR/SD of D1 is _____ which is wider/narrower than the IQR/SD of D2 which is _____. The range of D1 is (_,_) which is wider/narrower than the range of D2 which is (_,_). 

Other: mention any outliers or gaps.

28

If I ADD an upper outlier, how will it affect the summary stats?

Mean- increase

Median- stay about the same (resistant)

Range- increase

IQR- stay about the same (resistant)

SD- increase

29

If I ADD a lower outlier, how will it affect the summary stats?

Mean- decrease

Median- stay about the same (resistant)

Range- increase

IQR- stay about the same (resistant)

SD- increase
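
The effect of adding an outlier can be checked with a quick sketch. The data are made up, the quartiles use the medians-of-halves convention (an assumption), and `summary` is a hypothetical helper. The median and IQR barely move (they may shift slightly because the data set gains one value), while the mean, range, and SD change a lot.

```python
import statistics

def summary(values):
    """Mean, median, range, IQR, and sample SD for a list of numbers."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])    # median of the lower half
    q3 = statistics.median(ordered[-half:])   # median of the upper half
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "sd": statistics.stdev(ordered),
    }

base = [10, 12, 13, 14, 15, 16, 18]     # hypothetical data, no outliers
before = summary(base)
with_upper = summary(base + [60])       # add an upper outlier
with_lower = summary(base + [-30])      # add a lower outlier

for label, stats in [("before", before), ("upper", with_upper), ("lower", with_lower)]:
    print(label, stats)
```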