Unit 1: Stats

26 Terms

1

Observational Study

Study in which the person conducting the study observes characteristics of a sample selected from one or more existing populations.

2

Experiment

Intentional effort to influence individuals in a study (can show cause and effect)

3

Parameter

Number that describes an entire population ("all" individuals)

4

Statistic

Number that describes a sample (usually a number or percentage)

5

Simple Random Sampling

Every individual has an equal chance of being selected (lottery system)

Ex.: place all student ID numbers in a bin and randomly select 600 Blugolds for your sample

6

Stratified Sampling

Split the population into groups (strata), then randomly select from each group

Ex.: put all names with the same last digit into ten different bins (one for each digit, 0-9). Then randomly select 50 names from each bin for your sample.

7

Systematic Sampling

Select every kth person

Ex.: start with the 4th ID number, then use every 15th student for your sample

8

Cluster Sampling

Split the population into groups (clusters), then randomly select an entire group

Ex.: put all names with the same last digit into ten different bins (one for each digit, 0-9). Then randomly select a single bin and use all students in that bin for your sample.
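
The four random sampling methods above can be sketched in Python. This is only an illustration: the ID list, the sample sizes (scaled down from the 600 and 50 in the examples), and the variable names are assumptions, not part of the original cards.

```python
import random

random.seed(1)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 1,000 student ID numbers (assumed for illustration)
ids = list(range(1000, 2000))

# Simple random sampling: every ID has the same chance of being drawn
srs = random.sample(ids, 60)

# Ten bins, one per last digit (0-9), as in the stratified and cluster examples
bins = {d: [i for i in ids if i % 10 == d] for d in range(10)}

# Stratified sampling: randomly select some IDs from EVERY bin
stratified = [i for d in bins for i in random.sample(bins[d], 5)]

# Cluster sampling: randomly select ONE bin and use everyone in it
cluster = bins[random.randrange(10)]

# Systematic sampling: start with the 4th ID, then take every 15th
systematic = ids[3::15]
```

Stratified sampling ends up with IDs from all ten bins, while cluster sampling ends up with IDs from only one.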

9

Convenience Sampling

easiest to reach (nearby)

Ex.: have a table at Blus org bash and log the opinions of the first 600 Blugolds that stop by

10

Qualitative (categorical) - Graphical Methods for Describing Data

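Classifies individuals into categories, usually with words; numeric labels such as an SSN are also categorical because arithmetic on them is meaningless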

11

Quantitative (Numerical) - Graphical Methods for Describing Data

Numbers for which arithmetic operations like addition and subtraction are meaningful

12

Discrete (numerical)

Countable number of outcomes (the number of people in a class, test questions answered correctly, home runs hit)

13

Continuous (numerical)

Uncountably many possible outcomes; measured rather than counted (height, weight, temperature, length)

14

Measures of Center

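Mean: add all the values, then divide by the number of values

Median: middle value when the data are ordered from lowest to highest

Mode: most frequently occurring value (a data set can have no mode)

Ex.: if one large salary on a team payroll increases -> mean = changes (increases), median = no change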
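
A quick check of the payroll idea using Python's standard `statistics` module. The salary figures are made up for illustration.

```python
import statistics

# Hypothetical payroll in $ millions (invented numbers)
payroll = [1, 2, 2, 3, 4]
mean_before = statistics.mean(payroll)      # (1 + 2 + 2 + 3 + 4) / 5
median_before = statistics.median(payroll)  # middle value of the sorted list
mode_before = statistics.mode(payroll)      # most frequent value

# Raise only the largest salary from 4 to 40
raised = [1, 2, 2, 3, 40]
mean_after = statistics.mean(raised)
median_after = statistics.median(raised)

print(mean_before, mean_after)      # the mean moves
print(median_before, median_after)  # the median stays put
```

The mean jumps from 2.4 to 9.6 while the median stays at 2.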

15

Resistant

Describes a measure or method that is not strongly affected by extreme values (outliers)

Range is not resistant

Median is resistant, mean is not

16

Range

A measure of dispersion: measures how spread out the data are.

maximum value − minimum value

17

Variance (Measure of Dispersion/Variability/Spread)

The average of the squared distances from the mean; equal to the standard deviation squared

18

Standard Deviation (Measure of Dispersion/Variability/Spread)

the square root of the variance, showing the typical distance from the mean
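
A small worked example of variance and standard deviation, computed by hand and checked against Python's `statistics` module. The data set is invented. The "average of squared distances" definition matches the population formulas (`pvariance`, `pstdev`); the sample versions (`variance`, `stdev`) divide by n - 1 instead of n.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

mean = sum(data) / len(data)
squared_distances = [(x - mean) ** 2 for x in data]

# Population variance: plain average of the squared distances
pop_variance = sum(squared_distances) / len(data)
pop_sd = math.sqrt(pop_variance)

# Sample variance: divide by n - 1 instead of n
sample_variance = sum(squared_distances) / (len(data) - 1)

print(pop_variance, pop_sd)  # 4.0 2.0
```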

19

Z-scores

The number of standard deviations a data value is from the mean (no units): z = (value − mean) / standard deviation

Mean of the z-scores (sample and population) = 0

Standard deviation of the z-scores (sample and population) = 1
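
A short sketch that converts an invented data set to z-scores and confirms the two facts above. It uses the population standard deviation; using the sample standard deviation throughout gives the same result.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean 5, population SD 2

mean = statistics.mean(data)
sd = statistics.pstdev(data)

# z = (value - mean) / standard deviation
z_scores = [(x - mean) / sd for x in data]

print(z_scores)                     # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(statistics.mean(z_scores))    # 0.0
print(statistics.pstdev(z_scores))  # 1.0
```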

20

Percentiles

If a value in a data set represents the kth percentile then k% of the values in the data set are located at or below the value
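
The definition can be checked directly by counting. The helper name `percent_at_or_below` and the test scores are hypothetical, chosen only for illustration.

```python
def percent_at_or_below(data, value):
    """Percentage of values in data that are at or below value."""
    count = sum(1 for v in data if v <= value)
    return 100 * count / len(data)

scores = [55, 60, 70, 70, 80, 85, 90, 95, 98, 100]  # made-up test scores

# 6 of the 10 scores are at or below 85, so 85 sits at the 60th percentile
print(percent_at_or_below(scores, 85))  # 60.0
```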

21

Quartiles (First Three), IQR, and Outliers

The three quartiles

  1. Q1 (First Quartile)

    • The median of the lower half of the data

    • 25% of the data fall below Q1

  2. Q2 (Second Quartile)

    • The median of the data

    • 50% of the data fall below it

  3. Q3 (Third Quartile)

    • The median of the upper half of the data

    • 75% of the data fall below Q3

Interquartile Range (IQR) = measure of variability that is resistant to the effects of outliers

  • IQR = Q3 − Q1

  • Measures the spread of the middle 50% of the data

  • Resistant to outliers

Outliers: A value is an outlier if it is:

  • Less than:
    Q1 − 1.5 × IQR

  • Greater than:
    Q3 + 1.5 × IQR
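
A minimal sketch of the quartile, IQR, and outlier rules above, using the "median of each half" method. The function name, the data, and the choice to leave the median out of both halves when the count is odd are assumptions; textbooks and software use slightly different quartile conventions.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # lower half of the data
    upper = data[-half:]  # upper half (skips the median when n is odd)
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

data = [1, 3, 4, 5, 7, 8, 9, 30]  # made-up data with one extreme value

q1, q2, q3 = quartiles(data)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(q1, q2, q3, iqr)        # 3.5 6.0 8.5 5.0
print(low_fence, high_fence)  # -4.0 16.0
print(outliers)               # [30]
```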

22

5 Number summary

Minimum, lower quartile (Q1), median, upper quartile (Q3), maximum

23

Residual

vertical distance from point to line

(actual response - predicted response)

24

Coefficient of Determination

measures the percentage of total variation in the response variable that is explained by the least squares regression line. (r²)
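
Residuals and r² can be worked out by hand for a tiny data set. The points below are invented, and the least squares slope and intercept come from the standard formulas rather than from the cards.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]  # made-up responses

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares line: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = actual response - predicted response
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# r^2 = 1 - (unexplained variation / total variation)
sse = sum(r ** 2 for r in residuals)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

print(slope, intercept, r_squared)
```

Here the line is roughly ŷ = 2.2 + 0.6x and r² is about 0.6, so the line explains about 60% of the variation in y.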

25

Influential Observation

a data point that has a large impact on the results of an analysis, especially on a regression line or correlation.

An observation is influential if removing it would noticeably change:

  • the slope of the regression line

  • the intercept

  • the correlation (r)
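
To see what "influential" looks like, compare the least squares slope with and without one extreme point. The data and the helper name `ls_slope` are hypothetical.

```python
def ls_slope(points):
    """Least squares slope for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(px for px, _ in points) / n
    y_bar = sum(py for _, py in points) / n
    sxy = sum((px - x_bar) * (py - y_bar) for px, py in points)
    sxx = sum((px - x_bar) ** 2 for px, _ in points)
    return sxy / sxx

base = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]  # made-up points
extreme = (20, 0)                                # far from the rest in x

slope_without = ls_slope(base)
slope_with = ls_slope(base + [extreme])

print(slope_without)  # positive slope
print(slope_with)     # slope turns negative once the extreme point is added
```

A single point flips the direction of the line, so removing it would noticeably change the slope.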

26

Residual Plots

a residual plot is a graph that shows the residuals on the vertical axis and the explanatory (x) variable on the horizontal axis.

Pattern in residual plot | What it means
Random scatter around 0 | Linear model is appropriate
Curve or systematic pattern | Nonlinear relationship; linear model not appropriate
Increasing/decreasing spread | Heteroscedasticity (variance not constant)
Points far from zero | Possible outliers