Exploring Data!

Description and Tags

Statistics

A-Level Statistics

20 Terms

New cards

Describe a distribution

CUSS and BS

Center: mean; if the distribution is skewed, use the median

Unusual Features- "potential outliers"

Shape- skew, modal, normal, symmetrical, uniform

Spread- SD w/ mean, IQR w/ median

New cards

Outlier Rule

A value that falls more than 1.5 × IQR above Q3 or below Q1

Lower Outlier < Q1 - 1.5 × IQR

Upper Outlier > Q3 + 1.5 × IQR
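
The rule above can be sketched in Python. This is only an illustration: the data values are made up, the function names are hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the ordered data (an assumption; other quartile conventions can give slightly different fences).

```python
from statistics import median

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves.

    Assumption: when n is odd, the overall median is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])

def outlier_fences(values):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    low, high = outlier_fences(values)
    return [v for v in values if v < low or v > high]

data = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # made-up values
print(outlier_fences(data))  # Q1 = 4.5, Q3 = 9.5, IQR = 5
print(find_outliers(data))
```

With these values only 30 lies beyond the upper fence of 17, so it is the one potential outlier.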

New cards

How can we use a graph to compare the mean and the median?

The mean is pulled toward the tail; the median stays nearer the peak

Skewed Left: Mean < median

Roughly Symmetric: mean ~ median

Skewed Right: Mean > median
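
A quick numerical check of the three cases, using small made-up data sets (the lists are illustrative only):

```python
from statistics import mean, median

# Made-up data sets chosen to show each shape
skewed_left = [1, 17, 18, 18, 18, 19, 19, 20]   # long tail on the low side
symmetric = [1, 2, 3, 4, 5, 6, 7]
skewed_right = [1, 2, 2, 3, 3, 3, 4, 20]        # long tail on the high side

for name, data in [("skewed left", skewed_left),
                   ("symmetric", symmetric),
                   ("skewed right", skewed_right)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

In each skewed list a single extreme value drags the mean toward the tail, while the median barely moves.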

New cards

Interpret the standard deviation

Standard deviation is the typical distance that the values are away from the mean
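
To see what "typical distance from the mean" looks like, here is a small sketch on made-up values. It prints each value's distance from the mean next to the standard deviation; `pstdev` divides by n and `stdev` divides by n - 1 (the latter corresponds to the sample standard deviation, usually labelled Sx on a calculator).

```python
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

center = mean(data)
distances = [abs(v - center) for v in data]  # how far each value is from the mean

print("mean:", center)
print("distances from mean:", distances)
print("population SD:", pstdev(data))
print("sample SD:", round(stdev(data), 3))
```

Most of the individual distances are within a unit or so of the standard deviation, which is why it is read as a "typical" distance.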

New cards

How do we describe the relationship between two variables (like in a scatterplot)?

Direction- positive/negative

Unusual features- outliers, influential observations

Form- linear or curved

Strength- weak --> strong

New cards

Compare two distributions

CUSS + BS

Use comparison words: "similar to" "less/greater than"

New cards

How to find the mean, SD, and 5-number summary using a graphing calculator

Enter data in List 1

Stat -->Calc

1-Var Stats

Leave "FreqList" blank. Select Calculate.
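
For checking calculator output by hand, the same summary can be sketched in Python. The function name `one_var_stats` and the data are hypothetical, and the quartiles are computed as medians of the lower and upper halves, which is an assumption about the convention your calculator uses.

```python
from statistics import mean, median, stdev

def one_var_stats(values):
    """Mean, sample SD, and five-number summary for one list of data."""
    data = sorted(values)
    half = len(data) // 2
    return {
        "n": len(data),
        "mean": mean(data),
        "Sx": stdev(data),            # sample standard deviation
        "min": data[0],
        "Q1": median(data[:half]),    # median of the lower half
        "med": median(data),
        "Q3": median(data[-half:]),   # median of the upper half
        "max": data[-1],
    }

scores = [12, 15, 15, 18, 20, 22, 25, 30]  # made-up values
summary = one_var_stats(scores)
for label, value in summary.items():
    print(label, "=", value)
```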

New cards

How to calculate a LSRL using a graphing calculator

Enter the x-values in L1 and the y-values in L2

Stat -->Calc

8: LinReg (a+bx)

Leave "FreqList" blank. Select Calculate.
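
As a cross-check on the calculator, the least squares line y-hat = a + bx can be computed from the standard formulas b = Sxy / Sxx and a = y-bar - b(x-bar). The sketch below uses made-up data and a hypothetical helper name.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

xs = [1, 2, 3, 4, 5]  # made-up x-values (L1 on the calculator)
ys = [2, 4, 5, 4, 5]  # made-up y-values (L2 on the calculator)

a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")  # prints "y-hat = 2.20 + 0.60x"
```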

New cards

What is the IQR?

The Interquartile range (IQR) is defined as the difference between the third and first quartiles: Q3 - Q1

Q1 and Q3 form the boundaries for the middle 50% of values in an ordered data set

New cards

How do I calculate the percentile of a particular value in a data set?

-Order the data (little Lexi to the left)

-Count the # of values that are less than or equal to the value of interest

-Count the # of values in the data set

Percentile = (# of values less than or equal to the value of interest) / (# of values in the data set) (Express the decimal as a percent)
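
The steps above can be sketched as a short function. The name `percentile_of` and the scores are made up for illustration; the definition used is the "less than or equal to" one from this card.

```python
def percentile_of(value, values):
    """Percent of values in the data set that are <= the value of interest."""
    data = sorted(values)                        # order the data
    at_or_below = sum(1 for v in data if v <= value)
    return 100 * at_or_below / len(data)

scores = [55, 60, 62, 70, 70, 75, 80, 85, 90, 95]  # made-up values
print(percentile_of(75, scores))  # 6 of the 10 values are <= 75
```

Here a score of 75 sits at the 60th percentile.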

New cards

Interpret the y-intercept of the Least Squares Regression Line

The PREDICTED y-context is (y-intercept) when x-context is 0

New cards

Interpret the slope of the Least Squares Regression Line

The PREDICTED y-context will increase/decrease by (slope) with each additional 1 unit of x-context

New cards

Interpret the coefficient of determination (r^2)

The coefficient of determination gives the percent of the variation of y-context that is explained by the least squares regression line using x = x-context
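
One way to see "percent of variation explained" is to compute r^2 as 1 - SSE/SST, where SSE is the sum of squared residuals from the least squares line and SST is the total squared variation of y around its mean. The sketch below uses made-up data and a hypothetical function name.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation in y
    return 1 - sse / sst

xs = [1, 2, 3, 4, 5]  # made-up values
ys = [2, 4, 5, 4, 5]
print(f"r^2 = {r_squared(xs, ys):.2f}")  # prints "r^2 = 0.60"
```

For this data, about 60% of the variation in y is explained by the least squares regression line using x.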

New cards

Properties of correlation (r)

-'r' is unitless

-'r' is always between -1 and 1

-'r' is greatly affected by regression outliers

-If direction is negative, then 'r' < 0

-If the direction is positive, then 'r' > 0

-The closer 'r' is to -1 or 1, the stronger the relationship

-The closer that 'r' is to 0, the WEAKER the relationship
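
Several of these properties can be checked numerically. The sketch below computes r from its usual formula on made-up data (the variable names and values are illustrative), then confirms that changing units leaves r unchanged and that reversing the direction flips its sign.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r between two lists of quantitative data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights_in = [60, 62, 65, 68, 70]        # made-up heights in inches
weights_lb = [110, 125, 130, 150, 160]   # made-up weights in pounds

r = correlation(heights_in, weights_lb)
heights_cm = [2.54 * h for h in heights_in]        # same data, different units
r_cm = correlation(heights_cm, weights_lb)
r_flipped = correlation(heights_in, [-w for w in weights_lb])  # reversed direction

print(round(r, 4), round(r_cm, 4), round(r_flipped, 4))
```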

New cards

Regression Outlier

An outlier in regression is a point that does not follow the general trend shown in the rest of the data and has a large residual

New cards

Correlation (r)

gives the strength and direction of the linear relationship between 2 quantitative variables

New cards

High-Leverage Point

A high-leverage point in regression has a substantially larger or smaller x-value than other observations have

New cards

Influential Point

An influential point in regression is any point that, if removed, changes the relationship substantially (creates big changes to slope and/or intercept)

Outliers and high-leverage points are often influential
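
A simple way to test whether a point is influential is to fit the least squares line with and without it and compare the slopes. In this sketch the data are made up, and the point (20, 0) is deliberately given a much larger x-value than the rest, so it is also a high-leverage point.

```python
def slope_and_intercept(points):
    """Least squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

base = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]  # made-up points
suspect = (20, 0)                                # far from the other x-values

slope_without, _ = slope_and_intercept(base)
slope_with, _ = slope_and_intercept(base + [suspect])

print(f"slope without the point: {slope_without:.2f}")
print(f"slope with the point:    {slope_with:.2f}")
```

Including the single point turns a positive slope into a negative one, which is the kind of big change that marks a point as influential.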

New cards

What is the difference between categorical and quantitative variables?

A categorical variable takes on values that are category names or group labels

A quantitative variable is one that takes on numerical values for a measured or counted quantity

New cards

What is the difference between discrete and continuous variables?

A discrete variable can take on a countable number of values. The number of values may be finite or infinite. (THINK: Discrete = countable, ex: # of ppl)

A continuous variable can take on infinitely many values, but those values cannot be counted (THINK: Continuous = Must be measured, ex: height)