Individuals
Objects described by a set of data; may be people, animals, or things
Variables
Any characteristic of an individual that we measure
Data Analysis
The process of organizing, displaying, and asking questions about data
Categorical Variable
Places an individual into one of several groups or categories
Quantitative Variable
Takes a numerical value for which it makes sense to find the average
Distribution
Tells us the values a variable takes and how often it takes those values
Inference
Drawing conclusions that go beyond the data at hand
Frequency Table
Displays the count of individuals in each category
Relative Frequency Table
Displays the percent of individuals in each category
Roundoff Error
When the percents in a table don't add up to exactly 100 because each value was rounded
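A minimal sketch of the three cards above, assuming a made-up list of survey responses (the data and variable names are hypothetical). It builds a frequency table, a relative frequency table with percents rounded to whole numbers, and shows how rounding can make the total miss 100.

```python
from collections import Counter

# Hypothetical data: favorite subject of 7 students (made up for illustration).
responses = ["Math", "Art", "Math", "Science", "Art", "Math", "Science"]

# Frequency table: count of individuals in each category.
counts = Counter(responses)

# Relative frequency table: percent of individuals in each category,
# rounded to the nearest whole percent.
n = len(responses)
percents = {category: round(100 * count / n) for category, count in counts.items()}

# Roundoff error: the rounded percents need not add up to exactly 100.
total_percent = sum(percents.values())

print(dict(counts))
print(percents)
print(total_percent)
```

With these values the rounded percents are 43, 29, and 29, which add to 101 rather than 100.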
What graphs are used to display categorical variables?
Pie Chart and Bar Graph
Pie Chart
Shows the distribution of a categorical variable as slices of a pie
Must include all the categories that make up a whole; emphasizes each category's relation to the whole
Bar Graph
Represents each category as a bar
Bar heights show the category counts or percents
Bar graphs are easier to make and read than pie charts
Why can graphs be problematic?
You have to watch the scales (area/width) and beware of pictographs
Two-way table
Describes 2 categorical variables
Marginal Distributions
Appear in the right and bottom margins of the two-way table
The distribution of values of one variable among all individuals described in the table
What is the downside of a marginal distribution?
It tells us nothing about the relationship between the two variables
Conditional Distribution
Describes the values of one variable among individuals who have a specific value of another variable
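A rough sketch of marginal and conditional distributions for a two-way table, assuming made-up counts (the table, the labels, and the function names `marginal_distribution` and `conditional_distribution` are hypothetical, chosen only for illustration).

```python
# Hypothetical two-way table (counts are made up for illustration).
# Rows: opinion. Columns: gender.
table = {
    "Yes": {"Female": 30, "Male": 20},
    "No": {"Female": 10, "Male": 40},
}


def marginal_distribution(table):
    """Percent of all individuals falling in each row category."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {label: 100 * sum(row.values()) / grand_total
            for label, row in table.items()}


def conditional_distribution(table, column):
    """Percent in each row category among individuals in one column only."""
    column_total = sum(row[column] for row in table.values())
    return {label: 100 * row[column] / column_total
            for label, row in table.items()}


marginal = marginal_distribution(table)
given_female = conditional_distribution(table, "Female")
given_male = conditional_distribution(table, "Male")

print(marginal)
print(given_female)
print(given_male)
```

In this made-up table the marginal distribution is 50% Yes and 50% No, which says nothing about gender. The conditional distributions differ (75% Yes among females, about 33% Yes among males), so knowing gender helps predict opinion.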
Segmented Bar Graph
Problematic because it is difficult to compare the percents of males and females in each category: the middle segments in the two bars start at different locations on the vertical axis
Association
There's an association between two variables if knowing the value of one variable helps to predict the value of the other
How to describe a dot plot distribution (Shape)
Peaks, Gaps, clusters
Center
Midpoint (signifies what is typical: in a typical (scenario) we will do (blank))
Spread
Range
Outliers
Any observation that stands out from the overall pattern of the distribution
Roughly Symmetric
When the right and left sides are approximately mirror images
Skewed to the right
Right side of the graph (the larger values) is much longer than the left side
Most of the data are low values, and a few high values stretch the tail to the right
Skewed to the left
Left side of the graph (the smaller values) is much longer than the right side
Most of the data are high values, and a few low values stretch the tail to the left
What measurements are typically right-skewed?
House prices and salaries
Splitting Stemplot
Leaves 0-4 are placed on one stem and leaves 5-9 are placed on the next
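A small sketch of splitting stems, assuming non-negative two-digit whole numbers (the data and the function name `split_stemplot` are hypothetical). Each stem gets two lines: the first holds leaves 0-4, the second holds leaves 5-9.

```python
from collections import defaultdict


def split_stemplot(values):
    """Return the lines of a stemplot with split stems.

    Assumes non-negative whole numbers; stem = tens digit(s), leaf = ones digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1  # leaves 0-4 on the first line, 5-9 on the second
        rows[(stem, half)].append(leaf)

    lines = []
    # Keep every stem between the smallest and largest, so gaps stay visible.
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for half in (0, 1):
            leaves = "".join(str(d) for d in rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical data (made up for illustration).
data = [12, 14, 15, 19, 21, 23, 27, 28, 28, 33]
for line in split_stemplot(data):
    print(line)
```

Every split stem covers the same number of possible leaf digits (five), which keeps the shape of the plot honest.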
T or F: Stemplots don't work well for large data sets
True, especially where each stem would hold a large number of leaves
T or F: Too few or too many stems doesn't make a difference in how a stemplot's distribution looks
False
T or F: When you split stems, each stem must have an equal number of possible leaf digits
True
Histogram
Groups nearby values of a quantitative variable into classes of equal width; bar heights show the count or percent of observations in each class
When do you use percents to compare distributions with different numbers of observations?
In Histograms!
Mean
Add up all the values and divide by the number of observations, n
Is mean a resistant measure of center?
NO! SENSITIVE to the influence of extreme observations
What does the mean tell us?
Tells us how large each data value would be if the total was split equally among all observations
Median
Midpoint of the distribution; the number such that half the observations are smaller and half are larger
What do the mean and median look like in a roughly symmetric distribution?
Mean and median are close to each other
What do the mean and median look like in an exactly symmetric distribution?
Mean and median are the same
What do the mean and median look like in a skewed distribution?
Mean is farther out in the long tail than the median
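A quick sketch of why the mean is not resistant but the median is, assuming a made-up list of salaries in thousands of dollars (the data are hypothetical). Adding one extreme value drags the mean toward the long tail while the median barely moves.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars (made up for illustration).
salaries = [40, 45, 50, 55, 60]

# Same data with one extreme high value added (a right-skewed situation).
with_outlier = salaries + [500]

mean_before, median_before = mean(salaries), median(salaries)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```

Here the mean jumps from 50 to 125, while the median only moves from 50 to 52.5.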
Range
Simplest measure of variability: the largest observation minus the smallest
Q1
Median of observations that are to the left of the median in the ordered list
Q3
Median of observations that are to the right of the median in the ordered list
IQR
Range of middle 50% of data
Q3-Q1
Are quartiles and IQR resistant measures?
Yes! They are not affected by extreme observations
1.5 × IQR Rule
We consider an observation an outlier when it falls more than 1.5 × IQR above the third quartile or below the first quartile
Five-Number Summary
Consists of the smallest observation, Q1, median, Q3, and largest observation, written in order from smallest to largest
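A minimal sketch of the quartile, IQR, outlier, and five-number summary cards above, assuming the median-of-each-half definition of Q1 and Q3 used in these notes (software packages may use slightly different quartile rules). The data and the function names `five_number_summary` and `outliers` are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two observations."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # observations to the left of the median
    upper = data[half + (n % 2):]  # observations to the right of the median
    return data[0], median(lower), median(data), median(upper), data[-1]


def outliers(values):
    """Observations more than 1.5 x IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


# Hypothetical data (made up for illustration).
data = [5, 7, 10, 12, 13, 15, 18, 20, 50]
print(five_number_summary(data))
print(outliers(data))
```

For this list Q1 = 8.5 and Q3 = 19, so IQR = 10.5 and the upper cutoff is 19 + 15.75 = 34.75. The value 50 is flagged as an outlier.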
Boxplot
Central box is between Q1 and Q3
Lines extend from the box to the smallest and largest observations that are not outliers
Median is marked by a line
Outliers are marked with an asterisk
Variance
Measures spread by looking at how far the observations are from their mean; roughly the average of the squared deviations
Standard Deviation
Measures typical distance of the values in a distribution from the mean
T or F: Sx measures spread about the mean and should not be used when the mean is chosen as the measure of center
False. Sx measures spread about the mean and SHOULD be used only when the mean is chosen as the measure of center
T or F: Sx is always greater than or equal to 0
True. Sx = 0 means there's no variability
As the observations become more spread out about the mean, Sx gets larger
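A short sketch of the variance and standard deviation cards, assuming the sample versions that divide by n - 1 (the data are made up). It computes both by hand and checks them against Python's `statistics` module.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical data (made up for illustration).
data = [2, 4, 4, 6, 9]

x_bar = mean(data)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
var_by_hand = sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

# Standard deviation: square root of the variance.
sd_by_hand = sqrt(var_by_hand)

print(var_by_hand, variance(data))
print(sd_by_hand, stdev(data))

# No variability at all gives a standard deviation of 0.
print(stdev([3, 3, 3]))
```

The deviations here are -3, -1, -1, 1, and 4, so the squared deviations add to 28 and the variance is 28 / 4 = 7.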
Is Sx Resistant?
NO!
What measures of center and spread should we use for skewed distributions or distributions with strong outliers?
Median and IQR, because both are resistant