explanatory variable
(x) a variable used to explain or predict changes in the response variable
response variable
(y) a variable that measures an outcome or result of a study
Data
any collection of numbers, characters, images or other items that provide information about something.
respondents
individuals who answer a survey
subjects
the people or animals participating in a research project
experimental units
animals, plants, websites, or inanimate objects on which an experiment is performed
records
the rows of a data table
Example: purchase records
who
the individuals or cases the data describe (for example, respondents, subjects, or experimental units)
how
How data was collected
Variable
a characteristic or attribute that can assume different values
categorical variable
a variable that names categories (whether with words or numerals)
Example of a categorical variable recorded as numerals: area code
nominal variables
A variable whose values are used only to name categories.
Nominal variables are categorical variables because they only name categories.
quantitative variable
a characteristic that can be measured numerically
identifier
an attribute whose value is associated with one and only one entity instance
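Example: student ID number
(By contrast, Alive or Dead names a category shared by many cases, so it is a categorical variable, not an identifier.)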
ordinal variable
Variables that report order without natural units
model for data
Models are summaries and simplifications of data that help us understand.
what
the variables recorded about each case
why
the reason the data were collected; knowing it helps you decide how to treat the variables
area principle
the area occupied by a part of the graph should correspond to the magnitude of the value it represents
frequency table
A table that lists each category and the number of observations (the count) that fall in it.
relative frequency table
Shows the percents or proportions (relative frequencies) of observations in each category or class.
The percentages must add up to 100%.
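The two table definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method; the function name `relative_frequency_table` and the ticket-class data are made up for the example.

```python
from collections import Counter

def relative_frequency_table(values):
    """Return {category: (count, percent)} for a list of categorical values."""
    counts = Counter(values)            # frequency table: category -> count
    total = sum(counts.values())        # total number of observations
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

# Hypothetical data: ticket class of eight passengers
table = relative_frequency_table(
    ["first", "second", "third", "third", "second", "third", "first", "third"]
)
for cat, (n, pct) in table.items():
    print(f"{cat:>6}  count={n}  percent={pct:.1f}%")
```

The counts column is the frequency table; the percent column is the relative frequency table, and its entries sum to 100.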
bar chart
displays the count of each category as a bar; it gives an accurate visual impression of the distribution because it obeys the area principle.
pie chart
A circle divided into sectors, each showing a category's percentage of the whole.
ring chart
also called a donut chart; a modified form of the pie chart that displays only the "crust" of the pie.
Histogram
A graph of vertical bars representing the frequency distribution of a set of data.
stem and leaf display
like a histogram, it shows the distribution of a quantitative variable, but it also shows the individual values.
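As a rough sketch of how a stem and leaf display keeps the individual values, the snippet below splits each number into a stem (tens and above) and a leaf (ones digit). It assumes non-negative whole numbers; the function name `stem_and_leaf` and the scores are invented for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):            # sorting keeps leaves in order
        stems[v // 10].append(v % 10)
    return dict(stems)

scores = [62, 75, 71, 88, 93, 79, 85, 70, 68]   # made-up data
display = stem_and_leaf(scores)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Each printed row is one "bar" of the histogram, and the leaves let you read back every original value.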
Dotplot
A simple graph that shows each data value as a dot above its location on a number line.
Density Plots
are like histograms, but they smooth out the bins, which reduces the effect of the bin choice on the apparent shape of the distribution.
Shape
a summary of the distribution in terms of three attributes:
1. how many modes it has
2. whether it is symmetric or skewed
3. whether it has any extraordinary cases or outliers
modes
the hump or humps (peaks) in a histogram.
The mode is sometimes defined as the single value that appears most often.
unimodal
a histogram with one peak
bimodal
a histogram with two peaks
multimodal
a distribution with three or more peaks in the histogram
uniform
The same all the way through; consistent
a histogram that doesn't appear to have any mode and in which all the bars are approximately the same height.
symmetric
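Being equal or the same in size, shape, and relative position; a distribution is symmetric when the halves on either side of the center look approximately like mirror images.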
tails
thinner ends of a distribution
skewed distribution
A distribution that is not symmetric (it appears to favor one side over the other);
one tail stretches out farther than the other.
skewed right distribution
The peak of the data is to the left side of the graph. There are only a few data points to the right side of the graph.
the distribution tail is stretched out to the right
Skewed Left Distribution
The peak of the data is to the right side of the graph. There are only a few data points to the left side of the graph.
the distribution tail is stretched out to the left.
Outliers
extreme values that don't appear to belong with the rest of the data
Median
the middle score in a distribution; half the scores are above it and half are below it
the middle value of the data once they are sorted in order.
If n is odd, the median is the middle value.
If n is even, the median is the average of the two middle values.
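The odd and even cases above translate directly into code. A minimal sketch (the function name `median` here is our own; Python's standard library also offers `statistics.median`):

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)            # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                      # odd n: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the two middle values

print(median([7, 1, 3]))       # odd n
print(median([4, 1, 3, 2]))    # even n
```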
mean
the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores
the point where the histogram is balanced.
range
the difference between the highest and lowest scores in a distribution
a simple measure of spread.
Range = max - min
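A small worked example of the mean and range definitions above, using made-up scores. The helper names `mean` and `value_range` are illustrative only.

```python
def mean(values):
    """Arithmetic average: add the scores, then divide by the number of scores."""
    return sum(values) / len(values)

def value_range(values):
    """Range = max - min."""
    return max(values) - min(values)

scores = [2, 4, 4, 5, 10]          # hypothetical data
print(mean(scores))                # 25 / 5
print(value_range(scores))         # 10 - 2
```

Note that the single high score of 10 pulls the mean (5.0) above the median (4), and it alone determines the upper end of the range.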
IQR (interquartile range)
measure of statistical dispersion, being equal to the difference between the upper and lower quartiles, IQR = Q3 − Q1.
Almost always a reasonable summary of spread.
Lower Quartile (Q1)
First Quartile
25th percentile
the median of the lower half of the data set
the value that is just above the lower 25% of the data values.
Upper Quartile (Q3)
3rd quartile
75th percentile
the median of the upper half of the data set
the value that is just above the lower 75% of the data values.
Median (50th percentile)
is the 50th percentile or second quartile.
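The quartile cards above describe Q1 and Q3 as medians of the lower and upper halves. The sketch below follows that description. One assumption to flag: when n is odd, this version leaves the overall median out of both halves; other textbooks and software use slightly different conventions, so results can differ a little. The function name `quartiles` is our own.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # lower half of the data
    # Assumption: for odd n, skip the middle value when forming the upper half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([1, 2, 3, 4, 5, 6, 7, 8])   # made-up data
iqr = q3 - q1                                       # IQR = Q3 - Q1
print(q1, q2, q3, iqr)
```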
standard deviation
takes into account how far each value is from the mean.
Take the square root of the variance to get the standard deviation, denoted s.
least-squares property
sum of the squares of the residuals is the smallest sum possible
Residuals
the differences of the values from the mean.
variance
standard deviation squared
when you add up the squared residuals and divide by n-1, the result is the variance, denoted s^2 (s squared)
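The variance and standard deviation recipe above can be checked with a short sketch. The function names are illustrative; Python's `statistics.variance` and `statistics.stdev` compute the same n-1 versions.

```python
import math

def sample_variance(values):
    """Sum of squared residuals (differences from the mean) divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_residuals = [(x - mean) ** 2 for x in values]
    return sum(squared_residuals) / (n - 1)

def sample_std(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))

data = [1, 2, 3, 4, 5]             # made-up data; mean is 3
print(sample_variance(data))       # (4 + 1 + 0 + 1 + 4) / 4
print(sample_std(data))
```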
Conditional distribution
Questions that restrict our attention to just one condition of a variable and ask about the distribution of another variable are asking about the conditional distribution.
A conditional distribution shows the distribution of one variable for only the cases that satisfy a condition on another variable.
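To make the idea concrete, here is a minimal sketch that keeps only the cases satisfying a condition and then tabulates the other variable. The records, field names, and the function `conditional_distribution` are all hypothetical.

```python
from collections import Counter

def conditional_distribution(records, given, value, target):
    """Proportions of `target` among the records where record[given] == value."""
    subset = [r[target] for r in records if r[given] == value]   # apply the condition
    counts = Counter(subset)
    total = len(subset)             # assumes at least one record meets the condition
    return {cat: n / total for cat, n in counts.items()}

# Made-up records: ticket class and survival for seven passengers
records = [
    {"class": "first", "survived": "yes"},
    {"class": "first", "survived": "yes"},
    {"class": "first", "survived": "no"},
    {"class": "third", "survived": "yes"},
    {"class": "third", "survived": "no"},
    {"class": "third", "survived": "no"},
    {"class": "third", "survived": "no"},
]
dist = conditional_distribution(records, "class", "third", "survived")
print(dist)
```

Here the condition is "class is third", and the result is the distribution of survival for those cases only.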
Independent variables
in a contingency table, when the distribution of one variable is the same for all categories of another, the variables are independent. (there's no association between these variables.)
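One informal way to apply this definition is to turn each row of a contingency table into proportions and see whether the rows match. The sketch below does that. The tables, the function names, and especially the tolerance `tol` are assumptions for illustration; real data rarely match exactly, and how close counts as "the same" is a judgment call.

```python
def row_distributions(table):
    """Convert each row of counts into proportions (one conditional distribution per row)."""
    return {
        row: {col: n / sum(cols.values()) for col, n in cols.items()}
        for row, cols in table.items()
    }

def looks_independent(table, tol=0.05):
    """True if every row's distribution is within `tol` of the first row's.

    Assumes every row has the same column labels; `tol` is an arbitrary cutoff.
    """
    dists = list(row_distributions(table).values())
    first = dists[0]
    return all(abs(d[col] - first[col]) <= tol for d in dists for col in first)

# Made-up contingency tables: group (rows) by answer (columns)
same = {"men": {"yes": 30, "no": 70}, "women": {"yes": 60, "no": 140}}
different = {"men": {"yes": 30, "no": 70}, "women": {"yes": 70, "no": 30}}
print(looks_independent(same))
print(looks_independent(different))
```

In the first table both groups answer "yes" 30% of the time, so the distribution of answers is the same for every category; in the second it clearly is not.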