Individual cases
The rows of a data table correspond to individual cases about whom (or about which, if they are not people) we record some characteristics
Respondents
Individuals who answer a survey
Subjects or participants
People on whom we experiment
Experimental units
Animals, plants, websites, inanimate subjects
Records (in a database)
Rows
Generic term for records
Cases (in any event, the rows represent the Who of the data)
Variables
The characteristics recorded about each individual
Often, the cases are a _________ of cases selected from some larger population that we’d like to understand.
sample
Examples of identifier variables
Social Security numbers, student ID numbers, or Amazon Standard Identification Numbers (ASINs)
Categorical variable
A variable that tells what group or category each individual belongs to
Quantitative variable
A variable that contains measured numerical values
Quantitative variables typically have _____
units
Area Principle
Says that the area occupied by a part of a graph should be proportional to the magnitude of the value it represents
Categorical variables are easy to summarize in a ________________ that lists the categories and how many cases belong to each one.
Frequency Table
Relative Frequency Table
Displays percentages (or proportions) rather than the counts in each category
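Worked example (a minimal sketch): the two cards above can be illustrated by counting categories and then dividing by the number of cases. The ticket_class data and all variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data: one category per case.
ticket_class = ["First", "Second", "Third", "Third", "Crew", "Third", "Crew", "First"]

# Frequency table: category -> number of cases in that category.
frequency = Counter(ticket_class)

# Relative frequency table: category -> proportion of all cases.
n = len(ticket_class)
relative_frequency = {category: count / n for category, count in frequency.items()}

for category, count in frequency.items():
    print(f"{category:<8}{count:>3}{relative_frequency[category]:>8.1%}")
```

The proportions in a relative frequency table should add up to 1 (or 100%), which is a quick check that no case was counted twice or dropped.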
Distribution
Shows how the cases are distributed among the categories
Pie Charts
Show the whole group of cases as a circle. They slice the circle into pieces whose sizes are proportional to the fraction of the whole in each category
Categorical Data Condition
The data are counts or percentages of individuals in non-overlapping categories
When presented like this, in the margins of a contingency table, the frequency distribution of one of the variables is called its _______________
Marginal distribution
Conditional distribution
Shows the distribution of one variable for a subgroup of individuals that satisfies a condition on the other variable
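Worked example (a sketch under assumed data): marginal and conditional distributions can both be read off a contingency table. The counts and names below are hypothetical.

```python
# Hypothetical contingency table: rows are ticket class, columns are outcome.
table = {
    "First": {"Survived": 20, "Died": 10},
    "Third": {"Survived": 15, "Died": 45},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of class: each row total as a share of the grand total.
marginal_class = {cls: sum(row.values()) / grand_total for cls, row in table.items()}

# Conditional distribution of outcome, given that class is "Third":
# each cell in that row as a share of the row total.
third = table["Third"]
third_total = sum(third.values())
conditional_given_third = {outcome: count / third_total for outcome, count in third.items()}

print(marginal_class)
print(conditional_given_third)
```

The marginal distribution divides by the grand total, while the conditional distribution divides by the total of the subgroup that satisfies the condition.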
Segmented bar chart
Bars stacked on top of each other
Independent variables
Variables in a contingency table for which the distribution of one variable is the same for all categories of the other variable
Side-by-side bar chart
A graph that weaves together two or more conditional distributions
Simpson’s paradox
A warning to be careful when you average across different levels of a second variable. It is always better to compare percentages or other averages within each level of the other variable, because the overall average may be misleading
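Worked example (a sketch with invented counts): the numbers below are hypothetical and chosen so that person A has the higher success rate within each level, yet person B has the higher overall rate.

```python
# Hypothetical (successes, attempts) for two people at each level of a second variable.
results = {
    "A": {"easy": (90, 100), "hard": (500, 1000)},
    "B": {"easy": (800, 1000), "hard": (40, 100)},
}

# Success rate within each level.
within = {
    person: {level: s / a for level, (s, a) in levels.items()}
    for person, levels in results.items()
}

# Overall success rate, pooling the levels together.
overall = {
    person: sum(s for s, _ in levels.values()) / sum(a for _, a in levels.values())
    for person, levels in results.items()
}

print(within)
print(overall)
```

The reversal happens because A made most attempts at the hard level and B at the easy level, so the pooled rates mostly reflect the mix of levels rather than performance within a level.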
Gaps in a histogram
Actual gaps in the data, indicating an interval where there are no values
Relative frequency histogram
A histogram that replaces the counts on the vertical axis with the percentage of the total number of cases falling in each bin
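Worked example (a rough sketch): bin counts can be converted into the percentages a relative frequency histogram would show. The data and the bin width of 1 are assumptions for illustration only.

```python
# Hypothetical quantitative data (units assumed known, e.g. hours).
values = [1.2, 2.5, 2.7, 3.1, 3.3, 3.8, 4.4, 5.9]
bin_width = 1  # assumed bin width

# Count how many values fall in each bin, keyed by the bin's left edge.
counts = {}
for v in values:
    left_edge = int(v // bin_width) * bin_width
    counts[left_edge] = counts.get(left_edge, 0) + 1

# Replace each count with the percentage of all cases in that bin.
total = len(values)
relative = {edge: 100 * c / total for edge, c in sorted(counts.items())}

for edge, pct in relative.items():
    print(f"[{edge}, {edge + bin_width}): {pct:.1f}%")
```

The shape of the histogram is unchanged by this step; only the labels on the vertical axis change from counts to percentages.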
What is a dotplot?
A simple display that places a dot along an axis for each case in the data
Cumulative distribution plot, also known as an ogive
Quantitative Data Condition
The data are values of a quantitative variable whose units are known
When you describe a distribution, you should always mention three things
Shape, spread, and center
Does the histogram have a single, central hump of data or several separated humps?
Modes
Unimodal
A histogram with one peak
Bimodal
A histogram with two peaks
Uniform
A histogram that doesn’t appear to have any obvious mode and in which all the bars are approximately the same height
How to tell whether a histogram is symmetric
Fold it along a vertical line through the middle and see whether the edges match pretty closely or whether more of the values fall on one side
Tails within a distribution
Thinner ends of a distribution
Skewed in a distribution
If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail
Outliers in a distribution
Values that stand away from the body of the distribution
Median
The middle value that divides a histogram into two equal areas
Range
The difference between the maximum and minimum value
Range Formula
Range = max - min
5-number summary
Reports a distribution's median, quartiles, and extremes (maximum and minimum)
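Worked example (a sketch, not the only convention): the range and the 5-number summary can be computed with Python's standard statistics module. The data are made up. Quartile rules differ between textbooks and software, so the "inclusive" method used here is an assumption and other methods can give slightly different quartiles.

```python
import statistics

# Hypothetical quantitative data.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]

# Quartiles via one common convention (assumption: the "inclusive" method).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# 5-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(data), q1, median, q3, max(data))

# Range = max - min.
data_range = max(data) - min(data)

print(five_number)
print(data_range)
```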
Upper fence
Upper fence = Q3 + 1.5 × IQR
Lower fence
Lower fence = Q1 - 1.5 × IQR
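Worked example (a sketch under stated assumptions): with IQR = Q3 - Q1, the two fences can be computed and used to flag values that lie beyond them. The data are invented, and the "inclusive" quartile method is an assumption, since quartile conventions vary.

```python
import statistics

# Hypothetical data with one unusually large value.
data = [1, 3, 4, 6, 7, 8, 10, 12, 40]

# Quartiles (assumption: the "inclusive" method; other conventions exist).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are candidates for outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence, outliers)
```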
What does Σ mean (the capital Greek letter sigma)?
Sum
(sigma is “_” in Greek)
S
Sample Mean Formula
x̄ = ( Σ xi ) ÷ n
What x̄ means in the sample mean formula
The average value of the sample, that is, the sample mean
What does σ mean (the lowercase Greek letter sigma)?
Standard deviation
What xi means in the sample mean formula
Each individual value of x in the sample
What n means in the sample mean formula
The number of values in the sample
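Worked example (a minimal sketch): the sample mean formula x̄ = ( Σ xi ) ÷ n translates directly into a sum divided by a count. The sample values are hypothetical.

```python
# Hypothetical sample values (the xi in the formula).
x = [2.0, 4.0, 4.0, 5.0, 10.0]

n = len(x)           # n: number of values in the sample
total = sum(x)       # Σ xi: add up every sample value
x_bar = total / n    # x̄: the sample mean

print(x_bar)
```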
What is the sample mean formula?