individuals/cases (ROWS)
the objects described by a set of data; may be people, animals, things
variable (COLUMN)
any characteristic of an individual; can take different values for different individuals
categorical variable
places an individual into one of several groups or categories
quantitative variable
takes numerical values for which it makes sense to find an average
distribution (of a variable)
tells us what values the variable takes and how often it takes these values
frequency
count
relative frequency
proportion or percent (count divided by the total)
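A minimal sketch of frequency vs. relative frequency, assuming a made-up list of favorite colors (the data and variable names are hypothetical, only for illustration):

```python
from collections import Counter

# Hypothetical data: one categorical value per individual.
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

# Frequency: the count of individuals in each category.
frequency = Counter(colors)

# Relative frequency: each count as a percent of the total.
total = sum(frequency.values())
relative_frequency = {c: 100 * n / total for c, n in frequency.items()}

print(dict(frequency))
print(relative_frequency)
```

The relative frequencies of all categories should add up to 100 percent (apart from rounding).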
pie chart (categorical)
emphasizes each category’s relation to the whole
bar graph (categorical)
can compare any set of quantities measured in the same units
two lessons abt bar graphs:
1) Don't use pictures; ensure bars are equally wide
2) Ensure vertical scales start at 0
two-way tables
describes two categorical variables (one variable's values label the rows, the other's label the columns)
marginal distribution (of categorical variable)
the distribution of values of that variable among all individuals described in the table
percents more helpful
don’t tell us anything abt relationships b/w two variables
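A rough sketch of marginal distributions, assuming a made-up two-way table stored as a nested dict (the counts, the "yes"/"no" responses, and the "female"/"male" groups are all hypothetical):

```python
# Hypothetical two-way table of counts: table[response][group].
table = {
    "yes": {"female": 30, "male": 24},
    "no":  {"female": 10, "male": 36},
}

total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of the row variable: row totals as percents.
row_marginal = {r: 100 * sum(cols.values()) / total for r, cols in table.items()}

# Marginal distribution of the column variable: column totals as percents.
groups = list(next(iter(table.values())).keys())
col_marginal = {g: 100 * sum(table[r][g] for r in table) / total for g in groups}

print(row_marginal)
print(col_marginal)
```

Each marginal distribution uses only the totals for one variable, which is why it says nothing about how the two variables relate.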
conditional distribution (of a categorical variable)
describes the values of that variable among individuals who have a specific value of another variable; there is a separate conditional distribution for each value of the other variable, so a two-way table has two sets of conditional distributions (one set per variable)
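A minimal sketch of conditional distributions, assuming a made-up two-way table of counts (the table, its labels, and the helper name `conditional_distribution` are hypothetical):

```python
# Hypothetical two-way table of counts: table[response][group].
table = {
    "yes": {"female": 30, "male": 24},
    "no":  {"female": 10, "male": 36},
}

def conditional_distribution(table, group):
    """Percent of each response among individuals in one group (one column)."""
    group_total = sum(table[r][group] for r in table)
    return {r: 100 * table[r][group] / group_total for r in table}

# One conditional distribution for each value of the other variable.
for group in ("female", "male"):
    print(group, conditional_distribution(table, group))
```

The key step is dividing by the group total rather than the table total; dividing by the table total would give joint relative frequencies instead.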
segmented bar graph
displays the distribution of a categorical variable as segments of a rectangle, with the area of each segment proportional to the percent of individuals in the corresponding category
joint distribution/frequency
how many times a combination of two conditions happens together
association
exists if knowing the value of one variable helps predict the value of the other
no association
conditional distributions would look the same; segmented bar graphs would look the same.
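A rough sketch of checking for association by comparing conditional distributions, assuming two made-up tables of counts. The function names and the `tol` parameter are hypothetical, and this is only a descriptive comparison: real data rarely produce exactly identical distributions, so in practice "look the same" means "roughly the same".

```python
import math

def conditional_distributions(table):
    """Map each column value to the proportions of the row variable within it."""
    groups = list(next(iter(table.values())).keys())
    result = {}
    for g in groups:
        g_total = sum(table[r][g] for r in table)
        result[g] = {r: table[r][g] / g_total for r in table}
    return result

def looks_associated(table, tol=1e-9):
    """True if any group's conditional distribution differs from the first group's."""
    dists = list(conditional_distributions(table).values())
    first = dists[0]
    return any(
        not math.isclose(d[r], first[r], abs_tol=tol)
        for d in dists[1:]
        for r in first
    )

# Hypothetical counts: table[response][group].
associated = {"yes": {"female": 30, "male": 24}, "no": {"female": 10, "male": 36}}
not_associated = {"yes": {"female": 30, "male": 45}, "no": {"female": 10, "male": 15}}

print(looks_associated(associated))      # conditional distributions differ
print(looks_associated(not_associated))  # conditional distributions match
```

In the second table, 75% of each group answered "yes", so knowing the group does not help predict the response.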
mosaic plot
a modified segmented bar graph in which the width of each rectangle is proportional to the number of individuals in the corresponding category
side-by-side bar graph
displays the distribution of a categorical variable for each value of another categorical variable. The bars are grouped together based on the values of one of the categorical variables and placed side by side