Vocab Chapter 1: Exploring Data

Last updated 3:17 AM on 9/2/24

46 Terms

individuals

objects described by a set of data (e.g. people, animals, things)

variable

any characteristic of an individual; can take different values for different individuals

categorical/qualitative variable

records which of several categories/groups an individual belongs to; arithmetic is not meaningful

quantitative variable

takes numerical values for which it makes sense to do arithmetic (e.g. adding, averaging)

distribution

pattern of variation of a variable; records values the variable takes and how often it takes them; presentation of data

range

a measure of spread; high value - low value; gives the width of the interval covered by the scores

spread

describes how much the data vary (how spread out the values are) in a distribution; measured by range, IQR, standard deviation, variance, and/or M.A.D.

frequency

how many times the value of a variable occurs
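
A minimal sketch of tallying frequencies in Python. The sibling counts are made-up sample data, not part of the card set.

```python
from collections import Counter

# Hypothetical sample: number of siblings reported by 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]

# Counter maps each distinct value to how many times it occurs (its frequency).
frequency = Counter(siblings)

for value in sorted(frequency):
    print(value, frequency[value])
# 0 2
# 1 3
# 2 4
# 3 1
```

The list of values together with their frequencies is exactly what a distribution records.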

outlier

an individual observation that falls outside the overall pattern of the graph; determined by eye or using the 1.5 IQR rule: if it’s less than Q₁ - 1.5*IQR or greater than Q₃ + 1.5*IQR, it’s an outlier
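
The 1.5 IQR rule can be sketched as a small check. The function names and the quartile values (Q₁ = 12.5, Q₃ = 17) are hypothetical, chosen only for illustration.

```python
def outlier_fences(q1, q3):
    """Return the (low, high) cutoffs from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x lies below the low fence or above the high fence."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high

# Hypothetical quartiles: Q1 = 12.5, Q3 = 17, so IQR = 4.5.
print(outlier_fences(12.5, 17))  # (5.75, 23.75)
print(is_outlier(40, 12.5, 17))  # True
print(is_outlier(18, 12.5, 17))  # False
```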

center

where the graph is centered; measured by mean, median, and/or mode

shape

the shape observations form in the distribution; described as skewed left/right or symmetric

skewed left

the left side (lower half of the distribution) extends much farther out than the right; the left side is the “tail”

skewed right

the right side (upper half of the distribution) extends much farther out than the left; the right side is the “tail”

symmetric

the right and left sides of the distribution are approximately mirror images of each other

dot plot

graph of data set using dots for each observation

histogram

graph with bars showing frequency of different values of one variable (not categories!); most common for quantitative variables, can use to group nearby values if too many values for a dot plot

stemplot

graph for a small data set that keeps the actual data values visible; stems are all but the rightmost digit of each observation, leaves are the final digits written in increasing order out from the stem (remember to include a key!)
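
A rough sketch of building a text stemplot, assuming non-negative whole numbers and the usual convention that leaves are written in increasing order out from the stem. The function name and the data are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a text stemplot for non-negative whole numbers."""
    stems = defaultdict(list)
    for v in sorted(values):
        # Stem = all but the last digit, leaf = the last digit.
        stems[v // 10].append(v % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems still appear.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

for line in stemplot([23, 25, 31, 31, 38, 52]):
    print(line)
print("Key: 2 | 3 = 23")
# 2 | 35
# 3 | 118
# 4 |
# 5 | 2
# Key: 2 | 3 = 23
```

Keeping the empty stem (4) preserves the gap in the data, which matters when judging shape.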

split stems

each stem appears twice; do if all leaves would fall on just a few stems

back-to-back stemplot

stemplot with leaves on the right and left; use to compare two distributions (don’t forget a key with both distributions!)

time plot

graph plotting each observation against the time at which it was measured; use to show change over time

mean

most common measure of center; (∑x)/n; x̄ for sample mean and μ for population mean

∑

sigma; symbol meaning “sum of”

x̄ (x bar)

sample mean equal to (∑x)/n

nonresistant

sensitive to the influence of extreme observations; because the mean is nonresistant, it is pulled toward the tail; the standard deviation is also nonresistant and is inflated by extreme observations

median

the middle value; M = med = x̃; with the observations in order, M is the ((n+1)/2)th value (the middle value) when n is odd and the mean of the middle two values when n is even
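
A minimal sketch of finding the median by hand, assuming the usual convention: the middle value of the ordered data when n is odd, the mean of the two middle values when n is even. The function name `median_of` is hypothetical.

```python
def median_of(values):
    """Median of a non-empty list of numbers."""
    xs = sorted(values)          # the observations must be in order first
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]           # the ((n + 1) / 2)th value, counting from 1
    return (xs[mid - 1] + xs[mid]) / 2   # mean of the two middle values

print(median_of([5, 1, 3]))      # 3
print(median_of([1, 2, 3, 4]))   # 2.5
```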

resistant

not sensitive to the influence of extreme observations (e.g. median)
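
To see the difference between resistant and nonresistant measures, the sketch below changes one value in a made-up data set into an extreme observation and compares the mean and median before and after.

```python
from statistics import mean, median

# Hypothetical data; the second list replaces 6 with an extreme value, 60.
original = [2, 3, 4, 5, 6]
with_extreme = [2, 3, 4, 5, 60]

print(mean(original), median(original))          # 4 4
print(mean(with_extreme), median(with_extreme))  # 14.8 4
```

The mean jumps toward the extreme value while the median stays put.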

quartiles

spread; Q₁ and Q₃ mark the boundaries of the middle half of the data

Q₁

median of the observations below M; about ¼ of the ordered observations fall at or below it (25th percentile)

Q₃

median of the observations above M; about ¾ of the ordered observations fall at or below it (75th percentile)

IQR

IQR = interquartile range = Q₃ - Q₁ ; spread of the middle half of the data and used to test outliers

five-number summary

minimum, Q₁, median, Q₃, and maximum; used to describe center and spread of data and to construct box plots
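
A sketch of computing the five-number summary, assuming the "median of each half" definition of the quartiles used in this set (Q₁ is the median of the observations below M, Q₃ the median of those above). Calculators and software may use slightly different quartile conventions, so results can differ. The function name and scores are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # observations below the median's position
    upper = xs[half + n % 2:]     # observations above it (skips M when n is odd)
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

scores = [10, 12, 13, 14, 15, 16, 18, 40]
print(five_number_summary(scores))  # (10, 12.5, 14.5, 17.0, 40)
```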

minimum

smallest observation (may or may not include outliers)

maximum

largest observation (may or may not include outliers)

boxplot

graph of the five-number summary; box with lines marking the quartiles and median, with “whiskers” extending from the quartiles to the min and max; useful for side-by-side comparison of distributions

modified boxplot

same as a regular boxplot, but outliers are marked as separate points and the whiskers extend to the most extreme observations that are not outliers

statistic

numerical value summarizing data for the SAMPLE

parameter

numerical value summarizing data for the entire POPULATION

standard deviation

spread; describes roughly the typical distance of observations from their mean; s for sample and σ for population; s = √variance

variance

mean of squared deviations; s² = [∑(x-x̄)²f] / (n-1) for samples and σ² = [∑(x-μ)²f] / n for populations, where f is the frequency of each value x
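
A small sketch of the variance and standard deviation calculations, checked against Python's `statistics` module. The data are made up, and each observation is listed individually, so every frequency f is effectively 1. The helper name `variance` and its `sample` flag are hypothetical.

```python
import math
import statistics

def variance(values, sample=True):
    """Mean of squared deviations; divides by n - 1 for a sample, n for a population."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, squared deviations sum to 32

print(variance(data, sample=False))             # 4.0  (population variance)
print(math.sqrt(variance(data, sample=False)))  # 2.0  (population standard deviation)

# The sample version (32 / 7) should agree with the standard library.
print(math.isclose(variance(data), statistics.variance(data)))  # True
```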

percentile

position; kth percentile = Pₖ = at most k% of observations fall below the value at Pₖ; (# of scores at or below given score)/(total # of scores); vertical axis of ogive graph
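
The "(# of scores at or below given score)/(total # of scores)" calculation can be sketched as follows. The function name and the test scores are hypothetical.

```python
def percentile_rank(scores, score):
    """Percent of scores that are at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical class of 10 test scores.
scores = [55, 60, 65, 70, 70, 75, 80, 85, 90, 95]

print(percentile_rank(scores, 70))  # 50.0
print(percentile_rank(scores, 95))  # 100.0
```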

ogive

graph of scores against percentile; make a cumulative relative frequency histogram, then draw a line from left to right through the upper-right corners of the bars, starting at the lower-left corner of the first bar

experiment

planned activity with an imposed treatment whose results yield a data set (without an imposed treatment, it’s an observational study)

data

value of a variable associated with one element of a population or sample

exploratory data analysis

statistical tools and ideas used to examine data in order to describe their main features

mean absolute deviation (M.A.D)

(∑|x-x̄|f) / n OR n-1 ; use n for population and n-1 for samples; gives average distance from mean but without direction (like standard deviation but using abs. value to get rid of direction instead of square)
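
A sketch of the M.A.D. calculation with each observation listed individually (so every f is 1). The `sample` flag follows this card's n versus n-1 convention; many references divide by n in both cases, so treat the n-1 option as an assumption tied to this course. The function name and data are hypothetical.

```python
def mean_absolute_deviation(values, sample=False):
    """Average absolute distance from the mean (direction removed with abs)."""
    n = len(values)
    mean = sum(values) / n
    total_distance = sum(abs(x - mean) for x in values)
    return total_distance / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, absolute deviations sum to 12

print(mean_absolute_deviation(data))  # 1.5
```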

degrees of freedom

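n-1; the deviations from the mean always sum to 0, so all deviations but the last (nth) are free to vary and the last is determined by the others; used to explain why we divide by n-1 instead of n for samples (dividing by the smaller number gives a slightly larger estimate of spread)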
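
The idea that only n-1 deviations are free can be checked numerically: deviations from the mean sum to 0, so the last one can be recovered from the others. The data below are made up for illustration.

```python
# Hypothetical data with mean 7.
data = [3, 7, 8, 10]
mean = sum(data) / len(data)

deviations = [x - mean for x in data]
print(deviations)       # [-4.0, 0.0, 1.0, 3.0]
print(sum(deviations))  # 0.0

# Knowing the first n - 1 deviations pins down the last one.
last_from_others = -sum(deviations[:-1])
print(last_from_others == deviations[-1])  # True
```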