Vocab Chapter 1: Exploring Data

46 Terms

1

individuals

objects described by a set of data (e.g. people, animals, things)

2

variable

any characteristic of an individual; can take different values for different individuals

3

categorical/qualitative variable

records which of several categories/groups an individual belongs to; arithmetic is not meaningful

4

quantitative variable

takes numerical values for which it makes sense to do arithmetic (e.g. adding, averaging)

5

distribution

pattern of variation of a variable; records values the variable takes and how often it takes them; presentation of data

6

range

a measure of spread; highest value - lowest value; gives the width of the interval the scores cover

7

spread

describes how widely the data vary within a distribution; measured by range, IQR, standard deviation, variance, and/or M.A.D.

8

frequency

how many times the value of a variable occurs
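
A minimal sketch of tallying frequencies to build a distribution. The `siblings` data and variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical sample: number of siblings reported by 8 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 5]

# Frequency: how many times each value occurs.
frequency = Counter(siblings)

# Distribution: every value the variable takes, paired with how often it occurs.
distribution = sorted(frequency.items())

for value, count in distribution:
    print(f"{value}: {count}")
```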

9

outlier

an individual observation that falls outside the overall pattern of the graph; determined by eye or using the 1.5 IQR rule: if it’s less than Q₁ - 1.5*IQR or greater than Q₃ + 1.5*IQR, it’s an outlier
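
The 1.5 × IQR rule can be sketched as below. This assumes at least two observations and takes Q₁ and Q₃ as the medians of the lower and upper halves of the sorted data; the function names and `scores` data are hypothetical, and software that computes quartiles differently may flag slightly different points.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]    # observations below the median position
    upper = ordered[-half:]   # observations above the median position
    return median(lower), median(upper)

def find_outliers(data):
    """Return observations outside the 1.5 * IQR fences."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Made-up data: Q1 = 3, Q3 = 8, IQR = 5, fences at -4.5 and 15.5.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
outliers = find_outliers(scores)
print(outliers)  # [30]
```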

10

center

where the graph is centered; measured by mean, median, and/or mode

11

shape

the shape observations form in the distribution; described as skewed left/right or symmetric

12

skewed left

the left side (lower half of the distribution) extends much farther out than the right; the left side is the “tail”

13

skewed right

the right side (upper half of the distribution) extends much farther out than the left; the right side is the “tail”

14

symmetric

the right and left sides of the distribution are approximately mirror images of each other

15

dot plot

graph of data set using dots for each observation

16

histogram

graph with bars showing frequency of different values of one variable (not categories!); most common for quantitative variables, can use to group nearby values if too many values for a dot plot

17

stemplot

graph for a small data set that preserves the individual data values; stems are all but the rightmost digit of each observation, leaves are the final digits written in increasing order out from the stem (remember to include a key!)
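
A rough sketch of building a stemplot as text, assuming non-negative whole numbers with the last digit as the leaf and leaves increasing outward from the stem (the usual convention). The `stemplot` function and `ages` data are hypothetical.

```python
def stemplot(values):
    """Return the lines of a stemplot for non-negative whole numbers."""
    ordered = sorted(values)
    stems = {}
    for v in ordered:
        # Stem: all but the rightmost digit. Leaf: the rightmost digit.
        stems.setdefault(v // 10, []).append(v % 10)
    lines = []
    # Include stems with no leaves so gaps in the data stay visible.
    for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    first = ordered[0]
    lines.append(f"Key: {first // 10} | {first % 10} = {first}")
    return lines

ages = [23, 12, 40, 21, 15, 23]
for line in stemplot(ages):
    print(line)
```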

18

split stems

each stem appears twice; do if all leaves would fall on just a few stems

19

back-to-back stemplot

stemplot with leaves on the right and left; use to compare two distributions (don’t forget a key with both distributions!)

20

time plot

graph plotting each observation against the time at which it was measured; use to show change over time

21

mean

most common measure of center; (∑x)/n; x̄ for sample mean and μ for population mean

22

∑

sigma; symbol meaning “sum of”

23

x̄ (x bar)

sample mean equal to (∑x)/n

24

nonresistant

sensitive to the influence of extreme observations; the mean and standard deviation are nonresistant: the mean is pulled towards the tail and the standard deviation is inflated by extreme values

25

median

the middle value of the ordered data; M = med = x̃ = the ((n+1)/2)th value (the middle value) when n is odd, and the mean of the middle two values when n is even
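
A small sketch of finding the median by position in the sorted data: the ((n+1)/2)th value when n is odd, the mean of the two middle values when n is even. The function name is made up for illustration.

```python
def median_by_position(data):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: index n // 2 is the ((n + 1) / 2)-th value counting from 1.
        return ordered[middle]
    # Even n: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median_by_position([7, 1, 3]))     # 3
print(median_by_position([1, 2, 3, 4]))  # 2.5
```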

26

resistant

not sensitive to the influence of extreme observations (e.g. median)
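
To see the difference between resistant and nonresistant measures, the sketch below swaps one value for an extreme one (made-up data): the mean moves a lot, the median does not move at all.

```python
from statistics import mean, median

typical = [1, 2, 3, 4, 5]
with_extreme = [1, 2, 3, 4, 50]   # same data, largest value replaced by an extreme one

mean_before, mean_after = mean(typical), mean(with_extreme)
median_before, median_after = median(typical), median(with_extreme)

print(mean_before, mean_after)      # 3 12  (mean is pulled toward the extreme value)
print(median_before, median_after)  # 3 3   (median is unchanged)
```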

27

quartiles

used to measure spread; the quartiles divide the ordered data into four equal parts, and Q₁ and Q₃ mark the boundaries of the middle half of the data

28

Q₁

median of the observations below M; about ¼ of the ordered observations fall at or below it (25th percentile)

29

Q₃

median of the observations above M; about ¾ of the ordered observations fall at or below it (75th percentile)

30

IQR

IQR = interquartile range = Q₃ - Q₁ ; spread of the middle half of the data and used to test outliers

31

five-number summary

minimum, Q₁, median, Q₃, and maximum; used to describe center and spread of data and to construct box plots
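
A minimal sketch of computing the five-number summary, assuming at least two observations and quartiles taken as the medians of the halves below and above M. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # median of the observations below M
    q3 = median(ordered[-half:])   # median of the observations above M
    return ordered[0], q1, median(ordered), q3, ordered[-1]

# Sorted: 3 5 7 8 | 12 | 13 14 18 21
summary = five_number_summary([3, 7, 8, 5, 12, 14, 21, 13, 18])
print(summary)  # (3, 6.0, 12, 16.0, 21)
```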

32

minimum

smallest observation (may or may not include outliers)

33

maximum

largest observation (may or may not include outliers)

34

boxplot

graph of the five number summary; box with lines marking the quartiles and median with “whiskers” extending from the quartiles to the min and max; used for side-by-side distribution comparison

35

modified boxplot

same as a normal boxplot, but outliers are marked as separate points and the whiskers extend only to the most extreme observations that are not outliers

36

statistic

numerical value summarizing data for the SAMPLE

37

parameter

numerical value summarizing data for the entire POPULATION

38

standard deviation

spread; describes the average distance of observations from their mean; s for sample and σ for population; s = √variance

39

variance

mean of squared deviations; s² = [∑(x-x̄)²f] / (n-1) for samples and σ² = [∑(x-μ)²f] / n for populations, where f is the frequency of each value x
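
A sketch of the variance and standard deviation calculation, dividing by n-1 for a sample and n for a population. It assumes every observation is listed individually (so each frequency f is 1); the function name and data are made up.

```python
from math import isclose, sqrt

def variance(data, sample=True):
    """Mean of squared deviations: divide by n - 1 (sample) or n (population)."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)

# Made-up data with mean 5; the squared deviations add up to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = variance(data, sample=False)   # 32 / 8
sample_variance = variance(data)                     # 32 / 7
population_sd = sqrt(population_variance)            # standard deviation = sqrt(variance)

print(population_variance, population_sd)
print(sample_variance)
```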

40

percentile

a measure of position; kth percentile = Pₖ = the value with k% of the observations at or below it; percentile of a given score = (# of scores at or below the given score)/(total # of scores) × 100; vertical axis of an ogive graph
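
A short sketch of the percentile of a score using the "at or below" count, expressed as a percent. The function name and scores are hypothetical, and some textbooks count only scores strictly below, which gives a different answer when there are ties.

```python
def percentile_rank(scores, given_score):
    """Percent of scores that are at or below given_score."""
    at_or_below = sum(1 for s in scores if s <= given_score)
    return 100 * at_or_below / len(scores)

scores = [60, 70, 70, 80, 90]
rank = percentile_rank(scores, 70)
print(rank)  # 60.0, since 3 of the 5 scores are at or below 70
```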

41

ogive

graph of cumulative relative frequency (percentile) against score; make a cumulative histogram, then draw a line from left to right that starts at the lower left corner of the first bar and connects the upper right corners of the bars
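
The points an ogive connects can be sketched as running totals of frequency, converted to percents. This version plots one point per distinct value rather than per histogram class, which is a simplification; the names and data are made up.

```python
from collections import Counter
from itertools import accumulate

def ogive_points(scores):
    """Return (value, cumulative percent at or below value) pairs."""
    counts = sorted(Counter(scores).items())
    values = [value for value, _ in counts]
    running_totals = accumulate(count for _, count in counts)
    total = len(scores)
    return [(v, 100 * r / total) for v, r in zip(values, running_totals)]

points = ogive_points([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
print(points)  # [(1, 10.0), (2, 30.0), (3, 60.0), (4, 100.0)]
```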

42

experiment

planned activity with an imposed treatment whose results yield a data set (without an imposed treatment, it’s an observational study)

43

data

value of variable associated with one element of population or sample

44

exploratory data analysis

statistical tools and ideas used to examine data in order to describe their main features

45

mean absolute deviation (M.A.D.)

(∑|x-x̄|f) / n OR n-1 ; use n for population and n-1 for samples; gives average distance from mean but without direction (like standard deviation but using abs. value to get rid of direction instead of square)
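
A minimal sketch of the mean absolute deviation, using the divide-by-n form and listing every observation individually (each f is 1). The function name and data are hypothetical.

```python
def mean_absolute_deviation(data):
    """Average distance from the mean, ignoring direction (divides by n)."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

# Mean is 5; absolute deviations are 3, 1, 1, 1, 0, 0, 2, 4 (total 12).
mad = mean_absolute_deviation([2, 4, 4, 4, 5, 5, 7, 9])
print(mad)  # 1.5
```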

46

degrees of freedom

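n-1; because the deviations from the mean always sum to 0, all deviations but the last (nth) are free to vary and the last is determined by the others; used to explain why we divide by n-1 instead of n for samples (dividing by the smaller number gives a slightly larger, more cautious estimate of spread)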
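
A tiny illustration of why only n-1 deviations are free: the deviations from the mean always sum to zero, so the last one can be recovered from the others. The data are made up.

```python
data = [2, 4, 9]
mean = sum(data) / len(data)             # 5.0
deviations = [x - mean for x in data]    # [-3.0, -1.0, 4.0]

# The first n - 1 deviations determine the last one.
last_from_others = -sum(deviations[:-1])

print(sum(deviations))                   # 0.0
print(last_from_others, deviations[-1])  # 4.0 4.0
```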