STP 420 Midterm #1

54 Terms

1

noise

EXPECTED variability

2

signal

variability due to certain CHARACTERISTICS

3

cases

objects or subjects described by a data set; WHERE your info came from

ex: 5 students being surveyed on their age/major

4

variable

a characteristic of a person/thing that can be assigned a number or category

-ex: age, major

5

value

the actual OUTCOMES of that variable (ex: 19 years old, Mathematics BA)

6

quantitative variable

a variable that records the AMOUNT of something (numerical) (speed limit, age, etc.)

7

categorical variable

records which of several groups or categories an individual belongs to (ex: hair color, zip codes, phone numbers)

8

numbers can be categorical variables when

the numbers act only as labels: they do not measure an amount, and arithmetic on them is meaningless (ex: phone numbers, zip codes, etc.)

9

in a bar chart, the vertical bars are kept

SEPARATE

10

bar charts can display frequency or

relative frequency

11

relative frequencies should add up to

1.0
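
A quick check of the card above, as a minimal sketch. The survey answers are made-up data, not from the course:

```python
from collections import Counter

# Hypothetical survey responses (made-up data for illustration)
majors = ["Math", "Biology", "Math", "Art", "Biology", "Math"]

counts = Counter(majors)   # frequency of each category
n = len(majors)            # total number of cases

# relative frequency = frequency / total number of cases
rel_freq = {category: count / n for category, count in counts.items()}

total = sum(rel_freq.values())  # should be 1.0 (up to rounding)
print(rel_freq, total)
```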

12

you cannot leave out certain categories in a pie chart because

it would not be a complete circle (100%)

13

stemplots can have ______ lines per stem

one OR two

14

purpose of having two lines per stem

lets you "zoom in": splitting each stem into two lines (leaves 0-4 and leaves 5-9) spreads the data out so the shape/spread of the distribution is easier to see

15

stem

ALL but the final digit

16

leaf

the final digit
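
The stem/leaf split can be sketched in a few lines. This assumes non-negative whole numbers and one line per stem; the function name `stemplot` and the scores are hypothetical:

```python
from collections import defaultdict

def stemplot(values):
    """Map each stem (all but the final digit) to its sorted leaves (final digits).

    Assumes non-negative whole numbers and one line per stem.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # stem = all but the last digit, leaf = last digit
        stems[stem].append(leaf)
    return dict(stems)

scores = [62, 75, 78, 81, 84, 84, 90]  # hypothetical exam scores
plot = stemplot(scores)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```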

17

histogram

a bar graph depicting a frequency distribution (frequency or relative frequency)

18

advantage of stem & leaf plot

displays all outcomes

19

disadvantage of stem & leaf plot

will not be very informative if the data set is too large (histogram will be more organized)

20

single value grouping

each vertical bar of the histogram represents a SINGLE possible value

21

single value grouping is rarely used and is best for

a small data set or small range of values

22

limit grouping

Use when the data are expressed as whole numbers and there are too many distinct values to employ single-value grouping

23

limit grouping only works on _______ variables

DISCRETE

-ex: number of used textbooks bought = [0-3], [4-7], [8-11], [12-15]

24

interval width for limit grouping

(upper limit - lower limit + 1)

-ex: the interval [4, 7] actually has a width of 4 (4, 5, 6, 7)
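
The width formula can be checked against the textbook classes from the previous card. A small sketch; `limit_width` is a hypothetical helper name:

```python
def limit_width(lower, upper):
    """Width of a limit-grouping class: both limits are included, hence the +1."""
    return upper - lower + 1

# Classes from the used-textbooks example
classes = [(0, 3), (4, 7), (8, 11), (12, 15)]
widths = [limit_width(lo, hi) for lo, hi in classes]
print(widths)  # every class covers 4 whole-number values
```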

25

cutpoint grouping is used on __________ variables

CONTINUOUS

26

ranges in cutpoint grouping are expressed as

"[number] to under [number]"

ex: "54 to under 56"

[54, 56)
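
A sketch of how a value is assigned to a "lower to under upper" class. The starting cutpoint of 54 and width of 2 are assumptions chosen to match the example above, and `cutpoint_class` is a hypothetical helper:

```python
import math

def cutpoint_class(x, start, width):
    """Return the (lower, upper) cutpoints of the class containing x.

    The class is half-open: lower <= x < upper.
    """
    k = math.floor((x - start) / width)  # how many whole classes x sits past the start
    lower = start + k * width
    return (lower, lower + width)

print(cutpoint_class(55.9, 54, 2))  # falls in "54 to under 56"
print(cutpoint_class(56.0, 54, 2))  # 56 itself belongs to the NEXT class
```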

27

discrete variables

values that can be counted; the possible outcomes can be listed one by one (usually whole numbers), with gaps between them

-ex: # of books, # of people, etc.

28

continuous variables

can assume an infinite number of values between any two specific values; goes off into DECIMALS

-ex: weight, time, etc.

29

Patterns of Data:

1) shape (modality & symmetry/skewness)

2) center (central tendency)

3) spread (SD, range, interquartile range, etc.)

4) outliers

30

unimodal distribution

A distribution with one peak

31

bimodal distribution

a distribution with two peaks

32

multimodal distribution

two or more peaks in a distribution curve

33

symmetric distribution

a distribution in which the left and right sides are (roughly) mirror images about the center, so mean ≈ median; the mean is the best measure of center

34

left skew

mean < median (the long left tail pulls the mean DOWN)

-clusters on the right

-long tail is on the left

35

right skew

mean > median (the long right tail pulls the mean UP)

-clusters on the left

-long tail is on the right
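
A small check of which way the tail pulls the mean, as a sketch on made-up data. The mean moves toward the long tail while the median stays near the cluster:

```python
from statistics import mean, median

# Hypothetical data: clustered on the left, long tail on the right
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Mirror image: clustered on the right, long tail on the left
left_skewed = [-v for v in right_skewed]

print(mean(right_skewed), median(right_skewed))  # mean is larger than median
print(mean(left_skewed), median(left_skewed))    # mean is smaller than median
```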

36

in left and right skewed distributions, the _________ is the best measure of center

MEDIAN

37

measures of center

1) mean

2) median

3) mode

38

measure of center for categorical data

MODE

39

how many times must a value occur in a data set to be a mode?

at least TWICE

40

resistant measure

extreme values have little to no influence on its outcome

-not sensitive; does not respond strongly to outliers or changes in a few observations

41

range =

largest observation - smallest observation

-measures SPREAD of data

-NOT a resistant measure

42

deviation

difference between an observation and the mean

43

the SUM of all deviations from the mean

is always equal to zero
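
A quick numeric check on made-up data: the positive and negative deviations cancel exactly.

```python
data = [4, 8, 6, 2]  # hypothetical observations

m = sum(data) / len(data)           # mean of the data
deviations = [x - m for x in data]  # each observation minus the mean
total = sum(deviations)             # positives and negatives cancel

print(deviations, total)
```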

44

sample standard deviation

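roughly the TYPICAL distance of an observation from the mean: the square root of (the sum of SQUARED deviations divided by n - 1)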
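
Because the raw deviations always sum to zero, they are squared before averaging. A minimal sketch of the usual n - 1 formula on made-up data, compared against Python's built-in `statistics.stdev`; `sample_sd` is a hypothetical helper name:

```python
from math import sqrt
from statistics import stdev

def sample_sd(data):
    """Sample standard deviation: sqrt( sum of squared deviations / (n - 1) )."""
    n = len(data)
    m = sum(data) / n
    squared_devs = [(x - m) ** 2 for x in data]
    return sqrt(sum(squared_devs) / (n - 1))

data = [4, 8, 6, 2]   # hypothetical observations
s = sample_sd(data)
print(s, stdev(data))  # the two values should agree
```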

45

sample standard deviation (s) tends to __________ population standard deviation

underestimate

46

quartiles

values that divide an ordered data set into 4 equal parts

(Q1, Q2, Q3)

47

interquartile range

Q3 - Q1

-RESISTANT

48

finding quartiles:

1) rearrange the data in ascending order

2) Q2 = median

3) Q1 = the median of the LOWER HALF of observations

4) Q3 = the median of UPPER HALF of observations

49

if you are finding the quartiles on a set with an odd number of observations,

you can either include or NOT include the median in each half, but expect different results. For the exam, include the median in both the lower and upper halves
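
The quartile steps, with both conventions for an odd number of observations, can be sketched as follows. The function name, the `include_median` flag, and the data are hypothetical:

```python
from statistics import median

def quartiles(data, include_median=True):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    For an odd number of observations, include_median decides whether
    the median itself is counted in both halves.
    """
    xs = sorted(data)  # step 1: ascending order
    n = len(xs)
    half = n // 2
    if n % 2 == 0:
        lower, upper = xs[:half], xs[half:]
    elif include_median:
        lower, upper = xs[:half + 1], xs[half:]
    else:
        lower, upper = xs[:half], xs[half + 1:]
    # steps 2-4: Q2 = median, Q1/Q3 = medians of the lower/upper halves
    return median(lower), median(xs), median(upper)

data = [1, 3, 5, 7, 9, 11, 13]  # hypothetical, odd number of observations
print(quartiles(data, include_median=True))
print(quartiles(data, include_median=False))
```

The two calls give different Q1 and Q3 for the same data, which is why the convention matters.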

50

5 Number Summary

1) minimum value

2) Q1

3) Q2/median

4) Q3

5) maximum value

51

Outliers

lower fence = Q1 - 1.5(IQR)

upper fence = Q3 + 1.5(IQR)
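
A sketch of the fence rule in code. Any observation below the lower fence or above the upper fence is flagged as a potential outlier. The quartile values and data are made up, and the helper names are hypothetical:

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Observations that fall outside the fences."""
    low, high = fences(q1, q3)
    return [x for x in data if x < low or x > high]

data = [1, 3, 5, 7, 9, 11, 30]  # hypothetical observations
q1, q3 = 4, 10                  # assumed quartiles for this illustration
print(fences(q1, q3))
print(find_outliers(data, q1, q3))
```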

52

boxplot

displays distribution of a data set using 5 number summary

53

advantage of boxplot

clearly shows outliers and skew

54

z score

the number of standard deviations a particular score is from the mean
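
As a formula, z = (score - mean) / standard deviation. A minimal sketch with made-up numbers:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10
z_high = z_score(85, 70, 10)  # above the mean, so positive
z_low = z_score(60, 70, 10)   # below the mean, so negative
print(z_high, z_low)
```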