STP 420 Midterm #1

New cards

noise

EXPECTED variability

New cards

signal

variability due to certain CHARACTERISTICS

New cards

cases

objects or subjects described by a data set; WHERE your info came from

ex: 5 students being surveyed on their age/major

New cards

variable

a characteristic of a person/thing that can be assigned a number or category

-ex: age, major

New cards

value

the actual OUTCOMES of that variable (ex: 19 years old, Mathematics BA)

New cards

quantitative variable

a variable that records the AMOUNT of something (numerical) (speed limit, age, etc.)

New cards

categorical variable

records which of several groups or categories an individual belongs to (ex: hair color, zip codes, phone numbers)

New cards

numbers can be categorical variables when

the numbers are only labels, not measurements; they have no meaningful position on a continuum, and arithmetic on them (like averaging) makes no sense (ex: phone numbers, zip codes, etc.)

New cards

in a bar chart, the vertical bars are kept

SEPARATE

New cards

bar charts can display frequency or

relative frequency

New cards

relative frequencies should add up to

1.0
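
A minimal sketch of turning frequencies into relative frequencies and checking that they sum to 1.0. The `majors` list is made-up data for illustration, not from the course.

```python
from collections import Counter

# Hypothetical survey responses (made-up data for illustration).
majors = ["Math", "Biology", "Math", "History", "Biology", "Math"]

counts = Counter(majors)   # frequency: how many times each category occurs
n = len(majors)            # total number of cases

# relative frequency = frequency / total number of cases
rel_freq = {category: count / n for category, count in counts.items()}

# The relative frequencies should add up to 1.0 (up to floating-point rounding).
total = sum(rel_freq.values())
print(rel_freq, total)
```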

New cards

you cannot leave out certain categories in a pie chart because

it would not be a complete circle (100%)

New cards

stemplots can have ______ lines per stem

one OR two

New cards

purpose of having two lines per stem

might need to "zoom" in deeper and break the stem into pieces to see the distribution/spread of data

New cards

stem

ALL but the final digit

New cards

leaf

the final digit
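
A rough sketch of splitting values into stems and leaves, assuming non-negative whole numbers and one line per stem. The function name `stemplot` and the `scores` data are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Group non-negative whole numbers by stem (all but the final digit)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # stem = all but final digit, leaf = final digit
        rows[stem].append(leaf)
    return dict(rows)

scores = [72, 85, 91, 78, 85, 69, 103]   # made-up data
plot = stemplot(scores)

# Print one line per stem, leaves in ascending order.
for stem in sorted(plot):
    print(stem, "|", "".join(str(leaf) for leaf in plot[stem]))
```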

New cards

histogram

a bar graph depicting a frequency distribution (frequency or relative frequency)

New cards

advantage of stem & leaf plot

displays all outcomes

New cards

disadvantage of stem & leaf plot

will not be very informative if the data set is too large (histogram will be more organized)

New cards

single value grouping

each vertical bar of the histogram represents a SINGLE possible value

New cards

single value grouping is rarely used and is best for

a small data set or small range of values

New cards

limit grouping

Use when the data are expressed as whole numbers and there are too many distinct values to employ single-value grouping

New cards

limit grouping only works on _______ variables

DISCRETE

-ex: number of used textbooks bought = [0-3], [4-7], [8-11], [12-15]

New cards

interval width for limit grouping

(upper limit - lower limit + 1)

-ex: the interval [4, 7] actually has a width of 4 (4, 5, 6, 7)
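
A small sketch of the width formula applied to the textbook classes from the earlier card. The helper name `class_width` is an assumption for illustration.

```python
def class_width(lower, upper):
    """Width of a limit-grouping class: both limits are included, hence the + 1."""
    return upper - lower + 1

classes = [(0, 3), (4, 7), (8, 11), (12, 15)]
widths = [class_width(lo, hi) for lo, hi in classes]

# The class [4, 7] contains the whole numbers 4, 5, 6, 7.
values_in_4_to_7 = list(range(4, 7 + 1))
print(widths, values_in_4_to_7)
```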

New cards

cutpoint grouping is used on __________ variables

CONTINUOUS

New cards

ranges in cutpoint grouping are expressed as

"[number] to under [number]"

ex: "54 to under 56"

[54, 56)
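
A sketch of assigning a continuous value to its "lower to under upper" class. The helper `cutpoint_class` and the choice of a first cutpoint of 54 with width 2 are assumptions chosen to match the example above.

```python
import math

def cutpoint_class(x, start, width):
    """Return (lower, upper) for the class [lower, upper) that contains x."""
    k = math.floor((x - start) / width)   # how many full classes lie below x
    lower = start + k * width
    return (lower, lower + width)

# 55.9 falls in "54 to under 56"; exactly 56 starts the next class.
print(cutpoint_class(55.9, 54, 2))
print(cutpoint_class(56.0, 54, 2))
```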

New cards

discrete variables

values that can be counted; the possible outcomes are separate, listable values (a finite or countable set), usually whole numbers

-ex: # of books, # of people, etc.

New cards

continuous variables

can assume an infinite number of values between any two specific values; goes off into DECIMALS

-ex: weight, time, etc.

New cards

Patterns of Data:

1) shape (modality & symmetry/skewness)

2) center (central tendency)

3) spread (SD, range, interquartile range, etc.)

4) outliers

New cards

unimodal distribution

A distribution with one peak

New cards

bimodal distribution

a distribution with two modes

New cards

multimodal distribution

two or more peaks in a distribution curve

New cards

symmetric distribution

a distribution in which the data values are evenly distributed about the center (the left and right sides are roughly mirror images); the mean is the best measure of center

New cards

left skew

mean < median (the long left tail pulls the mean down)

-clusters on the right

-long tail is on the left

New cards

right skew

mean > median (the long right tail pulls the mean up)

-clusters on the left

-long tail is on the right

New cards

in left and right skewed distributions, the _________ is the best measure of center

MEDIAN

New cards

measures of center

1) mean

2) median

3) mode

New cards

measure of center for categorical data

MODE

New cards

how many times must a value occur in a data set to be a mode?

at least TWICE

New cards

resistant measure

extreme values have little to no influence on its outcome

-not sensitive; does not respond strongly to outliers or changes in a few observations
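
A quick sketch of what "resistant" means in practice: changing one observation into an extreme value moves the mean a lot but leaves the median alone. The data are made up for illustration.

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]   # same data, but the largest value is now extreme

# The mean jumps from 4 to 14.8; the median stays at 4.
print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```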

New cards

range =

largest observation - smallest observation

-measures SPREAD of data

-NOT a resistant measure

New cards

deviation

difference between an observation and the mean

New cards

the SUM of all deviations from the mean

is always equal to zero
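
A short check of this fact on made-up data: the positive and negative deviations cancel out.

```python
data = [4, 8, 6, 10, 2]           # made-up observations
x_bar = sum(data) / len(data)     # mean = 6.0

# deviation = observation - mean
deviations = [x - x_bar for x in data]
total_deviation = sum(deviations)

print(deviations, total_deviation)
```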

New cards

sample standard deviation

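the SQUARE ROOT of the sum of squared deviations divided by (n - 1): s = sqrt( sum of (x - mean)^2 / (n - 1) )

-roughly the typical distance of an observation from the mean

-NOT the plain average of the deviations (those always sum to zero)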
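
A sketch of computing s by hand, assuming the usual n - 1 formula (square each deviation, add them up, divide by n - 1, take the square root). The result is compared with Python's built-in `statistics.stdev`; the data are made up.

```python
import math
from statistics import stdev

data = [4, 8, 6, 10, 2]                  # made-up observations
n = len(data)
x_bar = sum(data) / n                    # mean = 6.0

# Squaring keeps positive and negative deviations from cancelling out.
squared_devs = [(x - x_bar) ** 2 for x in data]

# Divide by n - 1 (not n) for a SAMPLE, then take the square root.
s = math.sqrt(sum(squared_devs) / (n - 1))

print(s, stdev(data))
```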

New cards

sample standard deviation (s) tends to __________ population standard deviation

underestimate

New cards

quartiles

three values that divide an ORDERED data set into 4 equal parts

(Q1, Q2, Q3)

New cards

interquartile range

Q3 - Q1

-RESISTANT

New cards

finding quartiles:

1) rearrange the data in ascending order

2) Q2 = median

3) Q1 = the median of the LOWER HALF of observations

4) Q3 = the median of UPPER HALF of observations

New cards

if you are finding the quartiles on a set with an odd number of observations,

you can either include or NOT include the median in the lower and upper halves, but expect different results depending on which you choose. For the exam, INCLUDE the median in both the lower and upper halves
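
The quartile steps can be sketched as follows, assuming the convention of including the median in both halves when the number of observations is odd. The function name `quartiles` and the data are hypothetical, and other software may use a different convention and give slightly different answers.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)                 # 1) ascending order
    n = len(data)
    half = n // 2
    if n % 2 == 0:
        lower, upper = data[:half], data[half:]
    else:
        # Odd n: the middle observation goes into BOTH halves.
        lower, upper = data[:half + 1], data[half:]
    return median(lower), median(data), median(upper)

print(quartiles([13, 1, 9, 3, 11, 5, 7]))    # odd number of observations
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # even number of observations
```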

New cards

5 Number Summary

1) minimum value

2) Q1

3) Q2/median

4) Q3

5) maximum value

New cards

Outliers

lower fence = Q1 - 1.5(IQR)

upper fence = Q3 + 1.5(IQR)

-any observation below the lower fence or above the upper fence is flagged as an outlier
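
A sketch of applying the fences to made-up data. The quartile values below were worked out by hand with the median-in-both-halves convention (lower half 1, 3, 5, 7 and upper half 7, 9, 11, 30), which is an assumption; the helper name `fences` is hypothetical.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 5, 7, 9, 11, 30]   # made-up observations
q1, q3 = 4.0, 10.0               # computed by hand for this data

lower, upper = fences(q1, q3)

# Anything outside the fences is flagged as an outlier.
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```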

New cards

boxplot

displays distribution of a data set using 5 number summary

New cards

advantage of boxplot

clearly shows outliers and skew

New cards

z score

the number of standard deviations a particular score is from the mean
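
As a formula, z = (score - mean) / standard deviation. A minimal sketch with made-up numbers (the function name `z_score` is hypothetical):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# A score of 85 when the mean is 70 and the SD is 10 is 1.5 SDs above the mean.
z_above = z_score(85, 70, 10)
# A score of 60 with the same mean and SD is 1 SD below the mean.
z_below = z_score(60, 70, 10)
print(z_above, z_below)
```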