unit 1 stats

78 Terms

1
New cards

statistics

the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. It also provides a measure of confidence in any conclusions.

2
New cards

data

a fact or proposition used to draw a conclusion or make a decision. Data describe characteristics of an individual.

3
New cards

population

the entire group of individuals to be studied

4
New cards

individual

a person or object that is a member of the population being studied.

5
New cards

sample

a subset of the population that is being studied

6
New cards

statistic

a numerical summary of a sample

7
New cards

descriptive statistics

consists of organizing and summarizing data. Describe data through numerical summaries, tables, and graphs

8
New cards

inferential statistics

uses methods that take results from a sample, extends them to the population, and measures the reliability of the result

9
New cards

parameter

a numerical summary of a population

10
New cards

process of statistics

1. identify the research objective
2. collect the data needed to answer the question posed in step 1
3. describe the data
4. perform inference

11
New cards

variables

characteristics of the individuals within the population

12
New cards

qualitative or categorical variables

allow for classification of individuals based on some attribute or characteristic

13
New cards

quantitative variables

provide numerical measures of individuals. The values of a quantitative variable can be added or subtracted and provide meaningful results

14
New cards

discrete variable

a quantitative variable that has either a finite number of possible values or a countable number of possible values.

15
New cards

discrete variable characteristics

- countable (0,1, 2, 3)
- cannot take on every possible value between any two possible values

16
New cards

continuous variable

a quantitative variable that has an infinite number of possible values that it can take on and can be measured to any desired level of accuracy

17
New cards

raw data

data that is not organized

18
New cards

ways to organize data

tables
graphs
numerical summaries

19
New cards

frequency distribution

lists each category of data and the number of occurrences for each category of data

20
New cards

relative frequency

the proportion or percent of observations within a category

21
New cards

relative frequency formula

frequency/ sum of all frequencies
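As a rough illustration of the formula, the sketch below counts each category and divides by the total number of observations. The function name `relative_frequencies` and the color data are made up for this example.

```python
from collections import Counter


def relative_frequencies(observations):
    """Return {category: frequency / sum of all frequencies}."""
    counts = Counter(observations)   # frequency distribution
    total = sum(counts.values())     # sum of all frequencies
    return {category: count / total for category, count in counts.items()}


# Hypothetical sample of favorite colors (made-up data for illustration).
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]
rel_freq = relative_frequencies(colors)
print(rel_freq)  # red: 4/8, blue: 3/8, green: 1/8
```

The relative frequencies of all categories should add up to 1 (or 100%).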

22
New cards

relative frequency distribution

lists each category of data with the relative frequency

23
New cards

bar graph

constructed by labelling each category of data on either the horizontal or vertical axis and the frequency or relative frequency on the other axis. Rectangles of equal width are drawn for each category.

24
New cards

pareto chart

bar graph where the bars are drawn in decreasing order of frequency or relative frequency

25
New cards

side-by-side bar graphs

used to compare data sets. Comparisons are made using relative frequencies to avoid confusion caused by different sample or population sizes

26
New cards

horizontal bar graphs

preferable when category names are lengthy

27
New cards

pie chart

a circle divided into sectors, each representing a category of data

28
New cards

histogram

constructed by drawing rectangles for each class of data. Height is the frequency or relative frequency of the class. Width is equal and bars touch

29
New cards

classes

categories into which data are grouped.

30
New cards

lower class limit

the smallest value within the class

31
New cards

upper class limit

the largest value within the class

32
New cards

class width

difference between consecutive lower class limits

33
New cards

determining class width

(largest data value - smallest data value)/number of classes
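A minimal sketch of this calculation follows. Rounding the result up to a convenient whole number is a common classroom convention and is an assumption here, not something stated on the card. The function name and data are hypothetical.

```python
import math


def class_width(data, number_of_classes):
    """Return (raw width, width rounded up) for the given data."""
    raw = (max(data) - min(data)) / number_of_classes
    # Assumption: round up so the classes cover the whole range.
    return raw, math.ceil(raw)


# Hypothetical test scores (made-up data for illustration).
scores = [12, 25, 31, 40, 44, 47, 52, 58]
raw_width, rounded_width = class_width(scores, 5)
print(raw_width, rounded_width)  # (58 - 12) / 5, then rounded up
```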

34
New cards

dot plot

drawn by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed

35
New cards

uniform distribution

the frequency of each value of the variable is evenly spread out across the values of the variable

36
New cards

bell-shaped distribution

the highest frequency occurs in the middle and frequencies tail off to the left and right of the middle

37
New cards

skewed right

the tail to the right of the peak is longer than the tail to the left of the peak

38
New cards

skewed left

tail to the left of the peak is longer than the tail to the right of the peak

39
New cards

time series data

data obtained when the value of a variable is measured at different points in time

40
New cards

time-series plot

obtained by plotting the time at which a variable is measured on the horizontal axis and the corresponding value of the variable on the vertical axis. Line segments connect the points.

41
New cards

arithmetic mean

computed by adding all the values of the variable in the data set and dividing by the number of observations

42
New cards

population mean is a

parameter

43
New cards

median (M)

value that lies in the middle of the data when arranged in ascending order

44
New cards

resistant

Extreme values (very large or small) relative to the data do not affect its value substantially

45
New cards

mode

the most frequent observation of the variable that occurs in the data set
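The mean, median, and mode cards can be tried out with Python's standard `statistics` module. This is only a sketch; the data set is made up for illustration.

```python
import statistics

# Hypothetical data set (made-up values for illustration).
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # sum of values / number of observations
median = statistics.median(data)    # middle of the sorted data
modes = statistics.multimode(data)  # most frequent value(s)

print(mean, median, modes)
```

Because the data set has an even number of observations, the median is the average of the two middle values (3 and 5).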

46
New cards

when is range used

on the news or when talking of housing prices

47
New cards

skewed left distribution

mean < median < mode. Mean substantially smaller than median

48
New cards

skewed right distribution

mean > median > mode. Mean substantially larger than median

49
New cards

population standard deviation

the square root of the sum of squared deviations about the population mean divided by the number of observations in the population, N
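The definition can be followed step by step in code. The sketch below uses a hypothetical function name and made-up data, and treats the values as the entire population (dividing by N, not N - 1).

```python
import math


def population_std_dev(values):
    """Square root of (sum of squared deviations about the mean / N)."""
    n = len(values)
    mu = sum(values) / n                              # population mean
    squared_deviations = [(x - mu) ** 2 for x in values]
    variance = sum(squared_deviations) / n            # population variance
    return math.sqrt(variance)


# Hypothetical population (made-up values for illustration).
population = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = population_std_dev(population)
print(sigma)
```

Squaring `sigma` gives back the population variance.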

50
New cards

standard deviation percentages

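34%, 13.5%, 2.35%, 0.15%: the approximate share of data between 0 and 1, 1 and 2, 2 and 3, and beyond 3 standard deviations on one side of the mean of a bell-shaped distribution (the four add up to 50%; doubling for both sides gives 100%)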

51
New cards

symmetric distribution

mean roughly equal to the median.

52
New cards

no mode

no observation occurs more than once

53
New cards

range (R)

the difference between the largest data value and the smallest data value

54
New cards

kth percentile (Pk)

a value such that k percent of the observations are less than or equal to the value

55
New cards

Interquartile Range (IQR)

the range of the middle 50% of the observations in a data set

56
New cards

fences

serve as cut off points for determining outliers.

57
New cards

lower fence formula

Q1 - 1.5(IQR)

58
New cards

upper fence formula

Q3 + 1.5(IQR)
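A small sketch of both fence formulas, assuming Q1 and Q3 are already known. The function name, quartile values, and data are hypothetical.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1                 # interquartile range
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return lower, upper


# Hypothetical quartiles and data (made-up values for illustration).
lower_fence, upper_fence = fences(q1=10, q3=20)
data = [-8, 4, 10, 15, 20, 33, 41]
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, outliers)
```

Any observation below the lower fence or above the upper fence is flagged as an outlier.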

59
New cards

five-number summary

consists of the smallest data value (minimum), Q1, the median, Q3, and the largest data value (maximum)
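The sketch below computes a five-number summary. Quartile conventions differ between textbooks and software; this version assumes the "median of each half" method (excluding the overall median from both halves when the number of observations is odd), which may not match every calculator. The function name and data are hypothetical.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves.
    Requires at least two observations.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower_half = values[:half]
    # Skip the middle value when n is odd.
    upper_half = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower_half), median(values),
            median(upper_half), values[-1])


# Hypothetical data (made-up values for illustration).
summary = five_number_summary([7, 1, 15, 3, 9, 4, 12, 6])
print(summary)
```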

60
New cards

response variable (y)

variable whose value can be explained by the value of the explanatory or predictor variable

61
New cards

scatter diagram

graph that shows the relationship between two variables

62
New cards

variance

square of the standard deviation

63
New cards

the empirical rule

For a bell-shaped distribution:
Approximately 68% of the data lies within 1 standard deviation of the mean.
Approximately 95% of the data lies within 2 standard deviations of the mean.
Approximately 99.7% of the data lies within 3 standard deviations of the mean.
Nearly all (but not necessarily 100%) of the data lies within 4 standard deviations of the mean.
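To see what the rule means for a specific variable, the sketch below turns a mean and standard deviation into the three intervals. It assumes the distribution is roughly bell-shaped; the function name and the values 100 and 15 are made up for illustration.

```python
def empirical_rule_intervals(mean, std_dev):
    """Return the intervals covering about 68%, 95%, and 99.7% of the data."""
    coverage = {1: "about 68%", 2: "about 95%", 3: "about 99.7%"}
    return {
        label: (mean - k * std_dev, mean + k * std_dev)
        for k, label in coverage.items()
    }


# Hypothetical bell-shaped variable with mean 100 and standard deviation 15.
intervals = empirical_rule_intervals(100, 15)
for label, (low, high) in intervals.items():
    print(f"{label} of the data lies between {low} and {high}")
```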

64
New cards

z-score

represents the distance that a data value is from the mean in terms of the number of standard deviations.
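In code, a z-score is the data value minus the mean, divided by the standard deviation. The function name and numbers below are hypothetical.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


# Hypothetical scores on a test with mean 100 and standard deviation 15.
z_above = z_score(130, mean=100, std_dev=15)  # above the mean: positive
z_below = z_score(85, mean=100, std_dev=15)   # below the mean: negative
print(z_above, z_below)
```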

65
New cards

quartiles

divide data sets into fourths, or four equal parts

66
New cards

response variable

the dependent variable; it is plotted on the vertical axis of a scatter diagram

67
New cards

positively associated

whenever the value of one variable increases, the value of the other variable also increases

68
New cards

negatively associated

two variables are negatively associated if, whenever the value of one variable increases, the value of the other variable decreases

69
New cards

explanatory variable

independent variable, plotted on the horizontal axis

70
New cards

correlation coefficient

a measure used to describe the strength and direction of a relationship between variables whose data points lie on or near a line

71
New cards

positive correlation range

0 < r <= 1

72
New cards

negative correlation range

-1 <= r < 0

73
New cards

positive correlation

when one variable increases or decreases, the other one will also do the same

74
New cards

negative correlation

when one variable increases, the other will decrease and vice versa

75
New cards

properties of linear correlation coefficient

1. r is always between -1 and 1, inclusive
2. r = 1: perfect positive linear relation
3. r = -1: perfect negative linear relation
4. r = 0: no LINEAR correlation
5. the closer r is to -1, the stronger the negative correlation
6. the closer r is to 1, the stronger the positive correlation
7. the correlation coefficient is NOT RESISTANT
8. r is a unit-less measure of association. The units of measure for x and y play no role in the interpretation of r
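A few of these properties can be checked numerically. The sketch below computes r from the sums of squared deviations and cross-products, then rescales x to show that units do not change r. The function name and data are made up for illustration.

```python
import math


def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired data (made-up values for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation_coefficient(x, y)

# Changing the units of x (for example, meters to centimeters) leaves r as is.
x_rescaled = [100 * value for value in x]
r_rescaled = correlation_coefficient(x_rescaled, y)
print(r, r_rescaled)
```

Here r is positive and less than 1, so the variables are positively but not perfectly linearly associated.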

76
New cards

least-squares regression

allows you to find a linear equation that describes the relation between 2 variables

77
New cards

residual

the difference between an observed value of the response variable and the value predicted by the regression line

78
New cards

what does each point on the least-squares regression line represent

each point represents the predicted y-value at the corresponding value of x
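The last three cards (least-squares regression, residual, and predicted values) can be tied together in one sketch. It uses the standard formulas slope = S_xy / S_xx and intercept = y-bar - slope * x-bar; the function name and data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical paired data (made-up values for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(x, y)

# Each point on the line is the predicted y-value at that x.
predicted = [intercept + slope * value for value in x]

# Residual = observed y - predicted y.
residuals = [obs - pred for obs, pred in zip(y, predicted)]
print(slope, intercept, residuals)
```

For a least-squares line, the residuals add up to zero apart from floating-point rounding.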