Statistics 1: Definitions

97 Terms

1
New cards

Statistic

a numerical summary of a sample (a subset of a population)

2
New cards

Parameter

a numerical summary of a population

3
New cards

Descriptive statistics

organizing and summarizing data (numerical summaries, tables, graphs, etc)

4
New cards

Inferential Statistics

takes results from a sample (the descriptive portion) and extends them to the population; also measures the reliability of the result

5
New cards

Qualitative/Categorical variables

characteristics or attributes (not usually numerical)

6
New cards

Quantitative variables

numerical measures; arithmetic such as addition and subtraction gives meaningful results

7
New cards

Discrete variable

quantitative variable with a finite or countable number of possible values (ex: the number of students in a class, which cannot be partial)

8
New cards

Continuous variables

quantitative variable with infinitely many possible values that are not countable; can be measured to any level of accuracy (ex: height, weight)

9
New cards

Nominal level of measurement

values only name, label, or categorize items; they cannot be ranked

10
New cards

Ordinal level of measurement

has the properties of the nominal level, and the values can be ranked or arranged in a specific order

11
New cards

Interval level of measurement

usually numerical, differences between items, addition or subtraction make sense, zero doesn’t mean the absence of quantity (ex: temperature)

12
New cards

Ratio level of measurement

has the properties of the interval level, and ratios of values make sense (multiplication and division are meaningful); zero does mean the absence of the quantity (ex: speed)

13
New cards

Observational Study

observing a group of individuals without intervening (no attempt to influence the variables) and drawing a conclusion; used, for example, when an experiment would be unethical

14
New cards

Designed Experiment

the researcher assigns individuals to groups, intentionally manipulates the value of the explanatory variable, and records the value of the response variable

15
New cards

Response variable

the response to the experiment (dependent variable)

16
New cards

Explanatory variable

the variable that is thought to explain or affect the response (independent variable)

17
New cards

Confounding

when the effects of two or more explanatory variables are not separated, so a relationship found in the study does not imply causation

18
New cards

Confounding variable

an explanatory variable that was considered in the study but whose effect on the response cannot be separated from that of another explanatory variable

19
New cards

Lurking variables

explanatory variables that were not considered in a study but affect the value of the response variable

20
New cards

Simple random sampling

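selecting a sample of size n from a population so that every possible sample of size n is equally likely to be chosen; the selection is made by chance (ex: random numbers), not by the researcher's judgment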
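
A minimal sketch of simple random sampling, assuming a frame is available as a Python list. The frame contents and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n distinct individuals; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(frame, n)

# Hypothetical frame: a list of every individual in the population
frame = [f"student_{i}" for i in range(1, 31)]
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```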

21
New cards

Random

every individual has an equal chance of being selected

22
New cards

Frame

a list of all individuals in the population

23
New cards

Systematic sample

selecting every kth individual from the population, starting from a randomly chosen point among the first k individuals (no frame needed)
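
A short sketch of systematic sampling, assuming the population can be walked in order as a list. The interval `k` and the helper name are illustrative assumptions.

```python
import random

def systematic_sample(population, k, seed=None):
    """Take every k-th individual after a random start among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting index: 0 .. k-1
    return population[start::k]

# Hypothetical population of 100 individuals labeled 1..100
people = list(range(1, 101))
sample = systematic_sample(people, 10, seed=0)
print(sample)
```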

24
New cards

Stratified sample

separating the population into groups (nonoverlapping) that contain similar people, obtaining a simple random sample from each group
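
One way to sketch stratified sampling, assuming each individual carries a label that identifies its stratum. The data, the `stratum_of` function, and the fixed per-stratum sample size are assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Group individuals into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: (class year, id) pairs
students = [("freshman", i) for i in range(20)] + [("senior", i) for i in range(15)]
sample = stratified_sample(students, lambda s: s[0], 3, seed=2)
print(sample)
```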

25
New cards

Cluster sample

selecting all individuals within a randomly selected collection of groups (clusters)
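
A sketch of cluster sampling, assuming the population is already divided into named groups. The cluster names and members below are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then keep every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

# Hypothetical clusters (ex: city blocks)
clusters = {
    "block_A": ["a1", "a2", "a3"],
    "block_B": ["b1", "b2"],
    "block_C": ["c1", "c2", "c3", "c4"],
}
sample = cluster_sample(clusters, 2, seed=3)
print(sample)
```

Unlike a stratified sample, only some groups are used, but everyone in a chosen group is included.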

26
New cards

Sampling without replacement

once an individual is chosen, they cannot be chosen again

27
New cards

Sampling with replacement

once an individual is chosen, they can be chosen again (go back into the pool)
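
The difference between the two schemes can be sketched with Python's standard `random` module; the pool of labels is hypothetical.

```python
import random

rng = random.Random(4)
pool = ["A", "B", "C", "D", "E"]

# Without replacement: each individual can appear at most once
without_replacement = rng.sample(pool, 3)

# With replacement: individuals go back into the pool and may repeat.
# Drawing 8 times from 5 individuals forces at least one repeat.
with_replacement = rng.choices(pool, k=8)

print(without_replacement)
print(with_replacement)
```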

28
New cards

Cross-sectional studies

observational study at a specific point in time

29
New cards

Case-control studies

observational study that is retrospective (looking back at previous actions compared to now)

30
New cards

Cohort Studies

observational study that follows a large group for a period of time (prospective = future)

31
New cards

Bias

when the results of the sample are not representative of the population

32
New cards

Sampling bias

when sampling tends to favor one part of the population leading to undercoverage/overcoverage of some groups

33
New cards

Nonresponse bias

when people don’t respond to a survey leading to missing possible data

34
New cards

Response bias

when people on a survey are not honest

35
New cards

Response bias: Interview error

interviewer must be trained to get truthful responses

36
New cards

Response bias: Misrepresented answers

questions result in responses that are untrue

37
New cards

Response bias: Wording of questions

questions must be balanced and worded neutrally

38
New cards

Response bias: Order of questions

responses that are affected by prior questions

39
New cards

Response bias: types of questions

open questions allow the respondent to choose their own response; closed questions limit the respondent's choices

40
New cards

Response bias: data entry error

type of nonsampling error, error in recording

41
New cards

Nonsampling error

result of undercoverage, nonresponse bias, response bias, or data-entry error

42
New cards

Sampling error

error that results from using a sample to estimate information about a population; occurs because a sample gives incomplete information about the population

43
New cards

Treatment

any combination of values of the factors of an experiment

44
New cards

Experimental unit/subject

well-defined item upon which a treatment is applied

45
New cards

Control group

baseline treatment, used to compare

46
New cards

Placebo

something that mimics the treatment but doesn’t actually include it (ex: a sugar pill); used to filter out effects caused by participants' expectations

47
New cards

Blinding

nondisclosure of treatment

48
New cards

Single-blind experiment

participant doesn’t know if they’re getting a placebo or the treatment

49
New cards

Double-blind experiment

neither the participant nor the researcher knows what the participant is receiving

50
New cards

Raw data

data that is not organized

51
New cards

Frequency distribution

list of each category of data and the number of occurrences for each (a count)

52
New cards

Relative frequency

percent of observations within a category (frequency/sum of all frequencies)

53
New cards

Relative frequency distribution

lists each category of data with relative frequency
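
As a small illustration of frequency and relative frequency distributions, here is a sketch using made-up survey responses.

```python
from collections import Counter

# Hypothetical raw data: favorite color reported by 10 people
responses = ["red", "blue", "red", "green", "blue",
             "red", "red", "blue", "green", "red"]

# Frequency distribution: count of each category
freq = Counter(responses)

# Relative frequency = frequency / sum of all frequencies
total = sum(freq.values())
rel_freq = {category: count / total for category, count in freq.items()}

for category in freq:
    print(category, freq[category], rel_freq[category])
```

The relative frequencies always add up to 1 (100%).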

54
New cards

Bar graph

graphical representation of a frequency distribution

55
New cards

Pareto chart

a bar graph where bars are drawn in decreasing order of frequency or relative frequency

56
New cards

Side-by-side bar graph

compares data for two different groups or time periods; should use relative frequencies because the groups may differ in size

57
New cards

Pie chart

sectors are proportional to frequencies of the categories

58
New cards

Classes

categories (intervals of values) into which data are grouped

59
New cards

Lower class limit

smallest value within class

60
New cards

Upper class limit

largest value in class

61
New cards

Class width

difference between consecutive lower class limits

62
New cards

Histogram

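graph of quantitative data with one bar per class, where the height of each bar is the frequency (or relative frequency) of the class; the bars touch each other because the classes are consecutive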

63
New cards

Convenience sampling

individuals in the sample are easily obtained

64
New cards

Self-selected/voluntary responses

individuals decide for themselves whether to participate, so the sample is often not representative (people with strong opinions are more likely to respond)

65
New cards

Multistage sampling

using more than one sampling method in large-scale surveys

66
New cards

Class midpoint

the sum of two consecutive lower class limits divided by 2

67
New cards

Cumulative frequency distribution

total number of observations that are less than or equal to the category (running count of all data)

68
New cards

Cumulative relative frequency distribution

percentage of observations less than or equal to the category (running count of percent of data)
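
The running counts described above can be sketched as follows; the classes and frequencies are invented for the example.

```python
from itertools import accumulate

# Hypothetical frequency distribution for four classes
classes = ["0-9", "10-19", "20-29", "30-39"]
freqs = [4, 7, 6, 3]

# Cumulative frequency: running total of the frequencies
cum_freq = list(accumulate(freqs))

# Cumulative relative frequency: running total divided by the overall count
cum_rel_freq = [c / cum_freq[-1] for c in cum_freq]

for row in zip(classes, freqs, cum_freq, cum_rel_freq):
    print(row)
```

The last cumulative frequency equals the total number of observations, and the last cumulative relative frequency equals 1.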

69
New cards

Time series data

data in which the value of a variable is measured at different points in time

70
New cards

Uniform distribution

the frequency of each value is evenly distributed (straight across)

71
New cards

Bell-shaped distribution

highest frequency is in the middle and frequencies tail off to the right and left (equally)

72
New cards

Skewed right distribution

tail to the right is longer than the tail to the left

73
New cards

Skewed left distribution

tail to the left is longer than the tail to the right

74
New cards

Dispersion

the degree to which the data is spread out

75
New cards

Population standard deviation

the square root of the sum of squared deviations about the population mean divided by the number of observations in the population (N)

  • larger = more varied

  • smaller = less varied

76
New cards

Sample standard deviation (s)

the square root of the sum of squared deviations about the sample mean divided by n – 1, where n is the sample size
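
The two standard deviation definitions differ only in the divisor (N versus n – 1). A minimal sketch with a made-up data set, checked against Python's `statistics` module:

```python
import math
import statistics

def population_sd(values):
    """sqrt( sum of squared deviations about the mean / N )"""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_sd(values):
    """sqrt( sum of squared deviations about the mean / (n - 1) )"""
    xbar = sum(values) / len(values)
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (len(values) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(population_sd(data))  # treats data as the whole population
print(sample_sd(data))      # treats data as a sample
print(statistics.pstdev(data), statistics.stdev(data))  # library equivalents
```

Squaring either result gives the corresponding variance.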

77
New cards

Range

Difference between max and min values

78
New cards

Variance

square of the standard deviation

79
New cards

Empirical rule (for bell-shaped curves)

  • approximately 68% of the data will fall within 1 standard deviation of the mean

  • approximately 95% of the data will fall within 2 standard deviations of the mean

  • approximately 99.7% of the data will fall within 3 standard deviations of the mean
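
To see the empirical rule as concrete intervals, here is a sketch that assumes a bell-shaped distribution with a hypothetical mean of 100 and standard deviation of 15.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals expected to hold roughly 68%, 95%, and 99.7% of the data."""
    return {pct: (mean - k * sd, mean + k * sd)
            for k, pct in [(1, 68), (2, 95), (3, 99.7)]}

intervals = empirical_rule_intervals(100, 15)
for pct, (low, high) in intervals.items():
    print(f"about {pct}% of data between {low} and {high}")
```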

80
New cards

Chebyshev’s Inequality

for any data set or distribution, regardless of shape, at least (1 – 1/k²)·100% of the observations lie within k standard deviations of the mean, where k > 1
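
A short sketch of the standard form of Chebyshev's Inequality: at least 1 – 1/k² of the observations lie within k standard deviations of the mean, for any k > 1. The function name is chosen here for illustration.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

print(chebyshev_min_fraction(2))  # 0.75, so at least 75% within 2 SDs
print(chebyshev_min_fraction(3))  # about 0.889, so at least 88.9% within 3 SDs
```

Compare with the empirical rule: Chebyshev gives a weaker guarantee (75% versus about 95% for k = 2) but applies to every distribution, not only bell-shaped ones.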

81
New cards

Z-score

(data point - mean)/standard deviation
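
As a quick worked example of the z-score formula, assuming a hypothetical mean of 100 and standard deviation of 15:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

print(z_score(130, 100, 15))  # 2.0: two standard deviations above the mean
print(z_score(85, 100, 15))   # -1.0: one standard deviation below the mean
```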

82
New cards

Percentile

the kth percentile, P(k), is the value such that k percent of the observations are less than or equal to it

83
New cards

Quartiles

Q1 = 25% of the data is less than this = 25th percentile

Q2 = 50% of the data is less than this = 50th percentile

Q3 = 75% of the data is less than this = 75th percentile

84
New cards

IQR

interquartile range: the range of the middle 50% of observations -> Q3 - Q1

85
New cards

Fences

cutoff values for determining outliers

    Lower fence: Q1 - 1.5(IQR)

    Upper fence: Q3 + 1.5(IQR)
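
The sketch below computes quartiles, the IQR, and the fences for a made-up data set, using the standard convention that the lower fence is Q1 - 1.5(IQR) and the upper fence is Q3 + 1.5(IQR). It assumes the "median of each half" method for quartiles; other textbooks and software use slightly different conventions and may give slightly different values.

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (median excluded if n is odd)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return statistics.median(lower), statistics.median(ordered), statistics.median(upper)

def fences(q1, q3):
    """Cutoffs for outliers: (lower fence, upper fence)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical data
q1, median, q3 = quartiles(data)
lower_fence, upper_fence = fences(q1, q3)
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3)            # 4 6.5 9
print(lower_fence, upper_fence)  # -3.5 16.5
print(outliers)                  # [30]
```

Together with the minimum and maximum, the three quartile values give the 5 number summary for this data: 1, 4, 6.5, 9, 30.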

86
New cards

5 Number Summary

min, Q1, M, Q3, max

87
New cards

Boxplot

Number line long enough to include the max and min values, with a box drawn from Q1 to Q3 and vertical lines at Q1, M, and Q3

  • Upper and lower fences labeled

  • Whiskers: lines from Q1 to the smallest value that is not below the lower fence, and from Q3 to the largest value that is not above the upper fence

  • Outliers marked with an asterisk

  • Median is near the middle of the box if the distribution is roughly symmetric

88
New cards

Explanatory Linear (positive)

increase in x -> increase in y

89
New cards

Explanatory Linear (negative)

increase in x -> decrease in y

90
New cards

Explanatory Nonlinear

some pattern, but not linear

91
New cards

Explanatory No Relation

almost random

92
New cards

Positive association

increase in x -> increase in y

93
New cards

Negative association

increase in x -> decrease in y

94
New cards

Linear correlation coefficient + rules

measure of the strength and direction of the linear relationship between two quantitative variables

  1. The linear correlation coefficient is always between –1 and 1, inclusive. That is, –1 ≤ r ≤ 1.

  2. If r = +1, then a perfect positive linear relation exists between the two variables.

  3. If r = –1, then a perfect negative linear relation exists between the two variables.

  4. The closer r is to +1, the stronger is the evidence of positive association between the two variables.

  5. The closer r is to –1, the stronger is the evidence of negative association between the two variables.
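
A minimal sketch of how the linear correlation coefficient r can be computed from paired data, using the usual formula based on deviations from the means. The data sets are made up to show a perfect positive relation, a perfect negative relation, and something in between.

```python
import math

def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_perfect_pos = correlation(xs, [2, 4, 6, 8, 10])  # points on a rising line
r_perfect_neg = correlation(xs, [10, 8, 6, 4, 2])  # points on a falling line
r_moderate = correlation(xs, [2, 1, 4, 3, 5])      # rising trend with scatter

print(r_perfect_pos, r_perfect_neg, r_moderate)
```

The three results are approximately 1, -1, and 0.8, matching rules 2 through 4.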

95
New cards

Line of best fit

the line that best describes the relation between two variables; the least-squares regression line is the line that minimizes the sum of the squared residuals

96
New cards

Residual

the difference between the observed value of y and the predicted value of y
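
To illustrate residuals, the sketch below fits a least-squares line to a small made-up data set and computes observed y minus predicted y for each point. Using the least-squares line as the predictor is an assumption of this example.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]  # hypothetical explanatory values
ys = [2, 1, 4, 3, 5]  # hypothetical observed responses

slope, intercept = least_squares_line(xs, ys)  # about 0.8 and 0.6

# Residual = observed y - predicted y
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(slope, intercept)
print(residuals)
```

A positive residual means the point lies above the line; a negative one means it lies below. For a least-squares line the residuals sum to zero, up to rounding.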

97
New cards

Scope of the model

the range of values of the explanatory variable over which the model can reasonably be used; predictions outside the range of the observed data may not make sense