UCI Stats 7 Terms + Formulas


68 Terms

1

Statistics

a collection of procedures and principles for gathering data and analyzing information to help people make decisions when faced with uncertainty

2

Dotplot

graphs a dot for each data value on a number line. easy to see individual data values, easy to make, but gets cluttered with large sample size

3

Summary Statistics

statistics that summarize a great deal of numerical information about a distribution, such as the mean and the standard deviation

4

Five-Number Summary

minimum, Q1, median, Q3, maximum
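
A minimal sketch of how the five numbers could be computed in Python. It assumes the quartile convention used on the Lower Quartile and Upper Quartile cards (median of each half, leaving out the middle value when the count is odd); other textbooks and software may use slightly different conventions. The function name and the scores are made up for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of at least two numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # lower half (middle value left out when n is odd)
    upper = data[half + n % 2:]    # upper half (middle value left out when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [62, 70, 71, 75, 78, 80, 84, 88, 95]   # hypothetical exam scores
print(five_number_summary(scores))              # (62, 70.5, 78, 86.0, 95)
```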

5

Data

plural word referring to numbers or non-numerical labels collected from a set of entities (people, cities, etc)

6

Median

middle value of an ordered numerical list (the average of the two middle values when the list has an even number of values)

7

Lower Quartile

median of the lower half of an ordered numerical list (Q1)

8

Upper Quartile

median of the upper half of an ordered numerical list (Q3)

9

Rate

the number of times something occurs per number of opportunities for it to occur
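
As a rough illustration, the definition is a single division: occurrences over opportunities. The function name and the counts below are hypothetical.

```python
def rate(occurrences, opportunities):
    """Number of times something occurs per opportunity for it to occur."""
    return occurrences / opportunities

# hypothetical example: 12 occurrences among 4,000 opportunities
r = rate(12, 4000)
print(f"{r:.4f} per opportunity, or {r * 1000:.1f} per 1,000")
```

Rates are often rescaled (per 1,000, per 100,000) so that small values are easier to read.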

10

risk

likelihood of a bad outcome that can be estimated using the past rate for that outcome

11

base rate/baseline risk

the rate/risk at a beginning time period or under specific conditions

12

Random Sample

subset of the population selected so that every individual has a specified probability of being part of the sample
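
One way to picture this is a simple random sample, the special case where every individual has the same chance of being chosen. The sketch below uses Python's standard `random` module; the population labels, sample size, and seed are made up.

```python
import random

# hypothetical population of 100 individuals
population = [f"student_{i}" for i in range(1, 101)]

rng = random.Random(7)                # fixed seed so the example is repeatable
sample = rng.sample(population, 10)   # 10 distinct individuals, each equally likely to be picked
print(sample)
```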

13

Population of Interest

collection of all individuals about which information is desired

14

Sample Survey

a survey where investigators gather opinions or other information from each individual included in the sample

15

Poll

investigators gather opinions or other information from each individual included in the sample

16

Margin of Error

a number added to and subtracted from the sample information to produce an interval that is 95% certain to contain the true value for the population
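
The card does not give a formula. A common approximation for a sample proportion is the conservative 95% margin of error, 1/sqrt(n), where n is the sample size; treat that formula as an assumption here, not something stated on the card. The poll numbers below are hypothetical.

```python
import math

def conservative_margin_of_error(n):
    """Approximate 95% margin of error for a sample proportion (assumed formula: 1 / sqrt(n))."""
    return 1 / math.sqrt(n)

def interval(sample_proportion, n):
    """Add and subtract the margin of error to get an interval around the sample value."""
    moe = conservative_margin_of_error(n)
    return (sample_proportion - moe, sample_proportion + moe)

# hypothetical poll: 55% of 1,600 respondents favor a proposal
low, high = interval(0.55, 1600)
print(f"margin of error: {conservative_margin_of_error(1600):.3f}")
print(f"interval: {low:.3f} to {high:.3f}")
```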

17

Margin of Sampling Error

margin of error in polls, term used to distinguish it from other sources of errors and biases that can distort results

18

Nonparticipation bias (nonresponse bias)

many people who are selected for the sample do not respond at all or skip key survey questions. those who do participate tend to be people who feel strongly about the issues.

19

Self-Selected Sample (Volunteer Sample)

sample made up of people who choose to participate, rather than being randomly selected

20

Observational Study

a study in which participants are merely observed and measured

21

Variable

a characteristic that differs from one individual to the next. may be numerical or categorical

22

Confounding variables

variable that is not the main concern of the study but may be partially responsible for the observed results

23

Randomized Experiment

study in which treatments are randomly assigned to participants

24

Treatment

specific regimen or procedure assigned to participants by the experimenter

25

Random Assignment

each participant has a specified probability of being assigned to each treatment
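
A minimal sketch of one way random assignment could be carried out: shuffle the participants, then deal them out to the treatments in turn so the groups end up nearly equal in size. The function name, participant labels, and treatment names are hypothetical.

```python
import random

def randomly_assign(participants, treatments, seed=None):
    """Shuffle participants, then deal them out to the treatments in rotation."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, person in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(person)
    return groups

participants = [f"P{i}" for i in range(1, 9)]   # 8 hypothetical participants
groups = randomly_assign(participants, ["drug", "placebo"], seed=1)
print(groups)
```

Because the shuffle decides the order, each participant is equally likely to land in either group.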

26

Placebo

pill or treatment designed to look like active treatment but with no active ingredients

27

Statistically Significant

a difference large enough to be unlikely to have occurred in the sample if there were no relationship or difference in the population. does not necessarily have practical significance or importance

28

Practical Significance (Practical Importance)

a difference that is large enough to actually matter in practice, not just statistically significant

29

Multiple Testing (Multiple Comparisons)

refers to the fact that researchers often test many different hypotheses in the same study

30

false positive (data snooping)

when researchers make many comparisons, some findings can turn out statistically significant by chance alone, even though no real relationship exists
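
A rough calculation shows why this happens. Assume each test has a 5% chance of a false positive when nothing is really going on, and assume the tests are independent; both assumptions are for illustration and are not stated on the card. The chance of at least one false positive among k tests is then 1 - (1 - 0.05)^k.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests, each with false-positive rate alpha."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 20):
    print(k, round(chance_of_false_positive(k), 3))
# 1 0.05
# 5 0.226
# 20 0.642
```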

31

Process of Discovery

1. asking the right questions

2. collecting useful data, which includes deciding how much is needed

3. summarizing and analyzing data, with the goal of answering the questions

4. making decisions and generalizations based on observed data

5. turning the data and subsequent decisions into new knowledge

32

Raw Data

numbers and category labels that have been collected but have not yet been processed in any way

33

Observational Unit

single individual entity (ex. a person) in a study

34

Observation

individual measurement of an observational unit

35

Sample Size

total number of observational units

36

Dataset

complete set of raw data

37

Census

data is collected from all members of a population

38

Sample Data

collected when measurements are taken from a subset of a population

39

Population Data

collected when all individuals in a population are measured

40

Statistic

summary measure of sample data

41

Parameter

summary measure of population data

42

Categorical Variable

data consisting of group or category names. categories need not have a logical ordering

43

Ordinal Variable

categorical variables that can be ordered (ex. drink sizes from small to large)

44

Quantitative Variable (Measurement Variable/Numerical Variable)

data consisting of numerical measurements or counts. does not include numbers used only as labels (ex. Social Security numbers)

45

Continuous variable

every value within some interval is a possible response. does not skip numbers, even the ones with really long and ugly decimals

46

Explanatory Variable

independent variable (x). helps explain the response variable but does not necessarily cause it

47

Response Variable (Outcome Variable)

dependent variable (y)

48

Distribution

describes how often possible responses occur

49

Frequency distribution

used for categorical variables, lists frequencies (how often it occurs) for all categories

50

Relative frequency distribution

lists the same categories as a frequency distribution but reports percentages/proportions instead of counts
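
A short sketch of both kinds of distribution for one categorical variable, using `collections.Counter` from the Python standard library. The function names and the drink-size data are made up for illustration.

```python
from collections import Counter

def frequency_distribution(labels):
    """Count how often each category occurs."""
    return dict(Counter(labels))

def relative_frequency_distribution(labels):
    """Proportion of observations in each category (proportions sum to 1)."""
    counts = Counter(labels)
    total = len(labels)
    return {category: count / total for category, count in counts.items()}

sizes = ["small", "large", "medium", "small", "small", "medium", "large", "small"]
print(frequency_distribution(sizes))            # {'small': 4, 'large': 2, 'medium': 2}
print(relative_frequency_distribution(sizes))   # {'small': 0.5, 'large': 0.25, 'medium': 0.25}
```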

51

Pie Chart

used for a single categorical variable if there are not too many categories

52

Bar Graphs

summarizes one or two categorical variables, useful for making comparisons for two variables

53

Three Summary Characteristics

location, spread, shape

54

Location

describes the center, average (either mean or median)

55

Spread

describes variability (either standard deviation or IQR)
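
A small sketch computing the two measures of location (mean, median) and the two measures of spread (standard deviation, IQR) for one made-up list. It assumes the sample standard deviation (dividing by n - 1) and takes the quartiles as medians of each half of the ordered list; the `iqr` helper is a hypothetical name.

```python
from statistics import mean, median, stdev

def iqr(values):
    """Interquartile range: Q3 - Q1, with quartiles taken as medians of each half."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]
    upper = data[n // 2 + n % 2:]
    return median(upper) - median(lower)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values
summary = {
    "mean": mean(data),
    "median": median(data),
    "standard deviation": stdev(data),   # sample standard deviation (divides by n - 1)
    "IQR": iqr(data),
}
print(summary)
```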

56

Shape

overall pattern of the distribution as seen in a graph (ex. symmetric, skewed, unimodal, bimodal)

57

Outlier

values that are unusually large or small

58

Histogram

similar to bar graph, can be used for any number of data values, good for large sets of data, flexibility with intervals, not informative when sample size is small

59

Stem-and-leaf plot

presents all individual values. bad for large sample sizes, restricted in choices for intervals
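
A minimal sketch of how such a plot could be built as text. It assumes non-negative whole numbers, with the tens part as the stem and the ones digit as the leaf; the function name and the scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf plot (stem = tens, leaf = ones digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # include empty stems so gaps in the data stay visible
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines

scores = [47, 62, 65, 70, 71, 75, 78, 80, 84, 95]   # made-up exam scores
print("\n".join(stem_and_leaf(scores)))
```

Every original value can be read back from the plot, which is why it gets unwieldy for large samples.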

60

boxplot (box and whisker plot)

displays the information given in a five-number summary. good for seeing location, spread, skewness, and outliers, and for comparing groups. not good for judging detailed shape (ex. number of peaks).

61

Skewed to the right

description of a shape where data values are concentrated at the left of the graph, with a long tail extending to the right

62

Skewed to the left

description of a shape where data values are concentrated at the right of the graph, with a long tail extending to the left

63

Mode

most frequent value in a data set

64

Unimodal

one peak in the graph

65

Bimodal

two peaks in the graph

66

Percentile

number that has __% of the data values at or below it
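
To make the blank concrete, the sketch below counts what percent of the data values are at or below a given number. The function name and the scores are made up.

```python
def percent_at_or_below(values, x):
    """Percent of data values that are less than or equal to x."""
    return 100 * sum(1 for v in values if v <= x) / len(values)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # hypothetical scores
print(percent_at_or_below(scores, 85))               # 70.0
```

Here 7 of the 10 scores are at or below 85, so 85 sits at the 70th percentile of this list.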

67

Empirical Rule

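for bell-shaped (approximately normal) data: about 68% of values fall within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations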
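
As an illustration, the three intervals can be written out for a given mean and standard deviation. The function name and the numbers (mean 100, standard deviation 15) are hypothetical, and the rule only applies when the data are roughly bell-shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Map each approximate percentage to the interval mean +/- k standard deviations."""
    return {pct: (mean - k * sd, mean + k * sd) for k, pct in ((1, 68), (2, 95), (3, 99.7))}

for pct, (low, high) in empirical_rule_intervals(100, 15).items():
    print(f"about {pct}% of values between {low} and {high}")
# about 68% of values between 85 and 115
# about 95% of values between 70 and 130
# about 99.7% of values between 55 and 145
```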

68

Z-score

number of standard deviations a value lies above or below the mean: z = (value - mean) / standard deviation
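
A quick worked example of the calculation, using made-up numbers (mean 100, standard deviation 15). The function name is hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(130, 100, 15))   # 2.0  (two standard deviations above the mean)
print(z_score(85, 100, 15))    # -1.0 (one standard deviation below the mean)
```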
