STAT 100 Exam 1 (Units 1 & 2)


Last updated 9:13 PM on 4/7/26

46 Terms

1

Statistics

The science of variability.

2

What are the 3 short statistical sayings?

  • What was compared?

  • Who’s not here?

  • Incorporate “ish”-ness

3

Data

A representation of someone or something.

4

Tidy Data

A way of mapping the real world to a dataset.

  • Observations & attributes

5

Observations

Map to rows, & are the things we are interested in.

6

Attributes

Map to columns, & are the pieces of information we are interested in.
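
As a minimal sketch of tidy data, the Python snippet below stores a small made-up dataset (the student names and values are hypothetical, not from the course) as one dict per observation, with one key per attribute.

```python
# Hypothetical tidy dataset: one dict per observation (row),
# one key per attribute (column).
students = [
    {"name": "Ana", "height_cm": 162, "year": "Freshman"},
    {"name": "Ben", "height_cm": 175, "year": "Junior"},
    {"name": "Cy", "height_cm": 168, "year": "Freshman"},
]

# Observations map to rows: count them.
n_observations = len(students)

# Attributes map to columns: list them from the first row.
attributes = list(students[0].keys())

# Pull one column: every observation's value for a single attribute.
heights = [row["height_cm"] for row in students]

print(n_observations, attributes, heights)
```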

7

Measures

The way in which we collect information about observations.

  • Quantitative, categorical, rating scale, or time series data

8

Quantitative Data

Refers to data in which the values of an attribute for an observation are numbers representing a quantity of something.

9

Categorical Data

Refers to data in which the values of an attribute for an observation are selected from a set of different category labels.

10

Rating Scale Data (Ordinal)

A special type of categorical data in which the values of an attribute for an observation are selected from a predetermined rating scale.

11

Time Series Data

Refers to data in which the values of an attribute for an observation indicate a moment in time (such as year, month, or day).

12

Reliability

Refers to the extent to which the data you collect from a measure truly represents & reflects the real world characteristics of the observation.

13

Data Validation

The act of ensuring that the values collected from each observation for each attribute are reliable.

14

Univariate Analysis

The analysis of a single attribute or variable at a time.

15

Standard Deviation

A measure that tells us how spread out observations are from one another.

  • Rule of thumb: SD ≈ (width of the middle 95% of values) / 4
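
The rule of thumb above can be checked with a short sketch. The data here are simulated and purely illustrative (a bell-shaped distribution with an assumed mean of 100 and SD of 15); the shortcut is only expected to land near the exact SD when the data are roughly bell-shaped.

```python
from statistics import NormalDist, quantiles, stdev

# Hypothetical data: 1,000 simulated values from a bell-shaped distribution
# with mean 100 and SD 15 (illustrative numbers only).
values = NormalDist(mu=100, sigma=15).samples(1000, seed=1)

# Exact sample standard deviation.
sd = stdev(values)

# Rule of thumb: width of the middle 95% of the values, divided by 4.
# quantiles(..., n=40) returns 39 cut points; the first is the 2.5th
# percentile and the last is the 97.5th percentile.
cuts = quantiles(values, n=40)
sd_rule_of_thumb = (cuts[-1] - cuts[0]) / 4

print(round(sd, 1), round(sd_rule_of_thumb, 1))
```

The two printed numbers should be close to each other, though not identical.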

16

Five Number Summary

Contains five numbers that help statisticians & data scientists understand the different values that the different observations have for an attribute.

  • min, Q1, median, Q3, max

17

Minimum

The smallest value that any observation has for the attribute.

18

First Quartile (Q1)

The 25th percentile value. 25% of observations have a value below the first quartile.

19

Median (Q2)

The 50th percentile value. Half of all observations have a value below the median.

20

Third Quartile (Q3)

The 75th percentile value. 75% of all observations have a value below the third quartile.

21

Maximum

The largest value that any observation has for the attribute.
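
The five numbers defined in the cards above (min, Q1, median, Q3, max) can be computed with Python's standard library. This is a sketch on made-up quiz scores; note that software packages use slightly different quartile conventions, so Q1 and Q3 may differ a little from a hand calculation.

```python
from statistics import median, quantiles

# Hypothetical quiz scores for nine students.
scores = [4, 7, 8, 10, 12, 13, 15, 18, 20]

# quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
q1, q2, q3 = quantiles(scores, n=4)

five_number_summary = {
    "min": min(scores),
    "Q1": q1,
    "median": median(scores),
    "Q3": q3,
    "max": max(scores),
}
print(five_number_summary)
```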

22

Frequencies

The total number of observations whose response is equal to a particular value.

23

Relative Frequencies

The percentage of all observations without missing values whose value is equal to a particular response.
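
A short sketch of frequencies and relative frequencies, using hypothetical survey responses in which `None` stands for a missing value. Missing values are dropped before computing percentages, matching the definition above.

```python
from collections import Counter

# Hypothetical survey responses; None marks a missing value.
responses = ["Yes", "No", "Yes", None, "Yes", "No", "Maybe", None]

# Drop missing values before counting.
answered = [r for r in responses if r is not None]

# Frequency: number of observations with each response.
frequencies = Counter(answered)

# Relative frequency: percentage of non-missing observations.
relative_frequencies = {
    value: 100 * count / len(answered) for value, count in frequencies.items()
}
print(frequencies)
print(relative_frequencies)
```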

24

Dot Plot

A graph where each observation is displayed as one point on the graph.

  • The height of a stack of dots tells you the frequency of a particular value.

25

Density Plot

Very similar to a dot plot with a line drawn across the top of all of the stacks of dots.

26

Word Cloud

Graphically depicts all of the words across all of the responses from all of the observations. The size of the word varies by the frequency of the word.

27

Bar Graph

Based on a frequency table, & has one bar for every response option.

  • The height is equal to each response option’s frequency

28

Aggregate Characteristics

The characteristics of a group of observations.

29

For text, rating scale, & categorical data, what are the main aggregate characteristics we focus on?

The frequencies & percentages of each of the response options.

30

For quantitative data, what are the main aggregate characteristics we focus on?

  • Shape of the data

  • Spread of the data

  • Location of the data

31

Distribution

The pattern that the responses from all the observations make.

32

What are the common shapes that we often see in quantitative distributions?

  • Normal distribution

  • Skewed distribution

  • Multi-Modal distribution

33

Normal Distribution

A bell-curve shape

  • Most of the observations have a value near the average

  • Approximately 95% of the observations will have a value within two standard deviations of the mean
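
The "approximately 95%" figure can be verified for an ideal normal distribution with the standard library's `NormalDist`. This is a sketch for the theoretical bell curve; real data will only match it approximately.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1).
z = NormalDist(mu=0, sigma=1)

# Share of observations within two standard deviations of the mean.
within_two_sd = z.cdf(2) - z.cdf(-2)

print(round(within_two_sd, 3))  # about 0.954
```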

34

Skewed Distributions

Look like they’ve had one side stretched out.

  • Right-skewed distributions look like the right side of a normal distribution has been stretched, which indicates that some observations have very large values.

  • Left-skewed distributions look like the left side of a normal distribution has been stretched, which indicates that some observations have very small values.

35

Multi-Modal Distributions

Multiple peaks.

  • Often seen when there are actually group differences in the attribute.

36

Key to think statistically:

Focusing on how each observation varies.

37

Two-Way Table

Similar to a frequency table, except that one attribute’s frequencies are presented as different rows in the table, & a second attribute’s frequencies are presented as columns.

38

Column Percents

Relative frequencies based only on the total from a single column.

  • Statisticians use column percents to compare the distribution of a categorical or rating scale attribute between two groups.
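
A minimal sketch of a two-way table with column percents, using made-up (group, response) pairs. Here the groups are treated as the columns, so each cell is divided by its own group's total.

```python
from collections import Counter

# Hypothetical observations: (group, response) pairs.
observations = [
    ("Freshman", "Yes"), ("Freshman", "Yes"), ("Freshman", "No"),
    ("Freshman", "No"), ("Senior", "Yes"), ("Senior", "Yes"),
    ("Senior", "Yes"), ("Senior", "No"),
]

# Two-way table of frequencies: one count per (group, response) cell.
counts = Counter(observations)

# Column totals: number of observations in each group.
column_totals = Counter(group for group, _ in observations)

# Column percent: cell frequency divided by its column's total.
column_percents = {
    (group, response): 100 * n / column_totals[group]
    for (group, response), n in counts.items()
}
print(column_percents)
```

In this toy data, 50% of freshmen and 75% of seniors answered "Yes", which is the kind of between-group comparison column percents are meant for.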

39

Line Graph

Places time on the horizontal (x) axis, & the average value of an attribute or a percentage on the vertical axis.

40

Ratio of Standard Deviation

Equal to the larger of the two groups’ standard deviations divided by the smaller.

  • used to compare the spread of the distribution between two groups

  • If the ratio is approximately 3 or higher, we say that one distribution is more spread out than the other.
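
The ratio can be sketched as follows for two hypothetical groups. The cutoff of 3 is taken from the card above; the data are invented for illustration.

```python
from statistics import stdev

# Hypothetical values for two groups.
group_a = [10, 12, 14, 16, 18]
group_b = [-2, 6, 14, 22, 30]

sd_a, sd_b = stdev(group_a), stdev(group_b)

# Larger SD divided by smaller SD, so the ratio is always 1 or more.
sd_ratio = max(sd_a, sd_b) / min(sd_a, sd_b)

# Cutoff from the card: a ratio of about 3 or higher means one
# distribution is more spread out than the other.
more_spread_out = sd_ratio >= 3

print(round(sd_ratio, 2), more_spread_out)
```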

41

Effect Size

Equal to the difference in the means, divided by the larger standard deviation.

  • Used to compare the relative difference in the means between two groups.

  • 0.10 or less = no difference in the averages between groups.

  • Between 0.10 and 0.25 = small difference in the averages between groups.

  • 0.75 or more = large difference in the averages between groups.
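
A sketch of the effect size calculation on made-up exam scores. Two assumptions are worth flagging: the absolute value of the difference is used so the result is never negative, and the `describe` helper is hypothetical, since the card gives no label for values between 0.25 and 0.75.

```python
from statistics import mean, stdev

# Hypothetical exam scores for two groups.
group_a = [70, 75, 80, 85, 90]
group_b = [74, 79, 84, 89, 94]

# Difference in the means (absolute value is an assumption here).
difference = abs(mean(group_a) - mean(group_b))

# Divide by the larger of the two standard deviations.
larger_sd = max(stdev(group_a), stdev(group_b))
effect_size = difference / larger_sd


def describe(es):
    """Label an effect size using the cutoffs listed on the card."""
    if es <= 0.10:
        return "no difference"
    if es <= 0.25:
        return "small difference"
    if es >= 0.75:
        return "large difference"
    # The card does not name this range.
    return "between the small and large cutoffs"


print(round(effect_size, 2), describe(effect_size))
```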

42

Grouped Bar Graph

Shows a separate bar graph for each response option of the second attribute in the two-way table.

43

Grouped Density Plots

Have one density curve for a quantitative attribute for each group all on the same plot.

44

Scatter Plot

A plot in which each observation is placed as a point on a graph according to its values for two quantitative attributes.

  • usually the causal factor goes on the horizontal (x) axis

45

Smoothed Trend Line

A line through the average value of the vertical axis attribute across all values of the horizontal axis attribute.

46

Correlation Coefficient

A statistic that summarizes how related two attributes are to each other.

  • Close to 0 = no association between the attributes.

  • Close to -1 = strong negative association between the attributes.

  • Close to +1 = strong positive association between the attributes.
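
The card does not name a specific formula, so the sketch below assumes the usual Pearson correlation coefficient. The study-hours and score data are invented to show one strongly positive and one strongly negative association.

```python
from math import sqrt
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours studied and two possible sets of scores.
hours = [1, 2, 3, 4, 5]
score_up = [52, 60, 71, 79, 90]    # scores rise with hours
score_down = [90, 79, 71, 60, 52]  # scores fall with hours

r_pos = correlation(hours, score_up)
r_neg = correlation(hours, score_down)

print(round(r_pos, 3), round(r_neg, 3))
```

The first value should be close to +1 and the second close to -1.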