Stats Exam 1


62 Terms

1
New cards

Numeric/Quantitative

Consists of numbers representing counts or measurements. These data types can be further classified into discrete and continuous variables.

2
New cards

Continuous

When the data can have infinitely many numeric values, where the collection of values is not countable, and can include fractions or decimals.

3
New cards

Discrete

When the data values are numeric and the number of possible values is countable (finite or countably infinite).
These values often represent whole numbers, such as counts of items or occurrences.

4
New cards

What is an example of a Discrete data measurement?

Number of bananas in a bowl

5
New cards

Categorical/Qualitative

Data that consists of names or labels such as Gender, Traits and Names of places on Earth
(They can’t be numbers representing counts or measurements)

6
New cards

Nominal

A type of categorical data that represents different categories or groups without any intrinsic ordering, such as gender, race, or the names of countries.

7
New cards

Ordinal

A type of categorical data that represents categories with a specific order or ranking, such as levels of education or satisfaction ratings.

8
New cards

An example of an Ordinal data point is

A satisfaction rating, such as on a scale from 1 to 5.

9
New cards

Population

The entire group of individuals or items that we want to draw conclusions about in a statistical study. (can be infinite or finite)

10
New cards

Sample

A subset of the population selected for analysis in a statistical study, used to make inferences about the larger group.

11
New cards

Sample Size

The number of observations or data points included in a sample, essential for achieving accurate and reliable statistical analysis.

12
New cards

What are the population and the sample in this reading?

A research study in 2025 titled “Banana bowl problem solved” talks about the maximum number of bananas a 10 inch diameter bowl could hold. Researchers tested 1000 different bowls filled with various different bananas to find the optimal packing size

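The Population is all 10 inch diameter bowls filled with bananas. The Sample is the 1000 bowls that the researchers tested. (The maximum number of bananas is the quantity being measured, not the population.)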

13
New cards

Parameter

A numerical quantity that describes a characteristic of the population.
- This typically talks about everyone, all, or a known fact that doesn’t come from a sample (EX: there are 50 state capitals in the US)
- An example of this would be a proportion of all US residents who support the death penalty

14
New cards

Statistic

A numerical quantity that describes a characteristic of a sample
- This talks about a survey, study, or sample
- An example of this would be the proportion of surveyed US residents who support the death penalty (some residents, not all)

15
New cards

Difference between a Parameter and a Statistic

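A Parameter describes the entire population (talks about ALL)
A Statistic describes only a sample (talks about a GROUP taken from ALL) and is used to estimate the Parameter
Example: the proportion of all US residents who support the death penalty is a Parameter; the proportion among the residents who answered a survey is a Statistic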
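
As a rough sketch of the distinction, the snippet below builds a made-up population of 10,000 residents in which exactly 60% support a policy, then draws a sample of 500. The population, the 60% figure, and the sample size are all assumptions chosen for illustration.

```python
import random

# Hypothetical population: 10,000 residents, True = supports the policy
population = [True] * 6000 + [False] * 4000

# Parameter: describes ALL residents (normally unknown in practice)
parameter = sum(population) / len(population)

# Statistic: describes only the residents who were sampled
rng = random.Random(2025)  # fixed seed so the draw is repeatable
sample = rng.sample(population, k=500)
statistic = sum(sample) / len(sample)

print("parameter:", parameter)
print("statistic:", statistic)
```

The statistic changes from sample to sample, while the parameter stays fixed.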

16
New cards

Variable

Any characteristic, number, or quantity that can be measured or counted.
(In a data set these are the Columns)

17
New cards

Data

Numerical or qualitative descriptions of the objects that we want to study

18
New cards

Experimental unit/Case

The object or thing that is being tested or having data collected on it
These are generally the Rows in a data set

19
New cards

Simple Random Sampling

A type of sampling in which every individual in the population has an equal chance of being selected
- Lottery method

- RNG (random number generator)
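
A minimal sketch of the RNG approach, assuming a made-up population of 100 ID numbers and a sample size of 10:

```python
import random

# Hypothetical population: ID numbers 1 through 100
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the draw is repeatable
# random.sample draws without replacement, so each individual
# is equally likely to be chosen and nobody is chosen twice
srs = rng.sample(population, k=10)
print(srs)
```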

20
New cards

Stratified Sampling

A type of sampling that involves dividing a population into similar subgroups (called strata) then randomly selecting samples from each stratum
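
One way to sketch this, assuming hypothetical strata based on class year and a 10% sample from each stratum:

```python
import random

# Hypothetical population already divided into strata
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "sophomore": [f"S{i}" for i in range(30)],
    "junior": [f"J{i}" for i in range(20)],
    "senior": [f"R{i}" for i in range(10)],
}

rng = random.Random(0)  # fixed seed so the draw is repeatable
# Take a simple random sample of 10% from EVERY stratum
stratified = {
    name: rng.sample(members, k=len(members) // 10)
    for name, members in strata.items()
}
for name, picked in stratified.items():
    print(name, picked)
```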

21
New cards

Cluster Sampling

A type of sampling where a population is divided into groups (clusters), then a random selection of these groups is chosen and every individual in the chosen groups is sampled
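
A small sketch, assuming six made-up dorms (the clusters) with five students each, of which two dorms are chosen:

```python
import random

# Hypothetical clusters: six dorms with five students each
clusters = {
    f"dorm_{d}": [f"dorm_{d}_student_{i}" for i in range(5)]
    for d in "ABCDEF"
}

rng = random.Random(1)  # fixed seed so the draw is repeatable
# Randomly choose whole clusters, not individuals
chosen = rng.sample(sorted(clusters), k=2)
# Everyone inside a chosen cluster is sampled
cluster_sample = [person for name in chosen for person in clusters[name]]
print(chosen)
print(cluster_sample)
```

Compare with stratified sampling, which samples a few individuals from every group rather than everyone from a few groups.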

22
New cards

Systematic Sampling

A type of sampling where the sample members from a population are chosen at a fixed interval

- Example: Select every fourth dog at a park (the "four" is called the sampling interval, calculated by dividing the population size by the desired sample size)
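
A sketch of the dog example, assuming a hypothetical park with 100 dogs and a desired sample of 25. The random starting point is an added assumption (a common practice, not stated on the card).

```python
import random

# Hypothetical population: 100 dogs numbered 1 through 100
population = list(range(1, 101))
sample_size = 25

# Sampling interval = population size / desired sample size
interval = len(population) // sample_size  # 100 // 25 = 4

rng = random.Random(7)  # fixed seed so the draw is repeatable
start = rng.randrange(interval)  # assumed: random start within the first interval
systematic = population[start::interval]  # every fourth dog after the start
print(interval, systematic)
```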

23
New cards

What is the difference between an Experimental Study and an Observational Study?

In an Experimental Study, the researcher manipulates a certain factor or treatment

In an Observational Study, the researcher can only observe, without making the sample do anything

24
New cards

What are the cons of an Observational Study?

25
New cards

Randomization

Experimental units are assigned to treatments by chance

This is good because it reduces confounding and bias
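
A minimal sketch of random assignment, assuming 20 hypothetical plants split evenly between a treatment group and a control group:

```python
import random

# Hypothetical experimental units
units = [f"plant_{i}" for i in range(1, 21)]

rng = random.Random(3)  # fixed seed so the assignment is repeatable
shuffled = units[:]
rng.shuffle(shuffled)  # chance alone decides the order

half = len(shuffled) // 2
treatment = shuffled[:half]  # first half gets the treatment
control = shuffled[half:]    # second half is the control group
print(treatment)
print(control)
```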

26
New cards

Difference between being in a Control group and not

The major difference comes in who receives the treatment

Those not in the Control group receive the treatment, while those who are in it do not

27
New cards

Placebo

A treatment that lacks the active ingredient of the treatment being tested

This is used to compare the reaction of those who received the real treatment with those who received the placebo

28
New cards

Blinding

Participants are not told which treatment they are receiving

This helps reduce bias

29
New cards

Explanatory Variable

The variable that is changed or manipulated by the researcher to observe its effect on an outcome variable.

- It is also known as the independent variable.
- Example: In a study on the effect of fertilizer amount on plant growth, the amount of fertilizer is the explanatory variable.

30
New cards

Confounding Variable

A variable in a study that is related to other variables, affecting the relationship between those variables

- Example: Ice cream sales are associated with drowning, because summer heat (the confounding variable) raises both

31
New cards

Confounding Variable problems

They suggest false reasons for why something is happening
"Summer heat = more ice cream sales and more drowning" is valid
"Ice cream sales = more drowning" is not valid, because ice cream sales going up doesn’t cause drowning to become more frequent

32
New cards

Lurking Variable

A variable that is not the explanatory or response variable but has a relationship with both
- Example: A scientist studies the effects of diet and exercise on a person’s blood pressure. Lurking variables that can affect blood pressure include whether a person smokes or is stressed

33
New cards

Selection bias

When some groups are left out of the process of choosing the sample

34
New cards

Nonresponse

When an individual for a sample isn’t contacted or refuses to participate

35
New cards

Response bias

A systematic pattern of incorrect responses
An example of a cause is the wording of a question (How Amazing was your stay?)

36
New cards

Response Variable

The variable that is measured or observed and is expected to change in response to the manipulation of the explanatory variable.

- It is also known as the dependent variable.

- In a study on the effect of fertilizer amount on plant growth, the plant growth is the response variable.

37
New cards

Block Design

A technique used in experimental designs to reduce the impact of variability by grouping similar cases together

38
New cards

Variation

What blocking is primarily used to reduce when designing an experiment

39
New cards

Bimodal

Describes a distribution of values that has two distinct peaks (modes)


40
New cards

Mean

The average value of all the data
Calculated by adding all the data then dividing it by the number of data points
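
As a quick check of the calculation, using made-up data:

```python
from statistics import mean

data = [2, 4, 4, 5, 10]  # hypothetical data

# Add all the data, then divide by the number of data points
manual_mean = sum(data) / len(data)  # 25 / 5 = 5.0
print(manual_mean)

# The standard library gives the same value
assert manual_mean == mean(data)
```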

41
New cards

Median

The midpoint of the distribution

Its position in the ordered list is (n+1)/2

42
New cards

How to find the Median

  • If the number of data points is odd
    1. The median is the center value of the ordered list
    2. Its position will be a whole number

    3. (n+1)/2 is the position of it

  • If the number of data points is even

    1. The median is the average of the two center observations of the ordered list

    2. Its position, (n+1)/2, will not be a whole number

    3. Between positions n/2 and (n/2)+1 is where it’s located
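
The two cases can be sketched as a short function. The name `median_by_position` is made up for this example; for an even count, position (n+1)/2 falls halfway between positions n/2 and (n/2)+1, so those two values are averaged.

```python
def median_by_position(values):
    """Find the median using the (n+1)/2 position rule."""
    ordered = sorted(values)  # the list must be ordered first
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: position (n+1)/2 is a whole number (subtract 1 for 0-based indexing)
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the observations at positions n/2 and (n/2)+1
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median_by_position([7, 1, 5]))     # 5
print(median_by_position([7, 1, 5, 3]))  # 4.0
```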

43
New cards

Quartile

From Min to Max
Min is the minimum value

Q1 is the value with 25% of the data at or below it; its position in the ordered data is (1/4) * (n+1)
Q2 is the value with 50% of the data at or below it, also known as the median

Q3 is the value with 75% of the data at or below it; its position in the ordered data is (3/4) * (n+1)
Max is the maximum value
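
A sketch of the position formulas on made-up data. Python's `statistics.quantiles` with `method="exclusive"` places the quartiles at positions based on (n+1), which matches the formulas on this card; other textbooks and software may use slightly different conventions.

```python
from statistics import quantiles

data = [1, 3, 5, 7, 9, 11, 13]  # hypothetical ordered data, n = 7

# Positions: Q1 at (1/4)(7+1) = 2, Q2 at (2/4)(7+1) = 4, Q3 at (3/4)(7+1) = 6
q1, q2, q3 = quantiles(data, n=4, method="exclusive")

print(min(data), q1, q2, q3, max(data))  # 1 3.0 7.0 11.0 13
```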

44
New cards

Interquartile range (IQR)

The spread of the middle half of your data

Calculated by Q3 - Q1

45
New cards

When to use Mean and Standard Deviation for data

Only for reasonably symmetric distributions that have NO OUTLIERS

46
New cards

When to use Median and Interquartile range for data

When describing a SKEWED distribution or one WITH OUTLIERS

47
New cards

Frequency Table

Creates intervals of values of equal width that cover all the data, along with the corresponding frequency in each interval

48
New cards

Class Width

Used in Frequency Tables

Found by calculating (Max - Min) / (number of classes)

Make sure to round (usually up) to a convenient value to make the data cleaner
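
A rough sketch of building a frequency table from the class width, assuming the divisor in the formula is the number of classes. The data and the choice of 4 classes are made up, and the width is rounded up so that the classes cover every value.

```python
import math

data = [12, 15, 21, 22, 27, 30, 34, 38, 41, 49]  # hypothetical data
num_classes = 4  # assumed: chosen by the person building the table

# (Max - Min) / number of classes = (49 - 12) / 4 = 9.25, rounded up to 10
width = math.ceil((max(data) - min(data)) / num_classes)

table = []
for k in range(num_classes):
    low = min(data) + k * width
    high = low + width
    # Count values in the interval [low, high)
    count = sum(low <= x < high for x in data)
    table.append((low, high, count))

for low, high, count in table:
    print(f"{low} to under {high}: {count}")
```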


49
New cards

Histogram

Used to show the distribution of a Quantitative variable by using bars whose height represents the number of individuals

50
New cards

Multimodal

A distribution that has more than 2 peaks (values or intervals that occur with similarly high frequency)


51
New cards

Uniform

A type of probability distribution in which all the outcomes are equally likely


52
New cards

What graphs to use for Quantitative Variables

  1. Frequency Tables/Distribution

  2. Histograms

  3. Boxplots

  4. Stem and Leaf Plots

  5. Dot Plots

53
New cards

What graphs to use for Qualitative Variables

  1. Frequency Tables/Distributions

  2. Bar Graphs

  3. Pie Charts (Don’t actually use)

54
New cards

Frequency

The number of times each category shows up in a data set (for a table)

55
New cards

Relative Frequency

The fraction of times each category shows up in the data set (for a table)

56
New cards

Cumulative Frequency

The total number of observations up to and including that class
Effectively the Frequency with the frequencies of all previous classes added to it
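
The three kinds of frequency can be compared side by side. This sketch uses made-up letter grades and treats cumulative frequency as a running total of the counts.

```python
from collections import Counter
from itertools import accumulate

grades = ["A", "B", "B", "C", "C", "C", "D", "D"]  # hypothetical data

freq = Counter(grades)
classes = sorted(freq)  # ["A", "B", "C", "D"]

counts = [freq[c] for c in classes]            # frequency
relative = [n / len(grades) for n in counts]   # fraction of the data set
cumulative = list(accumulate(counts))          # running total of the counts

for c, n, r, cum in zip(classes, counts, relative, cumulative):
    print(c, n, r, cum)
```

The last cumulative frequency always equals the total number of observations.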

57
New cards

Using Mean and Median to find skewed distributions

In a Skewed distribution,
the Mean is usually Farther out in the tail than the Median

Mean out, Median in
Me go out, Med go in
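
A tiny illustration with made-up right-skewed data, where one large value pulls the mean toward the tail while the median stays put:

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 4, 30]  # hypothetical data with a long right tail

m = mean(right_skewed)      # 45 / 7, roughly 6.43
med = median(right_skewed)  # middle value of the ordered list: 3

print("mean:", m, "median:", med)
```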

58
New cards

Independent vs Dependent in Mosaic Plots

  • Independent

    • When the data lines up well, it’s independent

  • Dependent

    • When the data is varied, then it’s dependent


59
New cards

Upper bound

Q3 + 1.5(IQR)

60
New cards

Lower bound

Q1 - 1.5(IQR)
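
A sketch that applies the upper and lower bounds to made-up data and flags anything outside them as an outlier. The quartiles use the (n+1) position convention via `method="exclusive"`; other quartile conventions can give slightly different bounds.

```python
from statistics import quantiles

data = [1, 3, 5, 7, 9, 11, 13, 40]  # hypothetical data

q1, _, q3 = quantiles(data, n=4, method="exclusive")  # 3.5 and 12.5
iqr = q3 - q1                                         # 9.0

lower_bound = q1 - 1.5 * iqr  # 3.5 - 13.5 = -10.0
upper_bound = q3 + 1.5 * iqr  # 12.5 + 13.5 = 26.0

outliers = [x for x in data if x < lower_bound or x > upper_bound]
print(lower_bound, upper_bound, outliers)  # -10.0 26.0 [40]
```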

61
New cards

Rule of thumb Upper bound

mean + 2(std dev)

62
New cards

Rule of thumb Lower bound

mean - 2(std dev)
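
The rule-of-thumb bounds can be sketched the same way. The data is made up, and the sample standard deviation (dividing by n - 1) is an assumption, since the card does not say which version to use.

```python
from statistics import mean, stdev

data = [10, 12, 11, 13, 12, 11, 12, 30]  # hypothetical data

m = mean(data)   # 13.875
s = stdev(data)  # sample standard deviation (assumed), roughly 6.58

lower_bound = m - 2 * s
upper_bound = m + 2 * s

unusual = [x for x in data if x < lower_bound or x > upper_bound]
print(lower_bound, upper_bound, unusual)
```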