Stats: Basic rules, definitions, principles and skills

Last updated 2:03 PM on 6/26/26
52 Terms

1
Explain the following formula

Basic probability.

  • P(A) = probability of event A

  • n(A) = number of outcomes in event A

  • n(S) = number of outcomes in the sample space (total possible outcomes)
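
The formula itself is not shown in the text of this card. From the symbols defined above, it is presumably the standard rule for equally likely outcomes:

```latex
P(A) = \frac{n(A)}{n(S)}
```

For example, rolling an even number on a fair six-sided die gives n(A) = 3 and n(S) = 6, so P(A) = 1/2.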

2
Explain the following formula

Conditional probability (probability of B given A has happened).

  • P(B|A) = probability of B occurring given A has occurred

  • P(B∩A) = probability of both A and B occurring

  • P(A) = probability of A occurring
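
The formula itself is not shown in the text of this card. Based on the three quantities listed above, it is most likely the usual definition of conditional probability (valid only when P(A) is not zero):

```latex
P(B \mid A) = \frac{P(B \cap A)}{P(A)}, \qquad P(A) > 0
```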

3
Explain the following formula

Probability of A or B (union).

  • P(A∪B) = probability A or B (or both) occurs

  • P(A∩B) = probability both A and B occur (subtracted so you don't double-count the overlap)
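
The formula itself is not shown in the text of this card. Given the note about subtracting the overlap, it is presumably the addition rule:

```latex
P(A \cup B) = P(A) + P(B) - P(A \cap B)
```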

4
Explain the following formula

Permutations (order matters).

  • n = total number of items

  • r = number of items being arranged/chosen

  • n! = n factorial (n×(n-1)×(n-2)×...×1)
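
The formula itself is not shown in the text of this card. With n and r as defined above, the standard permutation count is sketched below (the symbol on the left is an assumed notation; formula sheets also write it as P(n, r) or nPr):

```latex
{}^{n}P_{r} = \frac{n!}{(n-r)!}
```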

5
Explain the following formula

Combinations (order doesn't matter).

  • n = total number of items

  • r = number of items being chosen

  • n! works the same way as in permutations; the extra division by r! removes the ordering
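
The formula itself is not shown in the text of this card; the standard combination count is n! / (r! × (n - r)!). The sketch below is one way to check the "divide by r! to remove order" idea numerically. The function names are hypothetical, chosen only for this illustration.

```python
import math

def permutations_count(n: int, r: int) -> int:
    """Ordered arrangements of r items chosen from n: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)

def combinations_count(n: int, r: int) -> int:
    """Unordered selections of r items from n: n! / (r! * (n - r)!)."""
    # Each unordered selection appears r! times among the ordered arrangements.
    return permutations_count(n, r) // math.factorial(r)

# Podium places for 3 of 5 runners (order matters)
podium = permutations_count(5, 3)
# A team of 3 picked from the same 5 people (order ignored)
team = combinations_count(5, 3)
```

Here the podium count is 3! = 6 times the team count, because every team of three can be ranked in six different orders.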

6
Explain the following formula

Binomial probability.

  • X = the random variable (number of successes)

  • x = a specific number of successes you're solving for

  • n = number of trials

  • p = probability of success on a single trial

  • (1-p) = probability of failure on a single trial
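
The formula itself is not shown in the text of this card. With the symbols above, the standard binomial formula is P(X = x) = C(n, x) × p^x × (1 - p)^(n - x), where C(n, x) is the number of combinations. A minimal sketch, with a hypothetical function name:

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials with success probability p."""
    # C(n, x) ways to place the successes, times the probability of each pattern.
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Probability of exactly 7 heads in 10 fair coin flips
prob_seven_heads = binomial_pmf(7, 10, 0.5)
```

Summing `binomial_pmf(x, n, p)` over every x from 0 to n should give 1, which is a quick sanity check on any hand calculation.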

7
Explain the following formula

Hypergeometric probability.

  • R = random variable (number of "successes" drawn)

  • r = specific number of successes you want

  • N = total population size

  • p = number of "successes" in the entire population (careful: this p is a count, not a proportion like in binomial)

  • n = sample size drawn
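
The formula itself is not shown in the text of this card. Using the card's own symbols (where p is a count), the standard hypergeometric formula is P(R = r) = C(p, r) × C(N - p, n - r) / C(N, n). A minimal sketch, with a hypothetical function name:

```python
import math

def hypergeometric_pmf(r: int, N: int, p: int, n: int) -> float:
    """P(R = r) when drawing n items without replacement from N items,
    of which p are successes (p is a count, as on the card)."""
    ways_successes = math.comb(p, r)          # pick r of the p successes
    ways_failures = math.comb(N - p, n - r)   # pick the rest from the failures
    return ways_successes * ways_failures / math.comb(N, n)

# Exactly 1 ace in a 5-card hand from a standard 52-card deck (4 aces)
prob_one_ace = hypergeometric_pmf(r=1, N=52, p=4, n=5)
```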

8
Explain the following formula

Mean (expected value) of a binomial distribution.

  • E[X] = expected value/mean

  • n = number of trials, p = probability of success
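
The formula itself is not shown in the text of this card. For a binomial random variable, the standard result is:

```latex
E[X] = np
```

For example, 10 fair coin flips give an expected 10 × 0.5 = 5 heads.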

9
Explain the following formula

Variance of a binomial distribution.

  • Var[X] = variance

  • Same n, p, (1-p) as above
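
The formula itself is not shown in the text of this card. For a binomial random variable, the standard result is:

```latex
\operatorname{Var}[X] = np(1-p)
```

For example, 10 fair coin flips give a variance of 10 × 0.5 × 0.5 = 2.5.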

10
Explain the following formula

Standardizing a single value (z-score).

  • X = the individual data value

  • μ = population mean

  • σ = population standard deviation
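
The formula itself is not shown in the text of this card. From the symbols above, it is presumably the usual z-score:

```latex
z = \frac{X - \mu}{\sigma}
```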

11
Explain the following formula

Standardizing a sample mean.

  • x̄ = sample mean

  • μ = population mean

  • σ/√n = standard error of the mean (n = sample size)
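
The formula itself is not shown in the text of this card. From the symbols above, it is presumably the z-score with the standard error in place of σ:

```latex
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
```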

12
Explain the following formula

Comparing two sample means.

  • x̄, ȳ = sample means of group x and group y

  • μx, μy = population means of group x and group y

  • σx², σy² = population variances of each group

  • nx, ny = sample sizes of each group
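
The formula itself is not shown in the text of this card. Given that population variances are listed, it is most likely the two-sample z statistic; this reconstruction is an assumption based on the symbols above:

```latex
z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}
```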

13
Explain the following formula

Confidence interval for a mean.

  • x̄ = sample mean

  • z = z-score for your confidence level

  • σ = population standard deviation, n = sample size
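
The formula itself is not shown in the text of this card. From the symbols above, it is presumably the z-based interval for a mean:

```latex
\bar{x} \pm z \cdot \frac{\sigma}{\sqrt{n}}
```

For a 95% confidence level, z is approximately 1.96.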

14
Explain the following formula

Confidence interval for a proportion.

  • p = sample proportion

  • z = z-score for confidence level

  • n = sample size
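
The formula itself is not shown in the text of this card. The standard z-based interval for a proportion is p ± z × √(p(1 - p)/n). A minimal sketch follows; the function name and the survey numbers are hypothetical, and z = 1.96 is the usual value for 95% confidence.

```python
import math

def proportion_confidence_interval(p_hat: float, n: int, z: float = 1.96):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 60 of 100 respondents say yes, 95% confidence
lower, upper = proportion_confidence_interval(0.6, 100)
```

With these inputs the interval runs from roughly 0.50 to roughly 0.70; a larger n would shrink the margin.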

15
Explain the following formula

Expected value, general formula.

  • x = each possible outcome

  • P(X=x) = probability of that outcome
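
The formula itself is not shown in the text of this card. For a discrete random variable, the standard definition is:

```latex
E[X] = \sum_{x} x \cdot P(X = x)
```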

16
Explain the following formula

Variance, general formula.

  • E[X²] = expected value of X squared (Σx²·P(X=x))

  • (E[X])² = square of the mean (formula 15, squared)
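
The formula itself is not shown in the text of this card; from the two terms above it is presumably Var[X] = E[X²] - (E[X])². The sketch below applies it to a fair die using exact fractions. The function names and the die example are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def expected_value(dist: dict) -> Fraction:
    """E[X] = sum of x * P(X = x) over all outcomes x."""
    return sum(x * p for x, p in dist.items())

def variance(dist: dict) -> Fraction:
    """Var[X] = E[X^2] - (E[X])^2."""
    mean_of_squares = sum(x**2 * p for x, p in dist.items())
    return mean_of_squares - expected_value(dist) ** 2

# Fair six-sided die: each face 1..6 has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die)
die_variance = variance(die)
```

For the die, E[X] = 7/2 and E[X²] = 91/6, so the variance is 91/6 - 49/4 = 35/12.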

17
When are two events mutually exclusive? (Not on formula sheet)

knowt flashcard image
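
The answer to this card is an image whose content is not in the text. Consistent with the Mutually Exclusive definition card in this set, the condition is presumably:

```latex
P(A \cap B) = 0 \quad\Longrightarrow\quad P(A \cup B) = P(A) + P(B)
```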
18
When are two events considered independent? (Not on formula sheet)

knowt flashcard image
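
The answer to this card is an image whose content is not in the text. The standard test for independence, which is presumably what the image showed, is:

```latex
P(A \cap B) = P(A) \cdot P(B) \quad\text{or equivalently}\quad P(B \mid A) = P(B)
```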
19
Confidence interval formula (Not on formula sheet)

knowt flashcard image
20
Definition and explanation of: Variance
Definition: A measure of how spread out the data is from the mean, calculated as the average of the squared differences from the mean. Explanation: Imagine archers shooting at a target. If every arrow lands close to the bullseye, variance is low. If arrows are scattered all over the board, variance is high. Squaring the distances ensures arrows that miss high and arrows that miss low both count as misses, instead of cancelling out.
21
Definition and explanation of: Standard Deviation
Definition: The square root of the variance; it measures the typical distance of data points from the mean, in the original units. Explanation: Variance is in "squared units", which is hard to interpret. Standard deviation un-squares it so you can say "scores are typically about 8 marks away from the mean" instead of quoting some weird squared number.
22
Definition and explanation of: Mean
Definition: The arithmetic average of a data set — sum of all values divided by the number of values. Explanation: If everyone in your class pooled their test marks and then split them equally, the mean is what each person would get back.
23
Definition and explanation of: Median
Definition: The middle value of a data set when arranged in order. Explanation: Line your whole class up from shortest to tallest. The median is whoever's standing exactly in the middle — doesn't care how short the shortest person is or how tall the tallest is, just cares about position.
24
Definition and explanation of: Mode
Definition: The value that appears most frequently in a data set. Explanation: It's the most popular kid at the party — the value everyone "voted for" the most by showing up as that number.
25
Definition and explanation of: Quartiles
Definition: Three values (Q1, Q2, Q3) that divide ordered data into four equal parts. Explanation: Cut your line-up of students (ordered shortest to tallest) into 4 equal groups using 3 cuts. Each cut point is a quartile. Q2 is just the median.
26
Definition and explanation of: Percentiles
Definition: Values that divide ordered data into 100 equal parts. Explanation: Same line-up idea as quartiles, but now you're making 99 cuts instead of 3, giving you much finer detail on where someone stands relative to everyone else.
27
Definition and explanation of: Range
Definition: The difference between the maximum and minimum values in a data set. Explanation: It's the wingspan of your data — how far the tallest and shortest person in the line-up are from each other, top to bottom.
28
Definition and explanation of: IQR
Definition: Q3 - Q1; the range of the middle 50% of the data. Explanation: Forget the extremes at either end of the line-up — IQR only measures the spread of the "average" middle chunk of people, ignoring the outliers at the edges.
29
Definition and explanation of: Semi-IQR
Definition: Half of the IQR, i.e. (Q3-Q1)/2. Explanation: Same idea as IQR, just cut in half so you get an average "radius" of spread around the median instead of the full width.
30
Definition and explanation of: Outliers
Definition: Data points that fall unusually far from the rest of the data, typically below Q1 - 1.5×IQR or above Q3 + 1.5×IQR. Explanation: It's the one kid at the height line-up who's either a toddler or a giant compared to everyone else. They break the pattern and skew your picture if you're not careful.
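The 1.5×IQR rule can be sketched in a few lines. Quartile conventions differ between textbooks and tools, so the "inclusive" method used here is an assumption and may not match a given course; the function name and data are hypothetical.

```python
import statistics

def find_outliers(data):
    """Return values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    # statistics.quantiles with n=4 returns the three cut points Q1, Q2, Q3.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical data set with one extreme value
values = [1, 2, 3, 4, 5, 6, 7, 8, 30]
outliers = find_outliers(values)
```

For this data Q1 = 3 and Q3 = 7, so the fences sit at -3 and 13 and only the value 30 is flagged.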
31
Definition and explanation of: Five Number Summary
Definition: Minimum, Q1, Median, Q3, Maximum — five values that summarize a data set's distribution. Explanation: It's a quick "snapshot" of your whole data set using just 5 landmarks, like giving someone 5 GPS pins instead of the entire road map.
32
Definition and explanation of: Interpolation
Definition: Estimating a value that falls within the range of known data points. Explanation: If you know it was 20°C at noon and 26°C at 2pm, interpolation is guessing the temperature at 1pm — you're filling in a gap inside known territory.
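The temperature example can be worked through with straight-line (linear) interpolation. Assuming the temperature changes at a constant rate between the two readings is an assumption of this sketch, not something the card states, and the function name is hypothetical.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# 20°C at 12:00 and 26°C at 14:00; estimate the temperature at 13:00
temp_at_1pm = linear_interpolate(12, 20, 14, 26, 13)
```

One o'clock is halfway between the readings, so the estimate is halfway between 20°C and 26°C, which is 23°C.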
33
Definition and explanation of: Extrapolation
Definition: Estimating a value that falls outside the range of known data points. Explanation: Same temperature example, but now you're guessing what it'll be at midnight using only that noon-2pm data. You're guessing beyond what you actually know — riskier, less reliable.
34
Definition and explanation of: Mutually Exclusive
Definition: Two events that cannot occur at the same time (P(A∩B) = 0). Explanation: You can't be both sitting down and standing up at the exact same instant. If one happens, the other is automatically impossible.
35
Definition and explanation of: Independent Events
Definition: Two events where the occurrence of one does not affect the probability of the other. Explanation: Flipping a coin and rolling a die. The coin landing on heads tells you nothing about what the die will show — they're not eavesdropping on each other.
36
Definition and explanation of: Permutation
Definition: An arrangement of items where order matters. Explanation: Think of arranging people on a podium — 1st, 2nd, 3rd place. Swapping who's 1st and 2nd makes a completely different outcome, even with the same three people.
37
Definition and explanation of: Combination
Definition: A selection of items where order does not matter. Explanation: Now think of just choosing 3 people to be on a team, no ranking involved. Whether you picked Tom-then-Sarah-then-Jake or Jake-then-Tom-then-Sarah, it's the same team. Order is irrelevant.
38
Definition and explanation of: Random Variable
Definition: A variable whose value is determined by the outcome of a random experiment. Explanation: It's a placeholder that "waits" to be filled in by chance — like X = "number of heads in 3 coin flips." You don't know its value until the experiment actually happens.
39
Definition and explanation of: Point Estimate
Definition: A single value used to estimate an unknown population parameter. Explanation: If you ask "what's the average height of all students at school?" and someone replies with one number, that single number is the point estimate. One dart thrown at the truth.
40
Definition and explanation of: Interval Estimate
Definition: A range of values, with an associated confidence level, used to estimate an unknown population parameter. Explanation: Instead of one dart, you throw a "net" — "I'm 95% confident the true average height is between 1.65m and 1.75m." Wider net, but more honest about uncertainty.
41
Definition and explanation of: Discrete Probability Distribution
Definition: A distribution showing probabilities for a random variable that can only take specific, separate (countable) values. Explanation: A count of cars in a parking lot can be 2 or 3, never 2.5. Discrete distributions only deal in whole, separate buckets, like the rungs on a ladder rather than a smooth ramp.
42
Definition and explanation of: Probability Mass Functions
Definition: A function giving the probability that a discrete random variable equals an exact value. Explanation: It answers "what's the chance X is exactly 4?" — like asking the precise odds of rolling exactly a 4 on a die. Makes sense only for discrete (countable) outcomes.
43
Definition and explanation of: Probability Density Functions
Definition: A function describing the relative likelihood of a continuous random variable taking on a given value, where probability is found via the area under the curve over an interval. Explanation: For continuous variables, the chance that height is exactly 1.7000000m is zero. So instead you ask "what's the chance height is between 1.65m and 1.75m," and that's the area under the curve in that range.
44
Definition and explanation of: Continuous Random Variables
Definition: A random variable that can take any value within a given range (infinitely many possible values). Explanation: Height, time, weight — these can be 1.701m or 1.7011m or anything in between. Unlike the ladder rungs of discrete variables, this is a smooth ramp with no gaps.
45
Definition and explanation of: Binomial Distribution
Definition: The probability distribution of the number of successes in a fixed number of independent trials, each with the same probability of success. Explanation: Flip a coin 10 times and count heads. Same probability every time (50/50), trials don't affect each other, fixed number of attempts (10). That entire setup is "binomial."
46
Definition and explanation of: Binomial Probability
Definition: The specific probability of getting exactly x successes out of n trials in a binomial setup, calculated using the binomial formula. Explanation: It's the actual number you get when you plug into the binomial distribution's formula — e.g., "the probability of getting exactly 7 heads out of 10 flips."
47
Definition and explanation of: Hypergeometric Distribution
Definition: The probability distribution of the number of successes in draws from a finite population, without replacement, where the probability of success changes each draw. Explanation: Drawing cards from a deck without putting them back. Pull an ace, and now there are fewer aces left for the next draw — each draw changes the odds for the next one. That "no replacement, changing odds" feature is the giveaway for hypergeometric.
48
Definition and explanation of: Confidence
Definition: The probability that a given procedure, repeated many times, will produce an interval containing the true population parameter. Explanation: "95% confidence" doesn't mean "95% chance this exact interval is right." It means if you repeated this whole sampling process 100 times and built 100 intervals, about 95 of them would actually capture the true value.
49
Definition and explanation of: Confidence Interval
Definition: A range of values, calculated from sample data, likely to contain the true population parameter at a stated confidence level. Explanation: It's the actual "net" itself — the specific upper and lower bound numbers you calculate, like (1.65m, 1.75m), that you believe traps the true population value.
50
Definition and explanation of: Central Limit Theorem
Definition: As sample size increases, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the original population distribution (provided the population has a finite variance). Explanation: It doesn't matter if your underlying data is lumpy, skewed, or weird-shaped. If you keep taking samples and averaging them, those averages will start forming a clean, symmetric bell curve once your sample size is big enough. It's like blending chunky soup long enough: eventually it becomes smooth.
51
Definition and explanation of: Normal Distribution
Definition: A symmetric, bell-shaped probability distribution where most values cluster around the mean. Explanation: Think of most people's heights — most cluster around average, with fewer and fewer people as you go toward extremely short or extremely tall. That symmetric "hump" shape is the normal distribution.
52
Definition and explanation of: Standard Normal
Definition: A normal distribution with a mean of 0 and a standard deviation of 1, used as a reference for z-scores. Explanation: It's the "universal converter" version of the normal distribution. Instead of dealing with different means and standard deviations for every dataset, you convert everything into this one standardized scale so you can compare apples to oranges using the same z-table.