What is Bayes' Theorem and when is it used?
A formula that allows you to update the probability of an event based on new evidence. It's used when you know P(A|B) and need to find P(B|A), essentially flipping conditional probabilities to revise your beliefs.
State the formula for Bayes' Theorem
P(B|A) = P(A|B) × P(B) / P(A)
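As a quick numeric sketch of the formula (the screening rates below are made up for illustration), it can be checked in a few lines of Python:

```python
# Hypothetical screening example: P(A|B) = 0.95 (positive test given
# disease), P(B) = 0.01 (base rate), P(A) = 0.059 (overall positive rate).
p_A_given_B = 0.95
p_B = 0.01
p_A = 0.059
# Bayes' Theorem: flip P(A|B) into P(B|A)
p_B_given_A = p_A_given_B * p_B / p_A
print(round(p_B_given_A, 3))  # about 0.161
```

Even with a 95% accurate test, the low base rate pulls P(B|A) down to about 16%, which is exactly the kind of belief revision the theorem captures.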
What does P(A|B) represent in Bayes' Theorem
The probability of event A occurring given that event B has occurred
When should you use the Law of Total Probability
When you need to find the total probability of an event that can occur through multiple, mutually exclusive pathways. It sums the probabilities of all possible ways the event can happen
State the Law of Total Probability formula
P(A) = Σ P(A|B_i) × P(B_i), summed over all mutually exclusive events B_i
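A minimal Python sketch of the formula, using made-up disease-screening numbers (disease present with probability 0.01; the test is positive with probability 0.95 if present and 0.05 if absent):

```python
# Mutually exclusive, exhaustive events B_1 (present), B_2 (absent)
priors = [0.01, 0.99]        # P(B_i)
likelihoods = [0.95, 0.05]   # P(A|B_i)
# Law of Total Probability: P(A) = sum of P(A|B_i) * P(B_i)
p_A = sum(l * b for l, b in zip(likelihoods, priors))
print(round(p_A, 3))  # 0.059
```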
Explain what a tree diagram is and its purpose
A visual tool that maps out all possible outcomes of events in branches. It helps simplify complex probability problems by clearly displaying conditional probabilities and pathways
List the steps to draw a tree diagram for a probability scenario
Start from a single point representing the initial event.
Draw branches for each possible outcome of this event.
From each branch, add subsequent branches for additional events.
Label each branch with the corresponding probability.
Multiply probabilities along the branches to find joint probabilities.
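The steps above can be sketched in Python with a hypothetical two-stage experiment (the branch probabilities are invented for illustration):

```python
# Stage 1: outcome R with prob 0.4, G with prob 0.6.
# Stage 2 probabilities depend on which first-stage branch was taken.
paths = {
    ("R", "R"): 0.4 * 0.3,  # multiply along the branches
    ("R", "G"): 0.4 * 0.7,
    ("G", "R"): 0.6 * 0.5,
    ("G", "G"): 0.6 * 0.5,
}
# Joint probability of one path through the tree:
print(round(paths[("R", "G")], 2))  # 0.28
# The paths are mutually exclusive and exhaustive, so they sum to 1:
print(round(sum(paths.values()), 10))  # 1.0
```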
How does Bayes' Theorem relate to tree diagrams
Tree diagrams visually represent the conditional probabilities needed, making it easier to identify and compute P(A), P(B), P(A|B), and P(B|A)
What are mutually exclusive and exhaustive events, and why are they important for the Law of Total Probability
Mutually exclusive events cannot occur simultaneously, and exhaustive events cover all possible outcomes. They ensure that all pathways are accounted for when calculating total probabilities
What is the Binomial Distribution and when should you use it
It models the number of successes in a fixed number of independent Bernoulli trials, each with the same probability of success. It is used when there are two possible outcomes (success or failure) in experiments like coin flips or quality control tests.
What are the four conditions required for a Binomial Distribution
1) a fixed number of trials, 2) each trial is independent, 3) there are two possible outcomes (success or failure), and 4) the probability of success remains constant across trials.
What is the general formula for the Binomial Probability of exactly x successes in n trials
The general formula is P(X = x) = C(n, x) × p^x × (1−p)^(n−x), where C(n, x) is the number of combinations of n trials taken x at a time, p is the probability of success, and (1−p) is the probability of failure.
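The formula translates directly into Python, using math.comb for the combinations term:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example: probability of exactly 3 heads in 5 fair coin flips
print(binom_pmf(3, 5, 0.5))  # 0.3125
```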
In RStudio, which function gives you the exact binomial probability of observing exactly x successes
dbinom(x, size = n, prob = p).
Which RStudio function computes the cumulative probability up to x successes in a binomial setting
pbinom(x, size = n, prob = p)
How do you decide between using dbinom() and pbinom() in RStudio
Use dbinom() when you need the probability of exactly x successes. Use pbinom() when you need the probability of x or fewer successes (cumulative probability)
Given a probability scenario, how do you choose the correct form of the binomial formula to find the desired probability
Exactly x successes: Use the standard binomial formula or dbinom(x, n, p).
At most x successes (≤ x): Use cumulative probability with pbinom(x, n, p).
At least x successes (≥ x): Calculate 1 - pbinom(x - 1, n, p)
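The three cases can be sketched in Python, where binom_cdf plays the role of R's pbinom by summing the pmf up to x:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x), like dbinom(x, n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    # cumulative probability P(X <= x), like pbinom(x, n, p)
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.5
exactly_3 = binom_pmf(3, n, p)       # dbinom(3, 10, 0.5)
at_most_3 = binom_cdf(3, n, p)       # pbinom(3, 10, 0.5)
at_least_3 = 1 - binom_cdf(2, n, p)  # 1 - pbinom(2, 10, 0.5)
print(exactly_3, at_most_3, at_least_3)
```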
What is the Empirical Rule (68-95-99.7 Rule) in statistics
68% of data falls within ±1 standard deviation from the mean.
95% of data falls within ±2 standard deviations from the mean.
99.7% of data falls within ±3 standard deviations from the mean
What is a z-score and how do you calculate it
It measures how many standard deviations an individual data point (x) is from the mean (μ). It's calculated as:
z = (x − μ)/σ, where σ is the standard deviation of the dataset.
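For instance (numbers invented for illustration), a score of 130 on a scale with μ = 100 and σ = 15:

```python
x, mu, sigma = 130, 100, 15
z = (x - mu) / sigma
print(z)  # 2.0: the point sits two standard deviations above the mean
```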
How do you interpret a z-score
Positive z-score: The data point is above the mean.
Negative z-score: The data point is below the mean.
Magnitude: Indicates how far and in what direction the data point deviates from the mean
How do you "unstandardize" a z-score to find the original observed value (x)
x=μ+z×σ where μ is the mean and σ is the standard deviation of the dataset.
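Going the other way on the same made-up scale (μ = 100, σ = 15), a z-score of −1.5 corresponds to:

```python
mu, sigma, z = 100, 15, -1.5
x = mu + z * sigma  # unstandardize: original value from the z-score
print(x)  # 77.5
```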
What does the RStudio function pnorm(x, μ, σ) compute
It gives the probability that a randomly selected value is less than or equal to x.
How do you use qnorm(pct, μ, σ) in RStudio
It finds the value x such that a given percentage (pct) of the data falls below x
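Outside RStudio, what these two functions compute can be approximated with the Python standard library (pnorm via the error function, qnorm by bisecting pnorm). This is a sketch of the underlying math, not R's implementation:

```python
from math import erf, sqrt

def pnorm(x, mu=0.0, sigma=1.0):
    # P(X <= x) for X ~ Normal(mu, sigma)
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def qnorm(pct, mu=0.0, sigma=1.0):
    # invert the standard-normal CDF by bisection, then unstandardize
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if pnorm(mid) < pct:
            lo = mid
        else:
            hi = mid
    return mu + sigma * (lo + hi) / 2

print(round(pnorm(1.96), 3))   # 0.975
print(round(qnorm(0.975), 2))  # 1.96
```

Note that pnorm and qnorm are inverses of each other, which mirrors how the R pair is used.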
What are the steps to calculate and interpret a z-score
Calculate the z-score: z = (x − μ)/σ
Interpret:
If z > 0, x is above the mean.
If z < 0, x is below the mean.
The larger the absolute value of z, the further x is from the mean
What is a "sampling distribution," and what does it represent conceptually
A sampling distribution is the probability distribution of a statistic obtained through repeated sampling from a population. It represents how the statistic varies from sample to sample, illustrating the sampling variability.
What are the two conditions that ensure the sampling distribution of x bar will be approximately Normal in shape (Only one must be met)
The population distribution is Normal: If the population from which samples are drawn is normally distributed, then the sampling distribution of x̄ will also be Normal, regardless of sample size.
Large sample size (Central Limit Theorem applies): If the sample size n is large (typically n ≥ 30), the sampling distribution of x̄ will be approximately Normal, even if the population distribution is not Normal
Given a population mean μ and standard deviation σ, what are the shape, mean, and standard error of the sampling distribution of x̄ for random samples of size n?
Shape: Approximately Normal if the population is Normal or n is large (due to the CLT).
Mean of the sampling distribution (μ_x̄): Equal to the population mean μ.
Standard error (σ_x̄ = σ/√n): Equal to the population standard deviation divided by the square root of the sample size
How can you use a sampling distribution to find the probability that a random sample has a mean within a certain range
You can use the sampling distribution to determine the probability by calculating the z-scores for the sample mean within that range and using the standard normal distribution to find the corresponding probabilities.
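For example (population values invented for illustration): a population with μ = 100 and σ = 15, samples of size n = 36, and the question P(98 < x̄ < 103). A Python sketch using the standard-normal CDF:

```python
from math import erf, sqrt

def pnorm(x, mu=0.0, sigma=1.0):
    # Normal CDF via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma, n = 100, 15, 36
se = sigma / sqrt(n)  # standard error of x̄: 15/6 = 2.5
# P(98 < x̄ < 103) = Φ(z_upper) − Φ(z_lower) on the sampling distribution
prob = pnorm(103, mu, se) - pnorm(98, mu, se)
print(round(prob, 4))  # about 0.673
```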
Why does a larger sample size n lead to a sampling distribution that is more closely Normal
Because as n increases, the influence of individual data points diminishes, and the aggregate effect smooths out irregularities, resulting in a sampling distribution that approaches Normality due to the Central Limit Theorem
How do you calculate a confidence interval for an unknown population mean using a t critical value and a sample of data
Confidence Interval = x̄ ± t* × (s/√n), where x̄ is the sample mean, t* is the critical value from the t-distribution, s is the sample standard deviation, and n is the sample size
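A worked sketch in Python with invented sample statistics (n = 25, x̄ = 52.4, s = 6.0) and a t* of about 2.064, the value a t table gives for 95% confidence with df = 24:

```python
from math import sqrt

n, xbar, s = 25, 52.4, 6.0
t_star = 2.064  # 95% confidence, df = n - 1 = 24 (from a t table)
margin = t_star * s / sqrt(n)  # t* times the standard error
lower, upper = xbar - margin, xbar + margin
print(round(lower, 2), round(upper, 2))  # roughly 49.92 54.88
```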
How is the t critical value related to the level of confidence of the interval
The t* critical value corresponds to the desired confidence level and degrees of freedom (df = n − 1). A higher confidence level requires a larger t* value, resulting in a wider confidence interval; this ensures the interval has a higher probability of containing the true population mean. The t critical value thus increases as the confidence level rises, reflecting the trade-off between confidence and precision in estimating the population mean.
Why might there be two different t critical values for a 95% confidence interval
Because the t* critical value depends on the degrees of freedom (df), which is based on the sample size (n). Different sample sizes result in different degrees of freedom:
Smaller sample sizes (lower df): t-distribution is wider; larger t* value.
Larger sample sizes (higher df): t-distribution approaches the standard Normal distribution; smaller t* value.
Therefore, for the same confidence level (e.g., 95%), t* varies with df
How do you interpret a 95% confidence interval
It means that we are 95% confident that the true population mean lies within the interval. If we were to take many random samples and compute a confidence interval from each, approximately 95% of those intervals would contain the true population mean
How does sample size affect the width of the confidence interval
Increasing the sample size decreases the standard error (s/√n), which narrows the confidence interval. Conversely, a smaller sample size increases the standard error, resulting in a wider interval. So:
Larger n: Narrower interval
Smaller n: Wider interval
How does the confidence level affect the width of the confidence interval
Higher confidence level: Wider interval
Lower confidence level: Narrower interval
How do you write the null and alternative hypotheses for a research question
Null Hypothesis (H0): States that there is no effect or no difference. It's a statement of equality (e.g., μ = μ0).
Alternative Hypothesis (Ha): Represents what you're trying to prove, that there is an effect or a difference. It can be one-sided (e.g., μ > μ0 or μ < μ0) or two-sided (e.g., μ ≠ μ0).
What does a test statistic measure conceptually in hypothesis testing
It quantifies the difference between the observed sample statistic and the parameter stated in the null hypothesis, relative to the standard error. Essentially, it measures how many standard errors the sample result is from the null hypothesis value
What is a p-value, and how do you interpret it
The probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true. A small p-value (typically ≤ α) indicates strong evidence against the null hypothesis, leading you to reject H0. A large p-value suggests insufficient evidence to reject H0.
How is the significance level (α) of a hypothesis test related to the t critical value
The significance level α determines the threshold for rejecting H0. The t* critical value corresponds to this α in the t-distribution with appropriate degrees of freedom. It's the cutoff point beyond which we consider results statistically significant.
Given a t critical value, how do you decide whether to reject or fail to reject the null hypothesis
Compare the calculated test statistic t = (x̄ − μ0)/(s/√n) to the t critical value. If the test statistic exceeds the t critical value, you reject the null hypothesis; if not, you fail to reject it.
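A Python sketch of this decision rule, with hypothetical numbers (H0: μ = 50; a sample of n = 25 with x̄ = 52.4 and s = 6.0; two-sided test at α = 0.05, so t* is about 2.064 for df = 24, from a t table):

```python
from math import sqrt

n, xbar, s, mu0 = 25, 52.4, 6.0, 50.0
t_star = 2.064  # two-sided, alpha = 0.05, df = 24 (from a t table)
t = (xbar - mu0) / (s / sqrt(n))  # test statistic
decision = "reject H0" if abs(t) > t_star else "fail to reject H0"
print(round(t, 2), decision)  # 2.0 fail to reject H0
```

Here t = 2.0 falls short of t* = 2.064, so despite a sample mean above 50 there is not enough evidence to reject H0 at this significance level.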
Why is "failing to reject" the null hypothesis not the same as "accepting" it
Because failing to reject H0 simply means there's not enough evidence against it; it doesn't prove H0 is true. We can never confirm the null hypothesis; we can only gather evidence to reject or not reject it
What are α, β, and power in the context of hypothesis testing
α (Type I Error): The probability of incorrectly rejecting a true null hypothesis (a false positive).
β (Type II Error): The probability of failing to reject a false null hypothesis (a false negative).
Power (1 − β): The probability of correctly rejecting a false null hypothesis, detecting an effect when there is one
How are α, β, and power related
Increasing α: Decreases β and increases power; you're more likely to detect an effect but also more likely to make a Type I error.
Decreasing α: Increases β and decreases power; you're less likely to make a Type I error but more likely to miss detecting an effect.
There's a trade-off between α and β; adjusting one affects the other, impacting the test's power.
How does sample size affect α, β, and power
Larger Sample Size:
β: Decreases (less likely to miss detecting an effect).
Power: Increases (more sensitive test).
α: Remains unchanged (set by the researcher).
A bigger sample provides more information, reducing variability and making it easier to detect true effects.
How does effect size influence α, β, and power
Larger Effect Size:
β: Decreases (easier to detect a true effect).
Power: Increases (higher chance of detecting the effect).
α: Unaffected directly, but a larger effect size makes it easier to achieve significance at a given α.
What is the purpose of a one-sample t-test
The one-sample t-test assesses whether the mean of a single sample significantly differs from a known or hypothesized population mean. It helps determine if the observed sample mean is statistically different from the population mean due to chance or reflects a true effect.
Given a research question and sample summary statistics, how do you conduct a one-sample t-test
State the Hypotheses: Formulate the null and alternative hypotheses.
Calculate the Test Statistic: Use the sample mean, population mean, sample standard deviation, and sample size.
Determine the Critical Value: Based on the significance level (α) and degrees of freedom (df = n − 1).
Compare Test Statistic to Critical Value: Decide whether to reject or fail to reject the null hypothesis.
Draw a Conclusion: Interpret the results in the context of the research question.
How do you state the null (H0) and alternative (Ha) hypotheses for a one-sample t-test
Null Hypothesis (H0): μ = μ0. The population mean equals the hypothesized mean.
Alternative Hypothesis (Ha): Could be one of the following based on the research question:
Two-tailed: μ ≠ μ0. The population mean is not equal to the hypothesized mean.
Left-tailed: μ < μ0. The population mean is less than the hypothesized mean.
Right-tailed: μ > μ0. The population mean is greater than the hypothesized mean
μ: Actual population mean
μ0: Hypothesized population mean
What is the formula for the test statistic in a one-sample t-test, and what does each symbol represent
The formula for the test statistic in a one-sample t-test is t = (x̄ − μ0) / (s / √n), where t is the test statistic, x̄ is the sample mean, μ0 is the hypothesized population mean, s is the sample standard deviation, and n is the sample size.
Provide an example of a full conclusion for a one-sample t-test
Based on the sample data, the calculated t-statistic is 2.35 with 24 degrees of freedom. Since the p-value (0.026) is less than the significance level (α = 0.05), we reject the null hypothesis. This suggests that the true population mean significantly differs from the hypothesized mean. Therefore, we conclude that [insert context-specific conclusion]