53 Terms

1

Actual values of the response variable

y

2

Predicted value of the response variable

y-hat; ŷ

3

Residual

e = y - ŷ; represents the difference between actual and predicted values.

4

Least Squares Regression Line (LSRL)

The regression line that minimizes the sum of the squared residuals.

5

R² (Coefficient of Determination)

The fraction of the variation in the response variable that is explained by the regression line; values range from 0 to 1.

6

Homogeneity of variance

Condition in which the residuals have a similar spread at every value of the predictor; a violation shows up as residuals that fan out or narrow in a residual plot.

7

Standard Error

Summarizes the typical size of residuals, giving a rough estimate of the model's accuracy.

8

Law of Large Numbers

The long-run relative frequency of an event in repeated independent trials approaches the event's true probability as the number of trials increases.
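
A small simulation can make this concrete. The sketch below assumes a fair coin (p = 0.5) and a fixed random seed, both chosen only for illustration; `running_proportion` is a hypothetical helper name.

```python
import random

def running_proportion(n_trials, p=0.5, seed=0):
    """Simulate independent trials and return the success proportion after each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = 0
    proportions = []
    for i in range(1, n_trials + 1):
        if rng.random() < p:
            successes += 1
        proportions.append(successes / i)
    return proportions

props = running_proportion(100_000)
# The proportion typically wanders early on and settles near p as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(props[n - 1], 4))
```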

9

Conditional Probability

The probability of an event given the occurrence of another event, represented as P(B | A).

10

Binomial Model

Appropriate for a random variable that counts the number of successes in a fixed number of Bernoulli Trials.

11

Complement Rule

P(Aᶜ) = 1 - P(A); describes the probability of the complement event.
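
The rule is most useful for "at least one" questions. The sketch below assumes independent trials (an assumption added here so that the probability of no successes can be multiplied out); the helper name and the die example are illustrative.

```python
def p_at_least_one(p_single, n_trials):
    """P(at least one success in n independent trials), via the complement rule."""
    p_none = (1 - p_single) ** n_trials  # P(no successes at all)
    return 1 - p_none                    # complement of "no successes"

# Example: at least one six in four rolls of a fair die, which is 1 - (5/6)^4
p = p_at_least_one(1 / 6, 4)
print(round(p, 4))
```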

12

Outcome

The value measured, observed, or reported for a trial in a random phenomenon.

13

Event

A collection of outcomes from a random phenomenon.

14

Random Variable

A variable whose numerical value is determined by the outcome of a random phenomenon, usually denoted by a capital letter such as X.

15

Expected Value

Theoretical long-run average of a random variable, denoted by E(X) or μ.

16

Outlier

A data point with a large residual or high leverage.

17

Influential Point

A data point that, when omitted, causes a significant change in the slope of the regression model.
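
One way to check for influence is to fit the line with and without the suspect point and compare slopes. The sketch below uses made-up data in which the last point has an extreme x-value; `slope` is a hypothetical helper using the standard closed-form least squares slope.

```python
def slope(xs, ys):
    """Least squares slope: sum of cross-deviations over sum of squared x-deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data; the final point (20, 0) sits far from the rest in x.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 5, 4, 5, 0]

slope_with = slope(xs, ys)
slope_without = slope(xs[:-1], ys[:-1])
print(slope_with, slope_without)
```

In this data the slope changes sign when the last point is dropped, which is the kind of change the definition describes.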

18

Formula for Least Squares Regression Line (LSRL)

The LSRL can be expressed as ŷ = b₀ + b₁x, where b₀ is the y-intercept and b₁ is the slope.
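
A minimal sketch of fitting the line by hand, assuming the standard closed-form estimates b₁ = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b₀ = ȳ - b₁x̄ (these formulas are not on the card). The data and the name `fit_lsrl` are made up for illustration.

```python
from statistics import mean

def fit_lsrl(xs, ys):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # y-intercept; the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_lsrl(xs, ys)
predictions = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]  # e = y - y-hat
print(b0, b1)
print(residuals)
```

For a least squares line with an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.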

19

Formula for R²

R² = 1 - (SS_res / SS_tot), where SS_res is the sum of squared residuals and SS_tot is the total sum of squares.
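
The formula translates directly into code. In the sketch below the data are made up, and the predictions come from an assumed fitted line ŷ = 2.2 + 0.6x for that data; `r_squared` is a hypothetical helper name.

```python
def r_squared(ys, y_hats):
    """R^2 = 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                      # total variation
    return 1 - ss_res / ss_tot

# Hypothetical data and an assumed least squares fit for it
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
y_hats = [2.2 + 0.6 * x for x in xs]

r2 = r_squared(ys, y_hats)
print(round(r2, 3))
```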

20

Formula for Expected Value

E(X) = Σ [x * P(x)] for a discrete random variable, where x ranges over the possible values and P(x) is the probability of each value.
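
As a worked example, the sketch below applies the formula to a fair six-sided die. The die is an illustrative choice, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expected_value(dist):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

# Fair die: each face has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}
ev = expected_value(die)
print(ev)  # 7/2
```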

21

Standard Error of the Estimate

SE = √(Σe² / (n - 2)), where e are residuals and n is the number of data points.
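
A short sketch of the calculation, using made-up residuals from a hypothetical five-point fit. The divisor n - 2 reflects the two estimated coefficients (intercept and slope).

```python
import math

def standard_error_of_estimate(residuals):
    """SE = sqrt(sum(e^2) / (n - 2)) for a simple linear regression."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than two data points")
    return math.sqrt(sum(e * e for e in residuals) / (n - 2))

# Hypothetical residuals from a fitted line
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
se = standard_error_of_estimate(residuals)
print(round(se, 4))
```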

22

Formula for Conditional Probability

P(A | B) = P(A ∩ B) / P(B), where P(A ∩ B) is the probability that both A and B occur and P(B) > 0.
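
The sketch below checks the formula on one roll of a fair die, with A = "the roll is even" and B = "the roll is greater than 3". Both events and the helper `prob` are illustrative choices.

```python
from fractions import Fraction

outcomes = range(1, 7)  # equally likely faces of a fair die

def prob(event):
    """Probability of an event, given as a predicate over equally likely outcomes."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def is_even(o):
    return o % 2 == 0

def greater_than_3(o):
    return o > 3

p_b = prob(greater_than_3)                                    # P(B) = 3/6
p_a_and_b = prob(lambda o: is_even(o) and greater_than_3(o))  # P(A and B) = 2/6
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 2/3
```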

23

Variance of a Random Variable

Var(X) = E(X²) - [E(X)]², indicating the spread of the random variable relative to its mean.
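
To see that this shortcut agrees with the definition Var(X) = E[(X - μ)²], the sketch below computes both for a fair die (an illustrative choice) using exact fractions.

```python
from fractions import Fraction

# Fair die as {value: probability}
die = {face: Fraction(1, 6) for face in range(1, 7)}

mu = sum(x * p for x, p in die.items())           # E(X)
e_x2 = sum(x ** 2 * p for x, p in die.items())    # E(X^2)

var_shortcut = e_x2 - mu ** 2                               # E(X^2) - [E(X)]^2
var_definition = sum((x - mu) ** 2 * p for x, p in die.items())  # E[(X - mu)^2]

print(var_shortcut, var_definition)  # 35/12 35/12
```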

24

Standard Deviation

A measure of the dispersion or spread of a set of values, denoted as σ (for population) or s (for sample).

25

Formula for Standard Deviation (Population)

σ = √(Σ(x - μ)² / N), where μ is the population mean and N is the number of data points.

26

Formula for Standard Deviation (Sample)

s = √(Σ(x - x̄)² / (n - 1)), where x̄ is the sample mean and n is the number of data points.
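
The population and sample formulas differ only in the divisor. The sketch below computes both by hand on made-up data and compares them with Python's `statistics.pstdev` and `statistics.stdev`.

```python
import math
import statistics

# Hypothetical data
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
center = sum(data) / n
sum_sq_dev = sum((x - center) ** 2 for x in data)

sigma = math.sqrt(sum_sq_dev / n)    # population formula: divide by N
s = math.sqrt(sum_sq_dev / (n - 1))  # sample formula: divide by n - 1

print(sigma, statistics.pstdev(data))
print(s, statistics.stdev(data))
```

The sample value is always slightly larger than the population value for the same data, because it divides by n - 1 rather than n.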

27

Sampling Distribution

The probability distribution of a statistic (such as the sample mean) across all possible samples of a given size drawn from a specific population.

28

Central Limit Theorem

States that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided the observations are independent and the population has finite variance.
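
A simulation sketch, assuming a strongly right-skewed population (exponential with mean 1 and standard deviation 1, chosen only for illustration) and a fixed seed. It checks that the sample means center on the population mean with spread close to σ/√n.

```python
import random
from statistics import mean, stdev

def sample_means(sample_size, n_samples, seed=0):
    """Draw n_samples samples from an exponential(1) population; return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means

n = 40
means = sample_means(sample_size=n, n_samples=5000)
print("mean of sample means:", round(mean(means), 3))  # should be close to 1
print("sd of sample means:  ", round(stdev(means), 3))  # should be close to 1 / sqrt(40)
```

A histogram of `means` would look roughly bell-shaped even though the population itself is skewed.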

29

Z-Score

The number of standard deviations a data point is from the mean, calculated as Z = (X - μ) / σ.
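
A minimal sketch of the calculation; the exam score, mean, and standard deviation are made-up numbers.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical example: a score of 85 when the mean is 70 and the SD is 10
z = z_score(85, mu=70, sigma=10)
print(z)  # 1.5
```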

30

Confidence Interval

A range of values, derived from the sample statistics, that is likely to contain the value of an unknown population parameter.

31

Formula for Confidence Interval for Mean

CI = x̄ ± Z* · (σ/√n), where Z* is the critical Z-value for the desired confidence level, σ is the population standard deviation, and n is the sample size.

32

Margin of Error

The half-width of a confidence interval: the largest distance expected, at the stated confidence level, between the sample estimate and the true parameter. For a mean, E = Z* · (σ/√n).
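
The confidence interval and margin of error formulas can be sketched together. The example assumes σ is known and uses `statistics.NormalDist().inv_cdf` to look up Z*; the sample mean, σ, and n are made-up values, and `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (low, high, margin) for a Z confidence interval for a mean."""
    # Critical value: leaves (1 - confidence) / 2 in each tail of the standard normal
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Hypothetical inputs
low, high, margin = z_interval(x_bar=50, sigma=10, n=25)
print(round(low, 2), round(high, 2), round(margin, 2))
```

Because the margin is proportional to 1/√n, quadrupling the sample size halves it.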

33

Slope of the Regression Line (b₁)

Indicates the expected change in the response variable for a one-unit change in the predictor variable.

34

Simple Linear Regression

A statistical method that models the relationship between a dependent variable and one independent variable by fitting a linear equation to observed data.

35

Multiple Linear Regression

A statistical technique that uses multiple independent variables to predict the value of a dependent variable.

36

Goodness of Fit

A measure of how well the observed outcomes match the expected outcomes in a model.

37

Overfitting

A modeling error that occurs when a model is too complex, capturing noise instead of the underlying data pattern.

38

Underfitting

A modeling error that occurs when a model is too simple to capture the underlying trend of the data.

39

Regression Coefficient

The estimated change in the dependent variable for a one-unit increase in a predictor variable, holding any other predictors in the model constant.

40

Interaction Term

A variable in a regression model that accounts for the effect of two or more independent variables acting together on the dependent variable.

41

Cross-Validation

A technique for assessing how the results of a statistical analysis will generalize to an independent data set.

42

Model Specification

The process of choosing which variables and what functional form to include so that the model accurately represents the relationship between the variables.

43

Residual Analysis

The examination of residuals to assess the goodness of fit of a model.

44

Bernoulli Trial

A random experiment where there are only two possible outcomes: success or failure.

45

Success Probability (p)

The probability of success on a single Bernoulli trial in a binomial experiment.

46

Binomial Distribution

The probability distribution of the number of successes in a fixed number of independent Bernoulli trials.

47

Formula for Binomial Probability

P(X = k) = (n choose k) * p^k * (1-p)^(n-k), where n is the number of trials and k is the number of successes.
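
The formula maps directly onto `math.comb`. The sketch below uses an illustrative case: exactly 3 heads in 10 flips of a fair coin.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Illustrative case: 3 successes in 10 trials with p = 0.5, i.e. 120 / 1024
prob_3_heads = binom_pmf(3, 10, 0.5)
print(prob_3_heads)
```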

48

Mean of Binomial Distribution

The expected number of successes in a binomial distribution, calculated as μ = n * p.

49

Variance of Binomial Distribution

The spread of a binomial distribution, calculated as Var(X) = n * p * (1 - p).
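
Both shortcuts, μ = n * p and Var(X) = n * p * (1 - p), can be checked against the general expected value and variance definitions by summing over the full distribution. The values n = 10 and p = 0.3 below are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# General definitions applied to the binomial distribution
mean_from_pmf = sum(k * pk for k, pk in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(pmf))

print(mean_from_pmf, n * p)           # both approximately 3
print(var_from_pmf, n * p * (1 - p))  # both approximately 2.1
```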

50

Cumulative Binomial Probability

The probability that the number of successes is less than or equal to a given number k: P(X ≤ k) = Σ P(X = i) for i = 0 to k.
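
A cumulative probability is a running sum of individual binomial probabilities. The sketch below uses an illustrative case: at most 3 heads in 10 flips of a fair coin.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k): sum of the binomial probabilities for 0, 1, ..., k successes."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# (1 + 10 + 45 + 120) / 1024
at_most_3 = binom_cdf(3, 10, 0.5)
print(at_most_3)

# "At least" questions follow from the complement: P(X >= 4) = 1 - P(X <= 3)
at_least_4 = 1 - at_most_3
```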

51

Sampling without Replacement

A sampling method where selected individuals are not returned to the population for subsequent trials.

52

Sampling with Replacement

A sampling method where selected individuals are returned to the population for subsequent trials.

53

Normal Approximation to the Binomial

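When n is large and p is not too close to 0 or 1, the binomial distribution can be approximated by a normal distribution with mean np and standard deviation √(np(1 - p)). A common rule of thumb is np ≥ 10 and n(1 - p) ≥ 10.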
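
To see how close the approximation can be, the sketch below compares an exact cumulative binomial probability with the normal value for n = 100, p = 0.5, and k = 55 (illustrative numbers). The normal distribution uses mean n * p and standard deviation √(n * p * (1 - p)). The 0.5 continuity correction is an addition not mentioned on the card; it is a common refinement when a discrete count is approximated by a continuous curve.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for a binomial random variable."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, k = 100, 0.5, 55

exact = binom_cdf(k, n, p)

# Normal model with the binomial's mean and standard deviation
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = normal.cdf(k + 0.5)  # continuity correction (an added assumption)

print(round(exact, 4), round(approx, 4))
```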