Notation
Actual values of the response variable: y
Predicted value of the response variable: y-hat; ŷ
Residual
Positive if point is above the line
Negative if point is below the line
e = y - ŷ
What would you get if you added up all the residuals from the scatterplot?
Zero
Example
y = 220 lbs (actual) and e = -20 lbs (residual), so the predicted value is ŷ = y - e = 240 lbs
How do we choose where the regression line goes?
Regression line minimizes squared residuals
Least Squares Regression Line (LSRL) or line of best fit
Line of best fit formula
ŷ = b0 + b1x
Slope Formula
b1 = r(sy/sx)
How well does the regression line fit the data?
R²
Values are between 0 and +1
Represents the fraction of the variation (specifically the variance) in the response variable that is explained by the regression line
R² close to 1 indicates the model explains a lot
R² = r² (the square of the correlation coefficient)
Practice
Of 50 units of variance in the response variable, the predicted values explain 40 units
R² = 40/50 = 0.80
r = (0.80)^(1/2) ≈ 0.89
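A minimal Python sketch (hypothetical data) tying the formulas above together; the intercept formula b0 = ȳ - b1·x̄ is standard but not stated in these notes:

```python
import statistics as st

x = [2, 4, 5, 7, 9]           # hypothetical explanatory values
y = [65, 80, 84, 95, 110]     # hypothetical response values

r = st.correlation(x, y)              # Pearson r (Python 3.10+)
b1 = r * st.stdev(y) / st.stdev(x)    # slope: b1 = r(sy/sx)
b0 = st.mean(y) - b1 * st.mean(x)     # intercept: line passes through (x̄, ȳ)

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(f"sum of residuals = {sum(residuals):.10f}")  # essentially zero
print(f"R^2 = r^2 = {r**2:.3f}")
```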
Assumptions for regression
Quantitative variable condition
Straight enough condition
No outliers condition
“Does the Plot Thicken?” condition
Residuals must have similar spread
The most common violation is when the residuals get more spread out
(P. 187)
Can check using a residual plot, plotting the residuals on the y-axis and the explanatory variable on the x-axis
Similar spread = homoscedasticity; spread that changes = heteroscedasticity
Regression models are appropriate only when they capture an underlying relationship
Nothing interesting would be left behind
Residuals incorporate everything that is left behind
This means that the residuals should not be interesting
Plotting the residuals against the explanatory variable should show no relationship
(from p. 181)
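A sketch of such a residual plot, assuming matplotlib and made-up residuals:

```python
import matplotlib.pyplot as plt

x = [2, 4, 5, 7, 9]                      # hypothetical explanatory values
residuals = [1.2, -0.8, 0.3, -1.1, 0.4]  # hypothetical residuals from a fit

plt.scatter(x, residuals)
plt.axhline(0, linestyle="--")  # healthy residuals scatter evenly around 0
plt.xlabel("explanatory variable")
plt.ylabel("residual")
plt.title("Look for no pattern and constant spread")
plt.show()
```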
Standard Error:
Summarizes typical residual size
Rough estimate of how much the model is “off” by
R² revisited
R² tells us the proportion of variation in the response variable that is explained by the explanatory variable
“Signal”
The leftover unexplained variation is summarized by the residuals
“Noise”
Total variance of the response variable = variance of the predicted values (from the regression model) + variance of the residuals
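A quick numerical check of this decomposition on hypothetical data (the identity holds exactly for a least-squares line):

```python
import statistics as st

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical data

r = st.correlation(x, y)
b1 = r * st.stdev(y) / st.stdev(x)
b0 = st.mean(y) - b1 * st.mean(x)
yhat = [b0 + b1 * xi for xi in x]
resid = [yi - yh for yi, yh in zip(y, yhat)]

# Var(y) = Var(yhat) + Var(residuals)
print(st.pvariance(y))
print(st.pvariance(yhat) + st.pvariance(resid))
```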
Regression to the mean: when a sample is extreme, the next sample is likely to be closer to the mean
“I trust Spike more than me”
Joe Walch, 2024
R²: The percentage of the variation in the response variable that is explained by the explanatory variable
Total Variance = Unexplained Variance + Explained Variance
To test whether the conditions for a regression are met, use a residual plot
Should see no patterns on the residual plot
Shifting, rescaling, and standardizing variables will not change the correlation coefficient, but will change the slope and intercept
Outliers, leverage and Influence
Outliers:
Large residuals
High leverage
Leverage:
Data points that are far from the mean
Will pull the line closer to themselves, making the residual deceptively small
Influential Point
If omitting a data point results in a model with a very different slope, then the point is influential
Lurking variables can lead to spurious associations
Regression and causation
Regressions do not show causation
Be careful about lurking variables
Be careful when interpreting slopes
INTRO TO PROBABILITY
Random Phenomena:
Situation where we know which outcomes could happen, but do not know which particular outcome will happen
E.G. Coin Flip, drawing cards
Trial:
A single attempt of a random phenomenon
E.G. A single coin flip
Outcome:
Value that is measured, observed, or reported for a trial
Event:
A collection of outcomes
Denoted with bold capital letters
E.G. flipping 2 coins and recording the outcomes
Getting heads and heads is one event
Sample Space:
Collection of all possible outcomes
Denoted with S={...}
E.G. flipping 2 coins
S = {HH, HT, TH, TT}
What is the sample space for flipping 3 coins?
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
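A small sketch generating this sample space with Python's itertools:

```python
from itertools import product

n = 3
S = ["".join(flips) for flips in product("HT", repeat=n)]
print(S)        # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(len(S))   # 2**n = 8 outcomes
```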
Law of large numbers:
The long-run relative frequency of repeated independent events gets closer and closer to the true relative frequency as the number of trials increases
LLN
Sometimes mistakenly referred to as the “Law of Averages” which doesn’t exist
Gambler’s Fallacy
LLN only works over the long run; it doesn’t say anything about the short run
“The house always wins”
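A simulation sketch of the LLN with a fair coin (seed chosen arbitrarily for reproducibility): the running relative frequency of heads drifts toward 0.5 only over the long run.

```python
import random

random.seed(1)  # reproducible run
heads = 0
for flip in range(1, 100_001):
    heads += random.random() < 0.5  # simulate one fair coin flip
    if flip in (10, 100, 1_000, 10_000, 100_000):
        print(f"{flip:>7} flips: relative frequency = {heads / flip:.4f}")
```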
Probability:
Long-run relative frequency of an event’s occurrence
Represented by a number between 0 and 1
Typically in decimal or fraction form
The probability of event A occurring is denoted P(A)
If P(A)=1, then A will occur
If P(A)=0, then A will never occur
If P(A)=0.5, then A will occur half of the time over the long run
Independence:
Two events are independent if learning that one event occurs does not change the probability of the other event occurring
A fan might say that they are 40% sure that their team will win the game. Is that the same type of probability that we have been discussing?
Subjective probability vs. Theoretical probability
Theoretical:
When a probability is based on a mathematical model
Fair coin toss/dice roll, shuffled deck of cards
Subjective:
Probability that represents someone’s personal degree of belief
“I’m 90% sure we will win the game”
TREE DIAGRAM:
5 probability rules
Probability must be between 0 and 1
Probability Assignment rule
The probabilities of all outcomes in the sample space must add up to 1: P(S) = 1
Complement rule
P(Aᶜ) = 1 - P(A)
Complement:
everything that is not in A is the complement of A
Addition rule
For two disjoint events A and B, the probability that one or the other occurs is the sum of the two probabilities
Addition Rule:
For two disjoint events A and B, the probability that one or the other occurs is the sum of their probabilities: P(A ∪ B) = P(A) + P(B)
Disjoint Events:
Events that have no outcomes in common
General Addition Rule:
More flexible than addition rule
Used when events are not disjoint
Formal equation:
P(A ∪ B) = P(A) + P(B) – P(A ⋂ B)
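Worked example (hypothetical numbers): if P(A) = 0.5, P(B) = 0.4, and P(A ∩ B) = 0.2, then P(A ∪ B) = 0.5 + 0.4 - 0.2 = 0.7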
Conditional Probability:
The probability of an event given the occurrence of another event
Probability applied to a conditional distribution
P(B | A)=P(A∩B)/P(A)
Probability of B, conditioned on A
B “given” A
Independent when
P(B | A) = P(B)
∴ A & B are independent
Venn Diagram
Uses both a rectangle and some circles
General Product Rule
P(A⋂B)=P(A) * P(B | A)
→ Disjoint events are required for the simple addition rule; independent events for the simple multiplication rule (worked example below)
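Worked example (hypothetical numbers): if P(A) = 0.5 and P(A ∩ B) = 0.2, then P(B | A) = 0.2/0.5 = 0.4; if also P(B) = 0.4, then P(B | A) = P(B), so A and B are independent, and the product rule checks out: P(A ∩ B) = P(A) · P(B | A) = 0.5 · 0.4 = 0.2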
Random Variable:
Variable whose value depends on a random event
Denoted by ‘X’
Values are denoted by ‘x’
E.G. coin flips, dice rolls, card draws, etc.
Probability Model:
Function that associates a probability with each value of a discrete random variable
Typically in a table form with at least 3 columns
Expected Value:
Theoretical long run average of a random variable
Center of a probability model for the random variable (like the mean)
Denoted by E(X) or μ
Calculated as the sum of the products of values and probabilities: E(X) = Σ x·P(x)
Analogous to the ”break even” point or house edge
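A minimal sketch of E(X) = Σ x·P(x) for a hypothetical raffle (win $100 with probability 0.01, ticket costs $2; numbers are made up):

```python
# Net winnings x mapped to P(x)
model = {100 - 2: 0.01, -2: 0.99}

ev = sum(x * p for x, p in model.items())  # E(X) = sum of x * P(x)
print(f"E(X) = {ev:.2f}")  # -1.00: lose $1 per ticket over the long run
```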
Random:
An outcome is random if we know the possible outcomes but not which value it actually takes
Random outcomes are free of human influence
Don’t use “random” in place of “unexpected”
Examples
“Random” phone call
“Random” actions
Simulation: Using random numbers to represent the outcomes of uncertain events
Trial:
In a simulation, the sequence of events that we are pretending will take place
For each trial, we get a simulated answer to our question (a simulated outcome)
DISCRETE VS CONTINUOUS
Discrete - takes a finite (countable) number of values
Continuous - can take any value within an interval
Bernoulli Trials:
Collection of trials where:
Each has exactly two outcomes: “success” or “failure”
p: probability of success
q: probability of failure
P(“success”) is constant
All trials are independent
Geometric Probability Model:
Used with random variables that count the number of Bernoulli trials until our first success
X = the number of trials until the first success
p = the probability of success
q = the probability of failure
q=1-p
p and q are complements
P(X = x) = q^(x-1) · p
E(X) = 1/p
Note → on the AP exam, 1-p will be shown instead of q
Var(X) = q/p^2
Standard Deviation: SD(X) = √(q/p^2) = √q/p
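A sketch of the geometric formulas, assuming the classic example of rolling a die until the first six:

```python
p = 1 / 6   # probability of success (rolling a six)
q = 1 - p   # probability of failure

def geom_pmf(x: int) -> float:
    """P(X = x) = q^(x-1) * p: first success on trial x."""
    return q ** (x - 1) * p

print(geom_pmf(1), geom_pmf(3))       # first six on roll 1, on roll 3
print("E(X) =", 1 / p)                # about 6 rolls on average
print("SD(X) =", (q / p**2) ** 0.5)   # sqrt(q/p^2)
```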
10% Condition
Remember that one of the requirements for Bernoulli trials is independence, and trials are not independent when we sample without replacement
However, it is still ok to use this model as long as we randomly sample less than 10% of the population
Binomial Model:
Appropriate for a random variable that counts the number of successes in a fixed number of Bernoulli Trials
Example: getting 2 heads with 4 coin flips
Probability of getting x successes in n trials
Details:
x → number of successes
n → number of trials
p → probability of success (1-q)
q → probability of failure (1-p)
P(X = x) = (n! / (x!(n-x)!)) · p^x · q^(n-x)
Var(X) = npq
SD(X) = √(npq)
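A sketch using math.comb for the binomial coefficient, checking the example above (2 heads in 4 flips); the mean formula E(X) = np is standard but not listed in these notes:

```python
from math import comb, sqrt

n, p = 4, 0.5
q = 1 - p

def binom_pmf(x: int) -> float:
    """P(X = x) = C(n, x) * p^x * q^(n-x)."""
    return comb(n, x) * p**x * q ** (n - x)

print(binom_pmf(2))                # 0.375 = 6/16
print("E(X) =", n * p)             # mean: np (standard fact)
print("SD(X) =", sqrt(n * p * q))  # sqrt(npq)
```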
Systematic Sample
Simple Random Sample: the SRS is the gold standard, but often not the most practical
Systematic Sample -
Still has randomness, but not every sample is equally likely
Stratified Random Sample
Population divided into several subpopulations
SRS within each stratum
Used when the subgroups differ and we want to capture those differences proportionally
Cluster Sample
Population is divided into groups or clusters
Each cluster is similar to other clusters
Done for convenience, practicality and/or cost
Multistage Sampling
Combo of multiple methods (usually Stratified and Cluster)
E.G.
For Kauai, we can stratify by moku, then cluster by neighborhood or city block
Surveys:
How are you asking your questions?
Specific questions
Careful with phrasing
See p. 290-291
Pilot Survey:
Small trial run of a survey to test whether the questions and setup are good and clear
What can go wrong?
Voluntary response sample:
A large group is invited to respond and anyone who chooses to respond is counted
Leads to a Voluntary Response Bias:
Example: Very strongly opinionated people might be more likely to volunteer
Convenience sample:
Sample made up of the individuals who are easiest to reach; convenient, but rarely representative of the population
Bad Sampling Coverage:
If the sampling frame excludes people from the population
Undercoverage:
E.G. minorities being undercounted during the census
Nonresponse Bias: bias introduced when a large fraction of those sampled fail to respond to a survey
Response Bias: Anything in a survey that influences responses (like leading questions or unclear phrasing)
The Success/Failure Condition:
A binomial model is approximately normal if we expect at least 10 successes and 10 failures
np ≥ 10
nq ≥ 10
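A one-line check of the condition, with made-up poll numbers:

```python
n, p = 100, 0.3
q = 1 - p
print(n * p >= 10 and n * q >= 10)  # True → normal approximation reasonable
```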
Discrete vs Continuous models
Normal Distribution is continuous
Binomial model is discrete
Statistical Significance: The results of a study are considered statistically significant if there is a very low probability that they happened by chance
Are the results extreme enough to reject a hypothesis?
Sampling Distribution
Distribution of sample means
Shifting data affects center but not spread
E(X+C)=E(X)+C
E(X±Y)=E(X)±E(Y)
SD(X+C)=SD(X) (same standard deviation)
Var(X+C)=Var(X) (same variance)
Var(X±Y)=Var(X)+Var(Y) (if X and Y are independent)
Rescaling data affects center and spread
E(X*C)=E(X)*C
SD(X*C)=SD(X)*|C|
Var(X*C)=Var(X)*C^2 (variance scales with the square of the constant factor)
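A simulation sketch (hypothetical die rolls, arbitrary seed) confirming the shift and rescale rules:

```python
import random
import statistics as st

random.seed(2)
x = [random.randint(1, 6) for _ in range(100_000)]
c = 3

shifted = [xi + c for xi in x]
scaled = [c * xi for xi in x]

print(st.mean(shifted), st.mean(x) + c)            # E(X + c) = E(X) + c
print(st.stdev(shifted), st.stdev(x))              # SD(X + c) = SD(X)
print(st.variance(scaled), c**2 * st.variance(x))  # Var(cX) = c^2 Var(X)
```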