Statistics 2 Lecture 4


8 Terms

1. Effect sizes in multiple regression

  • Typically, we are also interested in the explanatory power of single predictors in the model.

  • In multiple regression, we cannot use b to judge the strength of the partial association between x and y:

  • b depends on the scale on which x and y were measured.

  • Solution: Inspect the effect size

  • For multiple regression we have various options:

    • Standardized regression coefficient: b*

    • Squared partial correlation: rₚ²

    • Change in explained variation: ΔR²

2. Standardized regression coefficients (b*)

  • We can scale each of the b coefficients in the multiple regression model using the:

    • SD of the respective predictor (x)

    • SD of the outcome variable (y)

    • Concretely: b* = b × (SDx / SDy) (see the sketch after this list)

  • Interpretation: the number of SDs by which y is expected to change when x increases by 1 SD (controlling for all other predictors in the model)

  • Rules of thumb for interpretation

    0 < negligible < .10 ≤ small < .30 ≤ moderate < .50 ≤ large

  • This works the same as in simple regression, but:

    • b* is not simply Pearson’s correlation (r) between x and y

    • that is because we statistically control for the other predictors in the model
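A minimal numpy sketch of this scaling, on synthetic data (all variable names and numbers below are invented for illustration; this is not the lecture's dataset):

```python
import numpy as np

# Synthetic data (invented): y = outcome, x1 and x2 = predictors
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(50, 15, n)
x2 = rng.normal(25, 4, n)
y = 80 - 0.5 * x1 - 0.1 * x2 + rng.normal(0, 5, n)

# Unstandardized multiple regression y = a + b1*x1 + b2*x2 via least squares
X = np.column_stack([np.ones(n), x1, x2])
a, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Standardize each slope: b* = b * SD(x) / SD(y)
b1_star = b1 * x1.std(ddof=1) / y.std(ddof=1)
b2_star = b2 * x2.std(ddof=1) / y.std(ddof=1)
print(b1_star, b2_star)  # SDs of change in y per 1-SD increase in each x
```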

3. (Squared) partial correlation

  • The (squared) partial correlation is defined in terms of correlations rather than the b coefficient itself.

  • Example: in a model with two predictors (x1 and x2), consider the partial correlation between x1 and y, controlling for x2.

  • Squared: rₚ² = (proportion of variation in y uniquely explained by x1) / (proportion of variation in y not explained by x2) (see the sketch after this list)

  • Rules of thumb to interpret rₚ²:

    0 < negligible < .01 ≤ small < .06 ≤ moderate < .14 ≤ large
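One standard way to compute this ratio is via residuals: regress y on x2, regress x1 on x2, and square the correlation between the two residual series. A minimal sketch, on the same kind of invented data as above:

```python
import numpy as np

def sq_partial_corr(y, x1, x2):
    """Squared partial correlation between x1 and y, controlling for x2.

    Equals the squared correlation between the residuals of y ~ x2 and
    the residuals of x1 ~ x2: the share of the variation in y that x2
    leaves unexplained which x1 uniquely explains.
    """
    X2 = np.column_stack([np.ones(len(x2)), x2])
    res_y = y - X2 @ np.linalg.lstsq(X2, y, rcond=None)[0]
    res_x1 = x1 - X2 @ np.linalg.lstsq(X2, x1, rcond=None)[0]
    return np.corrcoef(res_y, res_x1)[0, 1] ** 2

# Example with synthetic data (invented for illustration)
rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(50, 15, n), rng.normal(25, 4, n)
y = 80 - 0.5 * x1 - 0.1 * x2 + rng.normal(0, 5, n)
print(sq_partial_corr(y, x1, x2))
```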

4. Squared partial correlation example

Class size explains 0.2% of the differences in academic performance that were not yet explained by the percentage of students with free meals. This is a negligible effect.

5. R-squared change

  • Effect size ΔR² is defined as the difference in explained variation between two models:

  • A complete model: with all predictors

    • Example: ŷ = a + b1x1 + b2x2

  • A reduced model: with all predictors except the one whose partial effect you want to know

    • Example: ŷ = a + b1x1

  • R-squared change: ΔR² = R²c − R²r (see the sketch after this list)

  • Interpretation: the proportion of variation in y uniquely explained by the omitted predictor (here x2).

  • Rules of thumb for interpretation

    0 < negligible < .02 ≤ small < .13 ≤ moderate < .26 ≤ large
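A minimal sketch of ΔR² as a model comparison, again on invented data:

```python
import numpy as np

def r_squared(X, y):
    """R² of an OLS fit of y on X (X must include the intercept column)."""
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(50, 15, n), rng.normal(25, 4, n)
y = 80 - 0.5 * x1 - 0.1 * x2 + rng.normal(0, 5, n)

ones = np.ones(n)
r2_c = r_squared(np.column_stack([ones, x1, x2]), y)  # complete model
r2_r = r_squared(np.column_stack([ones, x1]), y)      # reduced model
print(r2_c - r2_r)  # ΔR²: variation in y uniquely explained by x2
```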

6. R-squared change example

Class size explains 0% of the differences in academic performance, above and beyond the differences that were already explained by differences in the percentage of students with free meals.

7. Familiar rules applied to model c and model r

  • Rule 1: when we predict y with x1, the prediction equation ŷ = a + b1x1 makes the best prediction.

  • Rule 2: when we predict y with x1 and x2, the prediction equation ŷ = a + b1x1 + b2x2 makes the best prediction.

  • Prediction error: the difference between the observed and the predicted y of a subject.

  • With rule 1: error = y − ŷr, summarized (over subjects) as SSEr = Σ(y − ŷr)² = SSE in the reduced (H0) model

    • In the Venn diagram: SSEr = 1 + 2

  • With rule 2: error = y − ŷc, summarized (over subjects) as SSEc = Σ(y − ŷc)² = SSE in the complete (HA) model

    • In the Venn diagram: SSEc = 1

  • Question we could ask: does the complete model perform significantly better in predicting y than the reduced model? (See the sketch below.)
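A minimal sketch of both error sums, on the same kind of invented data as before:

```python
import numpy as np

def sse(X, y):
    """Sum of squared prediction errors, Σ(y − ŷ)², of an OLS fit."""
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((y - yhat) ** 2)

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(50, 15, n), rng.normal(25, 4, n)
y = 80 - 0.5 * x1 - 0.1 * x2 + rng.normal(0, 5, n)

sse_r = sse(np.column_stack([np.ones(n), x1]), y)      # reduced (H0) model
sse_c = sse(np.column_stack([np.ones(n), x1, x2]), y)  # complete (HA) model
print(sse_r, sse_c)  # SSEc ≤ SSEr: an extra predictor can only reduce error
```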

8. The F-test for model comparison

  • F = (variation uniquely explained by the additional parameters of the complete model / df1) / (variation that remains unexplained / df2)

    Equivalently: F = ((SSEr − SSEc) / df1) / (SSEc / df2)

  • By comparing a complete model and a reduced model that differ by one b coefficient, we test: H0: βi = 0

  • Compares the residual sums of squares (SSE) of:

    • Complete model

    • Reduced model

  • Tests the explanatory power of the extra predictors in the complete model.

  • Models can be extended with more parameters as long as the reduced model is a simplified version of the complete model → the models should be nested.

  • Our example:

  • Complete model: ŷ = a + b1x1 + b2x2 = a + b1*PFM + b2*CS
    → reflects the HA that the partial effect of CS, b2 ≠ 0

  • Reduced model: ŷ = a + b1x1 = a + b1*PFM
    → reflects the H0 that the partial effect of CS, b2 = 0

  • F-test significant? Reject H0; conclude that the additional parameter is significant: here, b2 ≠ 0.

  • F-test not significant? No evidence to reject H0. (See the sketch below.)
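A minimal sketch of this F-test for two nested models (invented data; scipy's F distribution supplies the p-value):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(50, 15, n), rng.normal(25, 4, n)
y = 80 - 0.5 * x1 - 0.1 * x2 + rng.normal(0, 5, n)

def sse(X, y):
    """Sum of squared prediction errors of an OLS fit."""
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((y - yhat) ** 2)

sse_r = sse(np.column_stack([np.ones(n), x1]), y)      # reduced: H0 model
sse_c = sse(np.column_stack([np.ones(n), x1, x2]), y)  # complete: HA model

df1 = 1       # number of extra parameters in the complete model (just b2)
df2 = n - 3   # n minus the number of parameters in the complete model
F = ((sse_r - sse_c) / df1) / (sse_c / df2)
p = stats.f.sf(F, df1, df2)  # right-tail probability under F(df1, df2)
print(F, p)   # small p -> reject H0: beta2 = 0
```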
