Simple Linear Regression

  • Simple linear regression investigates the relationship between two variables.

  • The calculations are error-prone; show all working to earn marks.

Investigating Relationships

  • Use simple linear regression to investigate relationships between two variables.

  • Examples:

  • Economic growth and firm performance.

    • Does economic growth impact firm performance, or vice versa?

  • Number of stocks invested in and level of risk.

    • Does investing in more stocks increase or decrease risk?

  • Inflation rate and interest rates.

    • Is the relationship positive or negative?

Linear Relationships

  • Simple linear regression focuses on linear relationships.

  • Non-linear relationships will not be detected by this method.
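
To illustrate the point about non-linear relationships, here is a minimal Python sketch. The data are made up for illustration, and the helper name pearson_r is an assumption of this example; it uses the usual sum-based formula for the correlation coefficient r.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r computed from sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Illustrative x values, symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [15 + 20 * x for x in xs]  # exact straight-line relationship
curved = [x ** 2 for x in xs]       # exact, but non-linear, relationship

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(f"r for the straight line: {r_linear:.3f}")
print(f"r for the curve:         {r_curved:.3f}")
```

In this sketch y is completely determined by x in both cases, yet only the straight line produces a correlation near 1; the symmetric curve produces a correlation near 0, so a linear method would report no relationship.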

Key Questions

  1. Is there a relationship between the variables?

    • Use correlation analysis to see if variables move together or in opposite directions.

    • Correlation measures linear relationships.

  2. If there is a relationship, what is it?

    • Is it positive or negative?

    • How does it change over time?

Example: TV Repair Technician

  • Technician charges $15 for a house visit plus $20 per hour.

  • Variables:

    • x = number of hours worked (independent variable)

    • y = charge (dependent variable)

  • If the technician spends zero hours, the charge is $15.

  • If the technician spends one hour, the charge is $15 + $20 = $35.

  • Plotting x vs. y shows a straight line, confirming a linear relationship.

Independent vs. Dependent Variables

  • Independent variable (x):

    • Also known as explanatory variable or predictor variable.

    • Used to explain the impact on y.

  • Dependent variable (y):

    • Also known as response variable.

    • Responds to changes in x.

Equation of the Line

  • In this example: y = 15 + 20x

  • 15 is the y-intercept (value of y when x is zero).

  • 20 is the slope (charge per hour).

  • Most of the time, we don't have this much information; we just have the data.

  • Then we need to estimate the y-intercept and slope from the data.

Slope and Y-Intercept

  • Straight-line relationships have a slope and a y-intercept.

  • Slope (gradient) indicates how much the cost increases for each additional hour.

    • In this case, it's $20 per hour.

  • A positive slope indicates a positive linear relationship.

  • Y-intercept is the value of y when x is zero.

    • In this case, it's $15.

General Equation

  • General equation of a line: y = b_0 + b_1 x

    • b_0 is the y-intercept.

    • b_1 is the slope coefficient.

Scattergram (Scatter Plot)

  • Plot of the data points (x and y values).

  • A scattergram is just the dots; do not draw a line of best fit!

  • Gives a rough idea of the relationship between x and y.

  • Helps determine if the relationship is positive, negative, or non-linear.

Key Points for Scattergrams

  • Label the axes (x and y).

  • Include the dots representing the data points.

  • If the dots move upward, there's a positive relationship.

  • If the dots move downward, there's a negative relationship.

  • If the dots are randomly scattered, there's no linear relationship.

Line of Best Fit

  • Draw a line that passes through the middle of the dots.

  • Minimize the errors between the observed values and the estimated values from the line.

  • These errors are called residual errors (e).

Residual Error

  • Residual error: e = y - \hat{y}

    • y is the observed value.

    • \hat{y} is the estimated value of y (from the line of best fit).

  • Goal: minimize the sum of these errors.

  • Problem: Some errors are positive (above the line), and some are negative (below the line).

  • Positive and negative errors can cancel each other out.

Minimizing Errors

  • To avoid cancellation, square the errors before summing them up.

  • Minimize the sum of the squared errors to find the line of best fit.

Sources of Variation (Errors)

  • Two main errors:

    1. Residual Error (SSE):

      • Error between the observed y and the estimated \hat{y}.

      • Minimize this error.

    2. Error Due to Regression (SSR):

      • Error between the line of best fit and the mean of y.

  • SST (Total Sum of Squares) = SSE + SSR

Visual Representation of Errors

  • SSE = \sum(y - \hat{y})^2: squared distances from the observed data to the line of best fit.

  • SSR = \sum(\hat{y} - \bar{y})^2: squared distances from the line of best fit to the mean of y.

  • SST = total variation.

  • SST = \sum(y - \bar{y})^2

Ordinary Least Squares (OLS) Regression

  • Goal: Find the line of best fit that best represents the linear relationship between x and y.

  • Choose the slope and y-intercept to minimize the sum of squared errors (SSE).

  • If the model leaves large errors, its predictions are not accurate.

Regression Line Formula

  • \hat{y} = b_0 + b_1 x

  • Need to find b_0 (y-intercept) and b_1 (slope coefficient).

  • First, calculate b_1, then use b_1 to calculate b_0.

Formulae for finding the Line of Best Fit

b_1 = \frac{\sum{(x - \bar{x})(y - \bar{y})}}{\sum{(x - \bar{x})^2}}

Alternate formula for calculating b_1:

b_1 = \frac{n\sum{xy} - \sum{x}\sum{y}}{n\sum{x^2} - (\sum{x})^2}

b_0 = \bar{y} - b_1\bar{x}

Example: Mary's Analysis

  • Mary wants to find the relationship between years of experience (x) and salary (y) of technicians.

  • Need to find the line of best fit.

  • Interpret the answer.

  • Questions might ask for a scattergram (plot the dots only).

  • Data Given:

    Experience (x)    Salary (y)
    12                29
    16                34
    20                33
    9                 27
    6                 23
    20                34
    3                 19
    4                 20
    7                 23
    8                 24
    4                 22
    13                27

  • To solve this question, build the following working table:

    Experience    Salary    x - \bar{x}    y - \bar{y}    (x - \bar{x})^2    x*y      x^2      y^2
    12            29        Value          Value          Value              Value    Value    Value
    16            34        Value          Value          Value              Value    Value    Value
    And so on..   Value     Value          Value          Value              Value    Value    Value

Steps To Solve

  1. Calculate \bar{x} using: \bar{x} = \frac{\sum{x}}{n}

  2. Calculate \bar{y} using: \bar{y} = \frac{\sum{y}}{n}

  3. Then calculate x - \bar{x}, y - \bar{y}, their product (x - \bar{x})(y - \bar{y}), and (x - \bar{x})^2 for each row.

  4. Calculate b_1 using: b_1 = \frac{\sum{(x - \bar{x})(y - \bar{y})}}{\sum{(x - \bar{x})^2}}

  5. Calculate b_0 using: b_0 = \bar{y} - b_1\bar{x}

Line of Best Fit (Solved)

  • \hat{y} = b_0 + b_1 x

  • \hat{y} = 19.37 + 0.7x

  • Using the above equation, we can find the predicted salaries for technicians with 15 and 30 years of experience.

Accuracy of Predictions

  • Predicting for 15 years is more accurate, as this value is within the sample.

  • A prediction for 30 years of experience is outside the sample. Such predictions are known as out-of-sample and are less likely to be accurate.

  • In-sample predictions are generally more reliable than out-of-sample predictions.

Correlation Coefficient (r)

  • Measures the strength and direction of a linear relationship between two variables.

  • Ranges from -1 to +1 (no units).

    • +1: Perfect positive linear relationship.

    • -1: Perfect negative linear relationship.

    • 0: No linear relationship.

  • Values close to +1 or -1 indicate a strong relationship.

Formula for Finding the Correlation Coefficient

r = \frac{n\sum{xy} - \sum{x}\sum{y}}{\sqrt{[n\sum{x^2} - (\sum{x})^2][n\sum{y^2} - (\sum{y})^2]}}

  • For this formula you will need to augment the initial data table using the following steps and columns:

  1. Implement the table shown in

The coefficient of determination, often denoted as R^2, indicates the proportion of the variance in the dependent variable that is predictable from the independent variable(s). In simpler terms, it explains how well the regression model fits the observed data. A higher R^2 suggests a better fit.
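
The steps for Mary's example can be sketched in Python as follows. This is a minimal illustration that applies the formulae from these notes to the twelve data rows as listed. The variable names and the predict helper are assumptions of this sketch. Computing R^2 as SSR / SST is the standard definition, although the notes do not state it.

```python
import math

# Mary's data as listed in the table: years of experience (x), salary (y).
x = [12, 16, 20, 9, 6, 20, 3, 4, 7, 8, 4, 13]
y = [29, 34, 33, 27, 23, 34, 19, 20, 23, 24, 22, 27]
n = len(x)

# Steps 1-2: the means.
x_bar = sum(x) / n
y_bar = sum(y) / n

# Steps 3-4: slope from the deviation formula.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx

# Step 5: intercept.
b0 = y_bar - b1 * x_bar

# Alternate slope formula using the x*y and x^2 columns; should agree with b1.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)
b1_alt = (n * sum_xy - sum(x) * sum(y)) / (n * sum_x2 - sum(x) ** 2)

def predict(years):
    """Estimated salary (y-hat) for a given number of years of experience."""
    return b0 + b1 * years

# Sources of variation.
y_hat = [predict(xi) for xi in x]
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual error
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # error due to regression
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation

# Correlation coefficient and coefficient of determination.
r = (n * sum_xy - sum(x) * sum(y)) / math.sqrt(
    (n * sum_x2 - sum(x) ** 2) * (n * sum_y2 - sum(y) ** 2)
)
r_squared = ssr / sst

print(f"y_hat = {b0:.2f} + {b1:.2f}x")
print(f"15 years (in-sample):     {predict(15):.2f}")
print(f"30 years (out-of-sample): {predict(30):.2f}")
print(f"SSE = {sse:.2f}, SSR = {ssr:.2f}, SST = {sst:.2f}")
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

For the rows as listed, this sketch gives a slope of about 0.85 and an intercept of about 17.61, with r of about 0.96 and R^2 of about 0.93. That differs from the 19.37 + 0.7x quoted in the notes, so the quoted figures may rest on rounding or on a different data set. The predicted salaries from this sketch are about 30.36 for 15 years and about 43.10 for 30 years, and the latter lies outside the sampled range of 3 to 20 years.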