Simple Linear Regressions

This page discusses how researchers and businesses use indicators (measurable factors such as population shifts or the political climate) to predict future outcomes such as sales.

To make these predictions accurately, the text highlights three main concepts: Logical Conditions, Correlation (r), and the Coefficient of Determination (r^2).

1. The Three Rules for a Good Indicator

Before you can trust a piece of data to predict the future, it must meet three tests:

* Logical Connection: There must be a common-sense reason why one thing affects the other.

* Timing: The "indicator" must happen before the result. If you want to predict rain, you look at clouds first.

* Strong Relationship: There needs to be a consistent statistical pattern between the two.

2. Correlation (r): "How closely do they move together?"

The correlation coefficient, denoted as r, tells you the strength and direction of a relationship on a scale from -1.00 to +1.00.

| Value | Meaning | Description |
|---|---|---|
| +1.00 | Perfect Positive | As one goes up, the other goes up exactly in sync. |
| 0.00 | No Linear Relationship | There is no straight-line pattern; one tells you nothing about the other on a linear basis. |
| -1.00 | Perfect Negative | As one goes up, the other goes down exactly in sync. |
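
The three rows of the table can be checked numerically. The sketch below is a minimal illustration, not a method taken from the page: `pearson_r` is a hypothetical helper name, and the small data sets are invented so that each case comes out cleanly.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Invented data, chosen only to reproduce the three table rows.
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises in lockstep with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls in lockstep with x
r_zero = pearson_r(x, [3, 1, 2, 1, 3])   # no straight-line trend

print(r_pos, r_neg, r_zero)
```

Real data almost never lands exactly on -1, 0, or +1; values in between describe weaker or stronger tendencies in the same direction.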

The Formula:

The image shows the standard formula for calculating this relationship:
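
The image is not reproduced in these notes. Assuming it shows the usual computational form of the Pearson coefficient for n paired observations of x and y, the formula would read:

$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}
$$

Here each sum runs over all n pairs. The numerator measures how x and y vary together, and the denominator rescales that by how much each varies on its own, which is what keeps r between -1.00 and +1.00.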

3. The Coefficient of Determination (r^2): "How much can I trust it?"

While r gives the direction and strength of the relationship, r^2 tells you how much of the outcome it accounts for. It is the proportion of the variation in y that is explained by x, usually read as a percentage.

* High Value (0.80 or higher): This is a great predictor. Most of what is happening is explained by your indicator.

* Moderate Value (between 0.25 and 0.80): It’s a decent predictor, but other factors are also at play.

* Low Value (0.25 or less): This is a poor predictor. You shouldn't rely on this indicator to make decisions.
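
The three bands above can be written as a small lookup. This is a sketch only: `classify_r_squared` is a hypothetical name, and the cutoffs are the rule-of-thumb values from the list, with the boundaries 0.80 and 0.25 assigned to the high and low bands as the wording there suggests.

```python
def classify_r_squared(r_squared):
    """Map an r^2 value to the high / moderate / low bands listed above."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r^2 must lie between 0 and 1")
    if r_squared >= 0.80:
        return "high"       # 0.80 or higher
    if r_squared > 0.25:
        return "moderate"   # strictly between 0.25 and 0.80
    return "low"            # 0.25 or less

for value in (0.81, 0.50, 0.25):
    print(value, classify_r_squared(value))
```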

Summary Example

Imagine you are predicting ice cream sales based on the temperature.

* Logic: People buy cold treats when it's hot (makes sense).

* r (Correlation): You find r = +0.90. This means sales and heat move up together very closely.

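* r^2: 0.90 × 0.90 = 0.81. This means 81% of the variation in sales is explained by temperature. The other 19% might be due to things like holidays or local events. "Explained" here describes a statistical association; on its own it does not prove that the heat causes the sales.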
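
To see why r^2 is read as the "explained" share, the sketch below fits a least-squares line and compares r^2 with 1 minus the unexplained share of the variation. The temperature and sales figures are invented for illustration and do not come from the page, so they will not reproduce r = +0.90 exactly.

```python
# Hypothetical data, invented for illustration only.
temps = [20, 22, 25, 27, 30, 32, 35]          # daily high temperature
sales = [110, 135, 150, 190, 200, 240, 250]   # ice creams sold that day

n = len(temps)
mean_t = sum(temps) / n
mean_s = sum(sales) / n

# Deviation sums used by both the correlation and the fitted line.
sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps, sales))
sxx = sum((t - mean_t) ** 2 for t in temps)
syy = sum((s - mean_s) ** 2 for s in sales)   # total variation in sales

r = sxy / (sxx * syy) ** 0.5
r_squared = r ** 2

# Least-squares line: predicted sales = intercept + slope * temperature.
slope = sxy / sxx
intercept = mean_s - slope * mean_t

# Variation left over after the line has made its predictions.
sse = sum((s - (intercept + slope * t)) ** 2 for t, s in zip(temps, sales))
explained_share = 1 - sse / syy

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, explained share = {explained_share:.3f}")
```

For a straight-line fit with one indicator, the two quantities agree up to rounding, which is the sense in which r^2 is the proportion of variation explained.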
