Module 10-1 (2023)

Introduction to Bivariate Data

Observing the relationship between two numerical variables (e.g., height and weight).
The aim is to understand how one numerical variable responds to changes in another.

Variables Definitions

Response Variable (Dependent Variable)

A variable that changes in response to the independent variable.
Denoted as y.

Independent Variable (Explanatory Variable)

A variable used to explain changes in the dependent variable.
Denoted as x.
Treated as the input whose values may explain or predict differences in the response variable y; this alone does not establish that x causes y.

Examples of Variables

Example 1:

Effect of rainfall on crop yield.
- X = amount of rainfall (independent variable).
- Y = crop yield (dependent variable).

Example 2:

Effect of midterm score on final grade.
- X = midterm score (independent variable).
- Y = final grade (dependent variable).

Data Recording and Visualization

Data for two numerical variables should be recorded as pairs (X, Y).
Use scatter plots to visualize these bivariate observations:
- X-axis: Independent variable (x).
- Y-axis: Dependent variable (y).
- Plot data points based on bivariate observations (e.g., (x1, y1), (x2, y2)).
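
A minimal sketch of recording bivariate data as pairs, using the rainfall and crop yield example from above. The numbers and the names `rainfall`, `crop_yield`, `pairs`, `xs`, and `ys` are made up for illustration and do not come from the lecture.

```python
# Hypothetical rainfall (x) and crop-yield (y) measurements; the values are invented.
rainfall = [10.0, 15.0, 20.0, 25.0, 30.0]
crop_yield = [2.1, 2.9, 3.8, 4.4, 5.2]

# Record the data as (x, y) pairs, one pair per observation.
pairs = list(zip(rainfall, crop_yield))

# For plotting, split the pairs back into x-axis and y-axis coordinates.
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

for x, y in pairs:
    print(f"x = {x:5.1f}, y = {y:4.1f}")
```

If matplotlib happens to be installed, passing `xs` and `ys` to `matplotlib.pyplot.scatter` would draw the scatter plot with x on the horizontal axis and y on the vertical axis.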

Evaluating Relationships in Scatter Plots

Example: Does schooling affect salary?
- X = years of schooling.
- Y = salary.
Scatter plots show relationships and can indicate differences among age groups by using different symbols for data points.

Examining Scatterplots

1. Direction of Relationship

Positive Association: as X increases, Y also increases.
Negative Association: as X increases, Y decreases.
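
One way to check direction numerically, offered as a sketch rather than the lecture's method: sum the products of each point's deviations from the x mean and the y mean. The sum is positive when x and y tend to rise together and negative when one rises as the other falls. The function name `direction` and the sample data are hypothetical.

```python
from statistics import mean

def direction(xs, ys):
    """Label the direction of association from the sign of the summed deviation products."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Positive when above-average x values tend to pair with above-average y values.
    s = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "none"

# Made-up data: y rises with x in the first call and falls with x in the second.
print(direction([1, 2, 3, 4], [2, 4, 5, 9]))
print(direction([1, 2, 3, 4], [9, 5, 4, 2]))
```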

2. Form of Relationship

Linear: points follow a straight line.
Curvilinear: points follow a curved line.
Clustered data: points form loose groups, so a trend is hard to identify.

3. Strength of Relationship

Strong linear relationship: data points closely align with a linear trend.
Moderate linear relationship: data points are somewhat clustered around a trend line.
Weak relationship: data points are scattered with no clear trend.

Outliers

Observations that deviate significantly from the overall pattern.
Can mislead interpretation of the relationship.

Correlation Coefficient

Measures strength and direction of linear relationships between two numerical variables.
Denoted by r (or R).
Ranges from -1 to 1:
- r = 1: Perfect positive linear correlation.
- r = -1: Perfect negative linear correlation.
- r = 0: No linear correlation.
Calculated from the means and standard deviations of x and y, so it is sensitive to outliers.
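
The calculation can be sketched as follows, assuming the usual sample formula: standardize each x and y value using its mean and sample standard deviation, multiply the standardized pairs, sum, and divide by n - 1. The function name `pearson_r` and all data are hypothetical; the second call adds a single invented outlier to show the sensitivity.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sample correlation: sum of products of standardized x and y values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Made-up data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
r_line = pearson_r(xs, ys)

# The same data plus one invented point far from the line.
r_outlier = pearson_r(xs + [6], ys + [0])

print(round(r_line, 4))
print(round(r_outlier, 4))
```

With these numbers, r is essentially 1 for the points on the line and falls close to 0 once the single outlier is included, even though the other five points are unchanged.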

Interpreting Correlation Coefficient

Positive Correlation (r > 0)

Indicates positive association, where increases in X tend to accompany increases in Y.
- Example: Years of schooling (X) and salary (Y) have r = 0.9941, indicating strong positive linear relationship.

Negative Correlation (r < 0)

Indicates negative association, where increases in X tend to accompany decreases in Y.

Correlation Coefficient Characteristics

Symmetrical: Switching X and Y does not change the r value.
Dimensionless: Has no units; it is a pure number that does not change when X or Y is converted to different units.
Assess strength using the absolute value of r:
- Close to 1 indicates strong relation, close to 0 indicates weak relation.
Correlation does not imply causation:
- Correlation can exist due to lurking variables.
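
The symmetry and unit-free properties listed above can be checked numerically. This is a sketch under assumptions: the helper `pearson_r` uses an equivalent sums-of-deviations form of the sample correlation, and the height and weight values are invented.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation as S_xy / sqrt(S_xx * S_yy), using sums of deviations from the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up heights (cm) and weights (kg).
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 88]

r_xy = pearson_r(heights_cm, weights_kg)
r_yx = pearson_r(weights_kg, heights_cm)  # X and Y switched

# Convert to inches and pounds; r should not change.
heights_in = [h / 2.54 for h in heights_cm]
weights_lb = [w * 2.20462 for w in weights_kg]
r_units = pearson_r(heights_in, weights_lb)

print(abs(r_xy - r_yx) < 1e-12)
print(abs(r_xy - r_units) < 1e-9)
```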

Lurking Variables and Causation

Lurking Variables: Hidden influences that affect both x and y.
Example: Association between years of schooling and salary does not imply causation (factors like experience, company size affect salary).
Causal conclusions require controlling for all lurking variables.
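
A small constructed illustration of a lurking variable, not data from the lecture: `x` and `y` are each built from a hidden variable `lurking` plus small invented offsets, and `x` never appears in the formula for `y`. The two still come out strongly correlated. All names and numbers are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation as S_xy / sqrt(S_xx * S_yy), using sums of deviations from the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    s_yy = sum((b - y_bar) ** 2 for b in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hidden variable that drives both x and y (invented values).
lurking = [1, 2, 3, 4, 5, 6, 7, 8]

# Small fixed offsets standing in for unrelated variation.
offset_x = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.3]
offset_y = [-0.1, 0.3, -0.3, 0.2, -0.2, 0.1, -0.3, 0.3]

# x and y each depend on the lurking variable; y does not depend on x.
x = [2 * z + e for z, e in zip(lurking, offset_x)]
y = [5 * z + e for z, e in zip(lurking, offset_y)]

r_xy = pearson_r(x, y)
print(r_xy > 0.95)
```

Here changing `x` directly would do nothing to `y`; the high r reflects only their shared dependence on `lurking`.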

Conclusion

A scatter plot is needed to confirm linearity before using the correlation coefficient.
A strong positive correlation (r = 0.9941) was noted between years of schooling and salary.
It is important not to assume causation based solely on correlation.
