Notes on Simple and Multiple Linear Regression Models
Understanding Simple Linear Regression
- Definition: A statistical method that models the relationship between a single predictor variable and an outcome variable.
- Context: This discussion focuses on assessing the impact of AI content on the revenue and market share of AI companies.
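In symbols, the definition above is commonly written as the equation below. The notation is an assumption, since the notes do not fix one: y_i is the outcome for observation i, x_i is the single predictor, \beta_0 is the intercept, \beta_1 is the slope, and \varepsilon_i is the error term.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n
```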
Selecting Outcome and Predictor Variables
- Outcome Variable: Market share of AI companies.
- This indicates the proportion of the market controlled by the AI company, which is a key measure of success.
- Predictor Variables: Revenue increase may serve as a predictor.
- It was suggested to compare these outcomes:
- Geographically (country-wise)
- Temporally (year-wise)
- Industry-wise
Initial Steps in Regression Analysis
- Initial Model Setup: Start with one predictor variable and one outcome variable.
- Example: Investigate how revenue depends on market share, treating revenue as the outcome and market share as the predictor.
- Intuition for Variables: Market share is typically assumed to influence revenue; a larger market share usually goes with higher revenue.
- Exploratory Data Analysis (EDA):
- Use scatter plots in R to visualize relationships between variables.
- Identify whether the relationships appear linear or non-linear.
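As a numeric companion to the scatter plot, the one-predictor fit described above can be sketched roughly as follows. This is a minimal illustration in plain Python (chosen only to avoid dependencies; in R the usual call would be lm(revenue ~ market_share)). The data values and the function names fit_simple_ols and pearson_r are hypothetical and are not from the discussion.

```python
from math import sqrt

# Made-up illustrative values, NOT real data:
# market share in percent, revenue in arbitrary units.
market_share = [5.0, 10.0, 15.0, 20.0, 25.0]
revenue = [12.0, 19.0, 31.0, 38.0, 50.0]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def pearson_r(x, y):
    """Pearson correlation: a rough measure of how linear the relationship is."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


intercept, slope = fit_simple_ols(market_share, revenue)
r = pearson_r(market_share, revenue)
print(f"revenue ~ {intercept:.2f} + {slope:.2f} * market_share (r = {r:.3f})")
```

A correlation near 1 or -1 is consistent with a linear relationship but does not prove one, so the scatter plot remains the primary check.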
Handling Non-linear Relationships
- If Non-linear: Carry out further analysis.
- Check whether market share or other predictors (like AI adoption rate) are linearly related to revenue.
- Identify which variable has the most nearly linear relationship with revenue.
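One possible way to compare candidate predictors is to rank them by the squared correlation with revenue, sketched below. The numbers and the names candidates, r_squared, and best are hypothetical, and squared correlation is only one assumed criterion; the notes do not prescribe a specific measure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up illustrative values, NOT real data.
revenue = [12.0, 19.0, 31.0, 38.0, 50.0]
candidates = {
    "market_share": [5.0, 10.0, 15.0, 20.0, 25.0],
    "ai_adoption_rate": [10.0, 40.0, 20.0, 50.0, 30.0],
}

# Squared correlation of each candidate predictor with revenue.
r_squared = {
    name: pearson_r(values, revenue) ** 2 for name, values in candidates.items()
}

# The candidate with the strongest linear association.
best = max(r_squared, key=r_squared.get)

for name, value in sorted(r_squared.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: r^2 = {value:.3f}")
print("strongest linear association:", best)
```

A curved but monotone relationship can still produce a high value here, so the ranking should be read alongside the scatter plots rather than instead of them.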
Moving Towards Multiple Linear Regression
- Definition: A regression model using multiple predictors to explain the outcome.
- Setup: Requires more than one predictor variable (limited to at most four in an introductory course).
- Examples of Predictors to Include:
- AI adoption rate
- AI-generated content volume
- Human-AI collaboration rate
- Market share
- Consumer trust in AI (provided this variable is defined and understood)
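A model with several of the predictors above can be fitted by least squares. The sketch below solves the normal equations in plain Python for two predictors. It is an illustration under stated assumptions: the function names solve_linear_system and fit_multiple_ols are hypothetical, and the revenue values are constructed from a known formula (2 + 3 * market_share + 0.5 * ai_adoption_rate) purely so the recovered coefficients can be checked. They are not real data.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix; copying leaves the inputs untouched.
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from the rows below.
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_multiple_ols(predictors, y):
    """Return [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk.

    predictors is a list of columns, one list of values per predictor.
    """
    n = len(y)
    columns = [[1.0] * n] + [list(col) for col in predictors]  # intercept column first
    k = len(columns)
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [
        [sum(ci * cj for ci, cj in zip(columns[i], columns[j])) for j in range(k)]
        for i in range(k)
    ]
    xty = [sum(ci * yi for ci, yi in zip(columns[i], y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Constructed illustrative values, NOT real data.
market_share = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
ai_adoption_rate = [40.0, 10.0, 30.0, 60.0, 20.0, 50.0]
revenue = [2.0 + 3.0 * m + 0.5 * a for m, a in zip(market_share, ai_adoption_rate)]

coefficients = fit_multiple_ols([market_share, ai_adoption_rate], revenue)
print([round(c, 4) for c in coefficients])
```

Solving the normal equations directly keeps the example short; statistical software typically uses more numerically stable decompositions.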
Collaboration in Predictors
- The order in which predictors enter the model does not affect the fitted coefficients or predictions; include all of them to assess their joint influence on revenue.
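The point about predictor order can be checked with a short derivation. The matrix notation is an assumption, not from the notes: X is the design matrix with one column per predictor (plus the intercept column), X^\top X is assumed invertible, and P is a permutation matrix that reorders the columns, so P^{-1} = P^\top.

```latex
\begin{aligned}
y &= X\beta + \varepsilon, \qquad \hat{\beta} = (X^\top X)^{-1} X^\top y \\
X' &= XP \;\Rightarrow\; \hat{\beta}' = (P^\top X^\top X P)^{-1} P^\top X^\top y = P^\top \hat{\beta} \\
X'\hat{\beta}' &= X P P^\top \hat{\beta} = X\hat{\beta}
\end{aligned}
```

Reordering the predictors only reorders the estimated coefficients in the same way, and the fitted values stay the same.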
Conclusion and Next Steps
- Start working on simple linear regression based on EDA.
- Transition into multiple linear regression by gradually incorporating the defined predictors.
- Variables such as consumer trust may need further clarification as understanding develops.