Statistical Analysis and Interpretation Notes

P-Value
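- A p-value is the probability of seeing a result at least as extreme as the one observed if the null hypothesis were true; a small p-value is evidence against the null hypothesis.
- If p-value < 0.05: statistically significant (the estimate is distinguishable from zero); we reject the null hypothesis.
- If p-value ≥ 0.05: not statistically significant (the true value could plausibly be zero); we fail to reject the null hypothesis.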
- Example:
  - An estimate of 0.44 with a p-value of 0.38 is not distinguishable from zero: the plausible range of values for the estimate includes zero.
  - By contrast, an estimate of 2.8 whose p-value is below 0.05 can be treated as statistically nonzero.
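
The decision rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the standard error of 0.5 is hypothetical (the notes give no standard errors), and the p-value uses a normal approximation rather than the t distribution a regression table would normally report.

```python
import math

def two_sided_p_value(estimate, std_error):
    """Two-sided p-value for H0: true value = 0 (normal approximation)."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), where Phi is the
    # standard normal cumulative distribution function.
    return math.erfc(abs(z) / math.sqrt(2))

def decision(p_value, alpha=0.05):
    """Apply the conventional threshold from the notes."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Hypothetical standard error (an assumption, not a value from the notes).
ASSUMED_STD_ERROR = 0.5

for estimate in (0.44, 2.8):
    p = two_sided_p_value(estimate, ASSUMED_STD_ERROR)
    print(f"estimate={estimate}, p={p:.3f}: {decision(p)}")
```

With this assumed standard error, the estimate of 0.44 yields a p-value close to 0.38, while 2.8 yields a p-value far below 0.05.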

Linear Regression Results
- State clearly which variables are being compared (e.g., trampoline spring stiffness and bounce height).
- Specify the only difference between the groups being compared (e.g., spring stiffness, with all other trampoline properties the same).
- Statement format:
  - "For every one-unit increase in the explanatory variable (e.g., spring stiffness), the response (e.g., bounce height) increases by X on average."
- Context matters: state the assumptions behind the comparison (e.g., that the trampolines differ in stiffness alone).
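
As a rough sketch of where the X in the statement format comes from, the snippet below fits a least-squares line and prints the interpretation sentence. The data values and units are invented for illustration, and `fit_line` is a hypothetical helper name, not something from the notes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: spring stiffness (arbitrary units) and bounce height (cm).
stiffness = [1.0, 2.0, 3.0, 4.0, 5.0]
bounce_height = [31.0, 34.0, 40.0, 43.0, 47.0]

slope, intercept = fit_line(stiffness, bounce_height)
print(
    "For every one-unit increase in spring stiffness, bounce height "
    f"changes by {slope:.2f} cm on average, with everything else held the same."
)
```

The slope is the X in the sentence; the intercept is the predicted response when the explanatory variable is zero, which may itself be an extrapolation.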

R-Squared (R²)
- R² represents the proportion of variance explained by the model.
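- In simple linear regression (one explanatory variable), it is found by squaring the correlation coefficient (r): $R^2 = r^2$
- If R² = 0.5, the model explains 50% of the variance, which is considered acceptable (what counts as acceptable depends on the field).
- Low R² values (e.g., 0.2, or 20%) mean the model explains little of the variation and suggest a poor fit.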
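
To make the link between r and R² concrete, here is a minimal sketch that computes both for a simple linear regression and checks that they agree. The data are made up, and `r_and_r_squared` is a hypothetical helper written for this illustration.

```python
import math

def r_and_r_squared(xs, ys):
    """Return (r, R^2) for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Correlation coefficient.
    r = sxy / math.sqrt(sxx * syy)

    # R^2 from its definition: 1 - (unexplained variation / total variation).
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / syy
    return r, r_squared

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [31.0, 34.0, 40.0, 43.0, 47.0]

r, r_squared = r_and_r_squared(xs, ys)
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}, R^2 = {r_squared:.4f}")
```

The identity holds only for a line fitted by least squares with a single explanatory variable; with several predictors, R² is still defined as the explained share of variance but is no longer the square of one pairwise correlation.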

Extrapolation vs. Interpolation
- Interpolation predicts within the range of the observed data; extrapolation predicts beyond it and can give unreliable results.
- Example: a model that predicts a large jump in height from a minor change in stiffness is probably unrealistic.
- Caution: extreme predictions can signal that the model is being used outside the range of the data it was fitted to, so treat them with skepticism.
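
One simple safeguard is to check whether a new input lies inside the observed range before trusting a prediction. The sketch below assumes a single explanatory variable; the stiffness values and the name `is_extrapolation` are hypothetical.

```python
def is_extrapolation(x_new, observed_xs):
    """True when x_new lies outside the range of the observed data."""
    return x_new < min(observed_xs) or x_new > max(observed_xs)

# Made-up stiffness values the model was fitted on.
observed_stiffness = [1.0, 2.0, 3.0, 4.0, 5.0]

for x_new in (3.5, 12.0):
    kind = "extrapolation" if is_extrapolation(x_new, observed_stiffness) else "interpolation"
    print(f"stiffness={x_new}: {kind}")
```

A range check like this only catches the most obvious cases; a value inside the range can still fall in a region with very few observations.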

Annotating Graphs
- Essential for clear interpretation: label the axes, identify who the study participants are, and state the context of the graph.

Understanding Variables
- Distinguish demographic variables from position (job-level) variables:
  - Look at the relationship the graph shows and identify the independent and dependent variables accurately.
  - Recognize the categories within each variable (e.g., white women, women of color) when analyzing a distribution.

Types of Distributions
- Conditional Distribution:
  - The distribution of one variable within a fixed category of another (e.g., the share of entry-level positions among women of color).
  - A chart of this kind is a bar chart, not a histogram: the gaps between bars reflect a categorical variable, whereas a histogram shows a quantitative variable in adjacent bins with no gaps.
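
A conditional distribution can be computed by fixing one category and dividing each count by that category's total. The sketch below conditions on job level; all head counts are invented for illustration, and the "men" category is an assumption added so that each level has a complete breakdown.

```python
# Hypothetical head counts (invented for illustration, not from the notes).
counts = {
    "entry-level": {"white women": 300, "women of color": 180, "men": 520},
    "executive": {"white women": 20, "women of color": 5, "men": 75},
}

def conditional_distribution(row):
    """Percentage of each category within one fixed condition (one row)."""
    total = sum(row.values())
    return {category: 100 * n / total for category, n in row.items()}

distributions = {level: conditional_distribution(row) for level, row in counts.items()}

for level, dist in distributions.items():
    rounded = {category: round(pct, 1) for category, pct in dist.items()}
    print(f"{level}: {rounded} (sums to {sum(dist.values()):.0f}%)")
```

Conditioning the other way (fixing a demographic group and looking across job levels) uses the same function applied to the columns of the table instead of the rows.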

Summing Probabilities
- Each conditional distribution must sum to 100%. Interpret the graph with caution, because the number of people differs between levels of the employment hierarchy (entry-level vs. executive roles).
- Do not add percentages taken from different levels: each is a percentage of a different group size, so their sum is not a meaningful total. Compare them in context instead.
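
The following sketch shows why adding percentages from different levels goes wrong. The counts are hypothetical, chosen only to make the group sizes very different; `pooled_percentage` is an illustrative helper name.

```python
# Hypothetical (group count, total employees) per level; not from the notes.
levels = {
    "entry-level": (180, 1000),
    "executive": (5, 100),
}

def pooled_percentage(levels):
    """Overall percentage, computed from counts rather than from percentages."""
    group_total = sum(group for group, _ in levels.values())
    overall_total = sum(total for _, total in levels.values())
    return 100 * group_total / overall_total

# Naive (incorrect) approach: add the per-level percentages directly.
naive_sum = sum(100 * group / total for group, total in levels.values())
pooled = pooled_percentage(levels)

print(f"naive sum of percentages: {naive_sum:.1f}%")
print(f"pooled percentage from counts: {pooled:.1f}%")
```

The naive sum treats 18% of 1000 people and 5% of 100 people as if they shared a denominator; going back to the counts weights each level by its actual size.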

Annotate graphs with their context.
Analyze thoroughly how the variables differ.
Clearly state the assumptions and limits of your model and its predictions.
State findings in a precise, consistent format so that they stay clear.