Statistical Analysis and Interpretation Notes
Key Points on Statistical Analysis and Interpretation
Basic Statistical Concepts
P-Value
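A p-value is the probability of seeing a result at least as extreme as the one observed if the null hypothesis were true; it measures the strength of evidence against the null hypothesis.
If p-value < 0.05: Statistically significant at the conventional 5% level (the estimate is distinguishable from zero); we reject the null hypothesis.
If p-value ≥ 0.05: Not statistically significant (the true value could be zero); we fail to reject the null hypothesis.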
Example:
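A p-value of 0.38 for an estimate of 0.44 indicates that the estimate is uncertain: its plausible range includes zero.
An estimate of 2.8 whose p-value is below 0.05 can be treated as statistically nonzero; the p-value, not the size of the estimate, is what supports that conclusion.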
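The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the method behind the numbers in these notes: the standard errors are hypothetical (0.50 is chosen so that the estimate of 0.44 gives a p-value near 0.38), and the p-value uses a normal approximation, whereas regression software typically uses a t distribution.

```python
import math

def two_sided_p_value(estimate, std_error):
    """Two-sided p-value for H0: true value = 0 (normal approximation)."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

def decision(p_value, alpha=0.05):
    """Apply the decision rule at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Hypothetical standard errors, chosen only for illustration.
p_small = two_sided_p_value(0.44, 0.50)
p_large = two_sided_p_value(2.8, 0.50)

print(round(p_small, 2), "->", decision(p_small))
print(decision(p_large))
```

The same estimate can land on either side of the 0.05 threshold depending on its standard error, which is why the p-value and not the estimate's size drives the conclusion.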
Understanding Regression Outputs
Linear Regression Results
State clearly which variables are being compared (for example, trampoline stiffness and bounce height).
Specify the only difference between the groups being compared (e.g., spring stiffness in trampolines).
Statement format:
"For every one-unit increase in the predictor (e.g., spring stiffness), the response (e.g., bounce height) increases by X on average."
Emphasize that context matters: state the assumptions behind the comparison (e.g., that the trampolines differ in stiffness alone).
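As an illustration of the statement format, the sketch below fits a one-predictor least-squares line and turns the slope into an interpretation sentence. The data and the helper names `fit_line` and `interpret_slope` are hypothetical, made up for this example.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def interpret_slope(slope, predictor, response):
    """Build the interpretation sentence, naming both variables explicitly."""
    direction = "increases" if slope >= 0 else "decreases"
    return (f"For every one-unit increase in {predictor}, {response} "
            f"{direction} by {abs(slope):.2f} on average, all else held equal.")

# Hypothetical measurements: spring stiffness and bounce height.
stiffness = [1.0, 2.0, 3.0, 4.0, 5.0]
bounce_height = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(stiffness, bounce_height)
print(interpret_slope(slope, "spring stiffness", "bounce height"))
```

Naming the predictor and response in the sentence, and ending with "all else held equal", keeps the stated assumption visible to the reader.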
Coefficient of Determination (R-Squared)
R-Squared (R²)
R² represents the proportion of variance explained by the model.
In simple linear regression, R² is found by squaring the correlation coefficient (r).
If R² = 0.5, the model explains 50% of the variance, which is considered acceptable.
Low R² values (e.g., 20%) suggest a poor model.
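The relationship between r and R² can be checked numerically. The sketch below, using hypothetical data, computes R² as the proportion of variance explained (1 minus residual over total sum of squares) and compares it with the squared correlation coefficient; the two agree for a least-squares line with a single predictor.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared(x, y):
    """Proportion of variance in y explained by a least-squares line on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
r2 = r_squared(x, y)
print(f"r^2 = {r ** 2:.4f}, R^2 = {r2:.4f}")
```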
Making Predictions
Extrapolation vs. Interpolation
Predicting beyond existing data (extrapolation) can lead to unreliable results.
Example given: A model that predicts a large jump in bounce height from a minor change in stiffness is probably giving an unrealistic prediction.
Caution: Extreme predicted values can indicate extrapolation outside the range of the observed data, so such predictions should be treated with skepticism.
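One simple safeguard is to check whether a new predictor value lies inside the range of the data used to fit the model. The sketch below shows such a check; the function name `classify_prediction` and the stiffness values are hypothetical.

```python
def classify_prediction(x_new, x_observed):
    """Label a prediction by whether x_new falls inside the observed range."""
    lowest, highest = min(x_observed), max(x_observed)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"

# Hypothetical stiffness values used to fit a model.
observed_stiffness = [1.0, 2.0, 3.0, 4.0, 5.0]

for candidate in (3.5, 12.0):
    label = classify_prediction(candidate, observed_stiffness)
    print(f"stiffness {candidate}: {label}")
```

A range check only flags the obvious cases; a value inside the range can still sit in a sparsely observed region where the model is poorly supported.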
Graph Interpretation
Annotating Graphs
Essential for clear interpretation: label the axes, identify the study participants, and state the context of the graph.
Understanding Variables
Distinguish demographic variables from position (job-level) variables:
Consider the relationships shown in graphs; identify the independent and dependent variables accurately.
Recognize categories within variables (e.g., white women, women of color) when analyzing distribution.
Types of Distributions
Conditional Distribution:
The distribution of one variable within a fixed category of another (e.g., the share of entry-level positions among women of color).
The graph is not a histogram: its bars are separated by gaps because the variable is categorical, whereas a histogram displays a continuous variable with no gaps between bars.
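A conditional distribution can be computed by fixing one category and dividing each count in it by that category's total. The sketch below assumes a small table of counts; all numbers and the "manager" level are hypothetical and are not taken from the graph discussed in these notes.

```python
# Hypothetical counts of people by demographic group and position level.
counts = {
    "white women": {"entry-level": 60, "manager": 30, "executive": 10},
    "women of color": {"entry-level": 40, "manager": 8, "executive": 2},
}

def conditional_distribution(row):
    """Share of each position level within one fixed demographic group."""
    total = sum(row.values())
    return {level: count / total for level, count in row.items()}

dist = conditional_distribution(counts["women of color"])
for level, share in dist.items():
    print(f"{level}: {share:.0%} of women of color")
```

Conditioning on position level instead (the share of each demographic group within entry-level roles, say) would divide by column totals and answer a different question.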
Validity of Assumptions and Statistical Conclusions
Summing Probabilities
The percentages within each conditional distribution must sum to 100%. Interpret the graph with the group sizes in mind: far more people hold entry-level roles than executive roles.
Avoid adding percentages from different levels. Each level has its own denominator, so percentages from separate levels cannot be combined by simple addition and must be read in the context of how many people each level contains.
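The denominator problem can be seen with a small worked example. The counts below are hypothetical, as is the "everyone else" category; the point is that shares from levels of different size cannot be added, and that a combined share has to be recomputed from the underlying counts.

```python
# Hypothetical head counts at two levels of an employment hierarchy.
level_counts = {
    "entry-level": {"women of color": 200, "everyone else": 800},
    "executive": {"women of color": 5, "everyone else": 95},
}

def share_within_level(level, group):
    """Share of a group among all people at one level."""
    row = level_counts[level]
    return row[group] / sum(row.values())

entry_share = share_within_level("entry-level", "women of color")
exec_share = share_within_level("executive", "women of color")

# Adding the two percentages mixes different denominators.
naive_sum = entry_share + exec_share

# The correct combined share pools the counts first.
group_total = sum(row["women of color"] for row in level_counts.values())
people_total = sum(sum(row.values()) for row in level_counts.values())
pooled_share = group_total / people_total

print(f"entry-level: {entry_share:.1%}, executive: {exec_share:.1%}")
print(f"naive sum: {naive_sum:.1%}, pooled share: {pooled_share:.1%}")
```

The pooled share sits close to the entry-level figure because that level contributes most of the people.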
Final Checklist for Reports
Annotate graphs with contextual backgrounds.
State exactly which variable differs between the groups being compared.
Clearly articulate assumptions and limits of your model and predictions.
State findings in a consistent, precise format so that they are clear to the reader.