1. Correlation
Definition: A statistical measure that expresses the extent to which two variables are linearly related. It indicates both the strength and direction of a relationship between two variables.
2. Important Uses of Correlation
Prediction: Utilized to predict values of one variable based on the values of another.
Testing Validity: Applied to test the validity of theoretical constructs or methods.
Theory Development: Assists in formulating and refining theories based on observed relationships.
3. Scatterplots
Definition: A visual representation showing the relationship between two numerical variables.
Construction:
- Plot paired values as points on the Cartesian plane, where each point represents a pair of values (X, Y).
Interpretation:
- Direction: Can be upward (positive correlation) or downward (negative correlation).
- Form: Identifies whether the relationship is linear (straight line) or nonlinear (curved).
- Strength: Observed by how closely the points cluster around a line (the tighter the cluster, the stronger the correlation).
- Outliers: Points that deviate significantly from the overall pattern, which can influence the correlation.
4. Equation of a Straight Line
Formula: Y = bX + a
- Components:
- Y: Predicted variable (dependent variable).
- X: Predictor variable (independent variable).
- b (slope): Indicates the change in Y for each 1-unit increase in X.
- a (Y-intercept): Value of Y when X equals zero.
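The components above can be sketched in a few lines of Python. The slope and intercept values here are made up for illustration (predicting an exam score from study hours is a hypothetical scenario, not from the notes):

```python
# Hypothetical line: predict exam score (Y) from study hours (X).
# b = 2.5 and a = 50.0 are illustrative values, not real estimates.
def predict(x, b=2.5, a=50.0):
    """Return the predicted Y for a given X on the line Y = bX + a."""
    return b * x + a

y_at_zero = predict(0)   # with X = 0, Y equals the intercept a (50.0)
y_at_four = predict(4)   # each 1-unit increase in X adds b (2.5) to Y
```

Note how the intercept is simply the prediction at X = 0, and the slope is the difference between predictions one unit apart.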
5. Types of Relationships
Positive Relationship: Both X and Y increase together.
Negative Relationship: As X increases, Y decreases (the variables move in opposite directions).
Perfect Relationship: All points fall exactly on a straight line, indicated when r=+1.00 or r=−1.00.
Imperfect Relationship: Points scatter around a line, represented by values of r that fall between -1 and +1.
6. Correlation Coefficient & Pearson’s r
Correlation Coefficient: A numerical measure indicating the strength and direction of the relationship.
- Pearson’s r: Specifically measures linear relationships and requires interval or ratio data.
- Range: Values between -1.00 and +1.00.
- Computational Formula: r = Σ[(X − Mx)(Y − My)] / √(Σ(X − Mx)² · Σ(Y − My)²)
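A minimal sketch of the computational formula, with a small made-up data set (the numbers are illustrative only):

```python
# Pearson's r: sum of cross-products of deviations, divided by the
# square root of the product of the sums of squared deviations.
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx = sum(xs) / n   # mean of X (Mx)
    my = sum(ys) / n   # mean of Y (My)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

# A perfectly linear, increasing data set gives r = +1.00.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

Reversing the direction of the relationship (Y decreasing as X increases) flips the sign of r toward −1.00.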
7. Second Interpretation of r (r²)
Coefficient of Determination: r² = proportion of variance explained.
- Example: If r = 0.60, then r² = 0.36, indicating that 36% of the variance in Y is explained by X.
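The worked example above is just squaring r, which is easy to check:

```python
# r squared: the proportion of variance in Y explained by X.
r = 0.60
r_squared = r ** 2   # 0.36, i.e. 36% of the variance explained
```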
8. Choosing the Correct Correlation
Key Factors:
- Level of Measurement: Types include nominal, ordinal, interval, and ratio levels.
- Shape of Relationship: Determine if the relationship is linear or nonlinear.
Other Types of Correlation:
- Spearman’s rho (ρ): Applicable for ordinal (ranked) data.
- Phi (φ): Suitable for two dichotomous (binary) variables.
- Point-biserial correlation: Involves one dichotomous and one continuous variable.
9. Range, Outliers, and Correlation
Restricted Range: Limits the variability in the data, producing a weaker apparent correlation than exists across the full range of scores.
Outliers: Extreme scores can either inflate or weaken the correlation; it is vital to review scatterplots for outliers.
10. Key Skills for Exams
Interpret r: Understand both the direction and strength of correlation.
Calculate or recognize r and r²: Be comfortable working with correlation coefficients.
Choose the correct correlation type: Differentiate which correlation analysis is appropriate for the data at hand.
11. Inferential vs Descriptive Statistics
Inferential Statistics: Draws conclusions about populations based on sample data; relevant for estimation, prediction, and hypothesis testing.
Descriptive Statistics: Focuses on summarizing and organizing data without generalizing beyond the sample.
12. Two Goals of Inferential Statistics
Estimation: Estimating population parameters from sample statistics.
Hypothesis Testing: Evaluating claims about population parameters using sample data.
13. Random Sampling Methods
With Replacement: The selected individual is returned to the population and can be chosen again.
Without Replacement: The selected individual is removed and cannot be chosen again.
14. A Priori vs A Posteriori Probability
A Priori Probability (Theoretical): Based on logical reasoning or established structures, not on experimental data.
- Formula: P = favorable outcomes / total outcomes
A Posteriori Probability (Empirical): Based on actual observed data from trials or experiments.
- Formula: P = number of observed successes / total trials
Long-Run Behavior: As the number of trials approaches infinity, a posteriori probability converges toward a priori probability.
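The long-run convergence can be sketched with a simulated fair coin. The a priori probability of heads is 1/2 by the favorable/total formula; the empirical proportion drifts toward it as the number of trials grows (the seed and trial counts below are arbitrary choices for reproducibility):

```python
# A priori vs a posteriori probability for a fair coin.
import random

random.seed(42)  # fixed seed so the run is reproducible

def empirical_p_heads(n_trials):
    """A posteriori estimate: observed heads / total trials."""
    heads = sum(random.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

a_priori = 1 / 2                     # theoretical probability of heads
few = empirical_p_heads(100)         # noisy estimate from few trials
many = empirical_p_heads(100_000)    # much closer to 0.5
```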
15. Addition Rule of Probability
General Formula: For finding the probability that event A or event B occurs: P(A or B) = P(A) + P(B) − P(A and B)
Terms:
- P(A), P(B) = probability of respective events.
- P(A and B) = probability of both events occurring simultaneously.
Mutually Exclusive Events: If events A and B cannot occur simultaneously, then P(A and B) = 0, simplifying the formula to P(A or B) = P(A) + P(B).
Exhaustive Events: Set of events that covers all possible outcomes; one of them must occur.
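The addition rule can be verified by direct counting with a single fair die (the events here are illustrative choices): let A = "roll is even" and B = "roll is at most 3".

```python
# Addition rule check: P(A or B) = P(A) + P(B) - P(A and B).
outcomes = {1, 2, 3, 4, 5, 6}   # sample space for one fair die
A = {2, 4, 6}                   # roll is even
B = {1, 2, 3}                   # roll is at most 3

def p(event):
    return len(event) / len(outcomes)

p_a_or_b = p(A) + p(B) - p(A & B)   # 3/6 + 3/6 - 1/6 = 5/6
p_union = p(A | B)                  # counting the union {1,2,3,4,6} directly
```

Both computations give 5/6, confirming the subtraction of the overlap avoids double-counting the outcome 2.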
16. Multiplication Rule of Probability
Definition: Used to find the probability that both events A and B occur.
General Formula: P(A and B) = P(A) × P(B | A)
Key Terms:
- P(B∣A): Represents the conditional probability of B given A has occurred.
Types of Events:
- Independent Events: Events where the occurrence of one does not affect the other.
- Dependent Events: Events where the occurrence of one affects the probability of the other.
- Mutually Exclusive Events: Cannot occur together.
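The multiplication rule and the independent/dependent distinction can be sketched with two standard textbook examples (drawing aces and flipping coins are illustrative scenarios, not from the notes):

```python
# Dependent events: drawing two aces from a 52-card deck without replacement.
# P(B | A) differs from P(B) because the first draw changes the deck.
from fractions import Fraction

p_first_ace = Fraction(4, 52)               # P(A)
p_second_given_first = Fraction(3, 51)      # P(B | A): one ace already gone
p_both_aces = p_first_ace * p_second_given_first   # P(A and B) = 1/221

# Independent events by contrast: two heads from two fair coin flips.
# Here P(B | A) = P(B), so the formula reduces to P(A) * P(B).
p_two_heads = Fraction(1, 2) * Fraction(1, 2)      # 1/4
```

Using `Fraction` keeps the probabilities exact, which makes the conditional-probability adjustment (4/52 then 3/51) easy to see.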
17. Continuous Variables & Probability
For continuous variables, probabilities are not calculated for exact values but rather over ranges.
18. t Tests Overview
What is a t Test?: A statistical test used to determine whether there is a significant difference between the means of two groups.
When to Use a t Test:
- Appropriate for comparing means, typically when the population standard deviation is unknown.
- Connection to z-scores: Raw scores can be converted into z-scores to compare how far a value is from the mean.
- z-score formula: z = (X − M) / SD
- Probability Finding Steps:
1. Convert X to z-score.
2. Utilize the z-table (normal distribution) to find the probability under the curve.
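The two steps above can be sketched in Python, with `statistics.NormalDist` playing the role of the z-table (the scores M = 100, SD = 15, X = 130 are illustrative values):

```python
# Step 1: convert X to a z-score; step 2: look up the area under
# the standard normal curve below that z (what a z-table provides).
from statistics import NormalDist

def prob_below(x, mean, sd):
    z = (x - mean) / sd          # z = (X - M) / SD
    return NormalDist().cdf(z)   # cumulative probability below z

# Illustrative values: X = 130 on a scale with M = 100, SD = 15.
p = prob_below(130, 100, 15)     # z = 2.0, so p is roughly 0.977
```

A score equal to the mean gives z = 0 and a probability of 0.5, matching the symmetry of the normal curve.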
19. Key Skills for Exams on t Tests
Distinguish between inferential and descriptive statistics.
Apply the addition and multiplication rules correctly.
Identify independent vs dependent events.
Convert raw scores to z-scores.
Interpret probability from the normal distribution.
20. Summary of t Tests
Types of t Tests:
1. One-Sample t Test: compares the sample mean to a known population mean.
2. Independent-Samples t Test: compares the means of two different groups.
3. Dependent-Samples t Test (Paired t Test): compares means from the same participants measured twice.
t Statistic Formula: t = (difference between means) / (variability of the difference)
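As a concrete sketch, the one-sample version of this formula is t = (M − μ) / (s / √n). The sample below is made up for illustration; df = n − 1 as described in the next section:

```python
# One-sample t statistic: how far the sample mean falls from the
# population mean, in units of the estimated standard error.
from math import sqrt
from statistics import mean, stdev

sample = [102, 98, 105, 110, 95, 104]   # illustrative scores
mu = 100                                 # hypothesized population mean

n = len(sample)
m = mean(sample)
s = stdev(sample)        # sample SD (uses n - 1 in its denominator)
se = s / sqrt(n)         # estimated standard error of the mean
t = (m - mu) / se        # difference between means / variability
df = n - 1               # degrees of freedom for the one-sample test
```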
21. Degrees of Freedom (df)
Explanation: Refers to the number of independent values or quantities which can vary in an analysis without breaking any constraints. E.g.,
- Independent t Test: df=n1+n2−2
- One-sample t Test: df=n−1.
More degrees of freedom provide a more accurate estimate of the parameter being analyzed.
It affects the shape of the t distribution, which resembles the normal curve but has heavier tails; as df increases, the t distribution approaches the normal distribution.
22. Assumptions of t Tests
Random Sampling: Selection of subjects must be random.
Independence of Observations: Each measurement must not influence another.
Normal Distribution: Data should be normally distributed, especially crucial for small sample sizes.
Equal Variances: Certain t tests (e.g., the independent-samples t test) assume equal variances across the compared groups.
23. Effect Size (Cohen’s d)
Definition: A measure of practical significance that indicates the magnitude of difference, not merely statistical significance.
24. Study Guide for Selected Questions
Common Statistical Topics:
Z Tests: Implications, when to apply.
Sampling Distributions: Understand distribution of means vs. individual scores.
Power of Tests: Definitions, implications of sample size, effect size, alpha levels, and one-tailed vs two-tailed tests.
25. Final Summary of Assumptions for z Tests
Assumptions: Random sampling, independence, normal distribution (or large sample size), and known population standard deviation.
26. Additional Insights
Understanding Differences in Distributions: Knowledge of population distributions versus sampling distributions, along with standard deviation and standard error distinctions, is vital for accurate statistical analysis.
Statistical Power and its Components: Recognizing the factors influencing power, such as sample size and effect size, allows for improved anticipation of a test's ability to detect true differences while minimizing error risks.
27. Multiple Choice Questions Review
A review of multiple-choice questions covering concept definitions, statistical properties, and applications to ensure a comprehensive grasp of the subject matter.