# Introduction to Variability and Statistical Thinking

- Course Context: Practical sessions in Weeks 2 and 3 refresh skills in using the standard deviation (SD) and standard error (SE) to conduct z-tests and t-tests. These tests are the foundation for the more complex statistical procedures in later modules (Topics 2 and 3).
- The Necessity of Variability Knowledge: Before beginning statistical testing, it is essential to understand why we need to measure variability.

# Standard Deviation (SD): Foundation of Statistics

- Definition: The standard deviation is a number that tells us by how much, on average, scores in a set differ from the mean.
- Key Functions of SD:
  - It quantifies the amount of variability, "error," or "noise" in a dataset.
  - It predicts how much any single score is expected to differ from the dataset mean through chance/noise alone.
  - It provides a benchmark for deciding whether an observed difference between a single score and the group mean is meaningful or just due to chance.
- Interpretation:
  - A low SD indicates very little variability; the mean is a good representation of the data.
  - A high SD indicates "noisy" or "spread out" data in which scores differ substantially from one another.

# Detailed Steps for Calculating Standard Deviation

1. Deviation Calculation: Work out the deviation of each score from the mean (subtract the mean from the score).
   - Formula: $\text{Deviation} = \text{Score} - \text{Mean}$
2. Squaring Deviations: Square each deviation. This makes every value non-negative; without squaring, the deviations would always sum to zero.
3. Sum of Squares (SS): Add the squared deviations together. The result is known as the Sum of Squares.
4. Variance Calculation:
   - Step 4a (Population): If you have the scores from an entire population, divide the SS by the total number of scores ($N$). This is the population variance.
   - Step 4b (Sample): If the data come from a representative sample, divide the SS by the total number of scores minus one ($N - 1$). This is the sample variance.
5. Square Root: Take the square root of the variance to obtain the standard deviation (SD). This undoes the squaring from step 2.

# SD for a Sample of Scores and Degrees of Freedom

- The Complication: A sample is a subset of a population and will tend to show slightly less variability than the population. Calculating the sample SD by dividing by $N$ would therefore systematically underestimate population variability.
- The Adjustment: To compensate, statisticians divide the SS by $N - 1$ instead of $N$. This makes the computed SD slightly larger, giving a better estimate of the true population variability.
- Degrees of Freedom (df): The value $N - 1$ is formally known as the degrees of freedom.

# Application Example: Impulsivity Test

- Dataset: 4, 1, 6, 4, 5
- Calculation Breakdown:
  - Mean: $(4 + 1 + 6 + 4 + 5) / 5 = 4$
  - Deviations from Mean: 0, −3, 2, 0, 1
  - Squared Deviations: 0, 9, 4, 0, 1
  - Sum of Squares (SS): 14
  - Variance (Sample): $14 / (5 - 1) = 3.5$
  - Standard Deviation (SD): $\text{SD} = \sqrt{3.5} = 1.87$
- Interpretation: The impulsivity scores in the sample differ from the mean by an average of 1.87 points. This amount reflects "error variability" or "noise" (chance effects, experimental error, individual differences).
- Practice Sets:
  - Set 1: 12, 14, 11, 10, 13 (Ans: SD = 1.58)
  - Set 2: 3, 5, 4, 1, 6 (Ans: SD = 1.92)
  - Set 3: 54, 61, 53, 57, 50 (Ans: SD = 4.18)

# Z-Scores and Standardization

- Definition: A z-score tells us how far a particular score is from the mean relative to the variability in the sample (SD). It is the ratio of the score's difference from the mean to the SD.
- Purpose: We standardise scores to make them comparable and so that standardised tables can be used to determine whether a score is significantly different from the others.
- Z-score Formula: $z = \dfrac{\text{score} - \text{mean}}{\text{SD}}$
- Probability (p-value): z-score tables give the probability ($p$) of obtaining a score at least that far from the mean by chance.
- Significance threshold: If $p < .05$, the score is significantly different from the mean (unlikely to be due to chance).

# One-Tailed vs. Two-Tailed Tests

- One-Tailed Test: Measures the probability of a score falling in one specific side (tail) of the distribution.
  - Example: A z-score of 1.26 has a one-tailed $p = .1038$.
- Two-Tailed Test: Measures the probability of a score falling in either tail (above or below the mean).
  - The p-value for a two-tailed test is double that of a one-tailed test.
  - Example: For $z = 1.26$, the two-tailed $p = .2076$. This makes significance harder to achieve than in a one-tailed test.

# Standard Error (SE) vs. Standard Deviation (SD)

- SD (Standard Deviation): Tells us how much, on average, scores in a set differ from the mean of that set. It measures random error variability within a set.
- SE (Standard Error): Tells us how much, on average, sample means ($M$) of a specific size differ from the mean of the larger population ($\mu$). It measures error variability when comparing a sample to a population.
- SE Formula: $\text{SE} = \sqrt{\dfrac{\text{Variance}}{N}}$ or, equivalently, $\text{SE} = \dfrac{\text{SD}}{\sqrt{N}}$
- Relationships: SE grows with population variability and shrinks as sample size ($N$) grows: it is proportional to the SD and inversely proportional to $\sqrt{N}$. Increasing $N$ decreases SE.

# The Z-Test for a Sample of Scores

- Purpose: Used to determine whether a sample mean is significantly different from a population mean.
- Formula: $z = \dfrac{M - \mu}{\text{SE}}$, where $M$ is the sample mean and $\mu$ is the population mean.
- Critical Values:
  - Two-tailed critical value: $z = 1.96$ (corresponds to $p = .05$).
  - One-tailed critical value: $z = 1.645$, often rounded to 1.64 (corresponds to $p = .05$).

# Logic of Statistical Difference Tests

- Null Hypothesis ($H_0$): Always states that any obtained difference is due solely to error variability/chance.
- General Formula: $\text{Statistical Value} = \dfrac{\text{Obtained Difference}}{\text{Difference expected due to error variability}}$
- Interpretation: The resulting value (e.g., $z$, $t$, $F$) represents how many times greater the obtained difference is than the error difference expected under the null hypothesis.
- Significance Criterion: We accept $p < .05$ as significant.
  This implies a 5% chance of a Type I error (rejecting the null hypothesis when it is actually true).

# Introduction to T-Tests

- Why t-tests: z-tests require knowing the population variance. In real-life research we rarely have this information and must use sample variances as estimates, which necessitates the move to t-tests.
- Related t-test: Used when the data come from the same participants (within-subjects) or from matched pairs.
- Unrelated t-test: Used when the data come from two separate groups of participants (between-subjects).

# Related T-Test Calculation

- Step 1: Find the difference score ($D$) for each participant (e.g., $\text{Score}_A - \text{Score}_B$).
- Step 2: Calculate the mean of the difference scores ($\bar{D}$).
- Step 3: Calculate the variance of the difference scores ($\text{Var}_{\text{diff}}$) using $SS / (N - 1)$.
- Step 4: Calculate the standard error (SE): $\text{SE} = \sqrt{\dfrac{\text{Var}_{\text{diff}}}{N}}$
- Step 5: Calculate the t-statistic: $t = \dfrac{\bar{D}}{\text{SE}}$
- Degrees of freedom: $df = N - 1$ (where $N$ is the number of pairs).

# Unrelated T-Test Calculation

- Step 1: Calculate the means ($M_A$, $M_B$) of the two separate groups.
- Step 2: Calculate the variance of each group ($\text{Var}_A$, $\text{Var}_B$) using $SS / (n - 1)$, where $n$ is the size of that group ($N_A$ or $N_B$).
- Step 3: Calculate the pooled standard error (SE): $\text{SE} = \sqrt{\dfrac{\text{Var}_A}{N_A} + \dfrac{\text{Var}_B}{N_B}}$
- Step 4: Calculate the observed difference: $M_A - M_B$.
- Step 5: Calculate the t-statistic: $t = \dfrac{M_A - M_B}{\text{SE}}$
- Degrees of freedom: $df = (N_A - 1) + (N_B - 1)$

# Statistical Power and Experimental Design

- Statistical power: Reflects the sensitivity of a test to detect when the null hypothesis is untrue.
- Related designs: Generally more powerful than unrelated designs because they eliminate individual differences. In related designs, participants act as their own controls.
- Standard error: The SE for unrelated tests reflects individual differences plus random error; the SE for related tests reflects only random error.
- Limitations of related designs: Susceptible to carryover effects, fatigue, practice effects, and participants guessing the hypothesis. These are managed through counterbalancing.
- Advantages of unrelated designs: Not subject to carryover effects; higher degrees of freedom.
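
The related and unrelated t-test steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a replacement for a statistics package. The function names (`sample_variance`, `related_t`, `unrelated_t`) and the scores are hypothetical and were invented for this example; they are not course data.

```python
import math


def sample_variance(scores):
    """Sample variance: sum of squared deviations divided by N - 1."""
    mean = sum(scores) / len(scores)
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / (len(scores) - 1)


def related_t(cond_a, cond_b):
    """Related (paired) t-test. Returns (t, df)."""
    if len(cond_a) != len(cond_b):
        raise ValueError("paired conditions need the same number of scores")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]  # difference scores D
    n = len(diffs)
    mean_diff = sum(diffs) / n                       # mean of D
    se = math.sqrt(sample_variance(diffs) / n)       # SE = sqrt(Var_diff / N)
    return mean_diff / se, n - 1


def unrelated_t(group_a, group_b):
    """Unrelated (independent) t-test. Returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # SE = sqrt(Var_A / N_A + Var_B / N_B)
    se = math.sqrt(sample_variance(group_a) / n_a + sample_variance(group_b) / n_b)
    return (mean_a - mean_b) / se, (n_a - 1) + (n_b - 1)


# Hypothetical scores, invented for illustration only.
scores_a = [7, 5, 8, 6, 9]
scores_b = [5, 4, 6, 5, 6]

t_rel, df_rel = related_t(scores_a, scores_b)
t_unrel, df_unrel = unrelated_t(scores_a, scores_b)
print(f"related:   t({df_rel}) = {t_rel:.2f}")
print(f"unrelated: t({df_unrel}) = {t_unrel:.2f}")
```

With these invented numbers, treating the same scores as paired gives a larger t than treating them as two separate groups, which is in line with the point that the SE of a related test excludes individual differences.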
# Reporting T-Test Results

- Required elements: Include the test type, means (M), standard deviations (SD), degrees of freedom (df), the t-value, and the p-value.
- Unrelated Example: "…revealed no significant difference… (M = 65, SD = 13.02) compared to the group… (M = 71.60, SD = 11.19), t(8) = 0.86, p > .05."
- Related Example: "…revealed that money spent when not hungry (M = 6.05, SD = 1.30) was significantly less than when hungry (M = 7.16, SD = 1.12), t(4) = 4.39, p < .05."

# Worked Solutions for Practice Set 1

- Related T-test Results:
  - Mean difference score = 2.00
  - $SS_{\text{diff}}$ = 2.00
  - $\text{Var}_{\text{diff}}$ = 0.50
  - SE = 0.32
  - t-value = 6.25
  - df = 4
  - Critical t-value ($t_{\text{crit}}$) = 2.776
- Unrelated T-test Results:
  - Mean A = 5.00; Mean B = 3.00 (Difference = 2.00)
  - $\text{Var}_A$ = 4.00; $\text{Var}_B$ = 2.50
  - Pooled SE = 1.14
  - t-value = 1.75
  - df = 8
  - Critical t-value ($t_{\text{crit}}$) = 2.306

# Worked Solutions for Practice Set 2

- Related T-test Results:
  - Mean difference score = 2.80
  - $SS_{\text{diff}}$ = 10.80
  - $\text{Var}_{\text{diff}}$ = 2.70
  - SE = 0.73
  - t-value = 3.84
  - df = 4
  - Critical t-value ($t_{\text{crit}}$) = 2.776
- Unrelated T-test Results:
  - Mean A = 15.80; Mean B = 13.00 (Difference = 2.80)
  - $\text{Var}_A$ = 6.20; $\text{Var}_B$ = 2.50
  - Pooled SE = 1.319
  - t-value = 2.123
  - df = 8
  - Critical t-value ($t_{\text{crit}}$) = 2.306
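
As a cross-check, the worked solutions above can be re-derived from the summary values they list. The sketch below assumes 5 pairs for the related tests and 5 scores per group for the unrelated tests. That is inferred from df = 4 and df = 8 and is not stated in the notes. The helper names are hypothetical.

```python
import math


def related_from_summary(mean_diff, ss_diff, n_pairs):
    """Related t-test from summary values. Returns (variance, SE, t, df)."""
    var_diff = ss_diff / (n_pairs - 1)
    se = math.sqrt(var_diff / n_pairs)
    return var_diff, se, mean_diff / se, n_pairs - 1


def unrelated_from_summary(mean_a, mean_b, var_a, var_b, n_a, n_b):
    """Unrelated t-test from summary values. Returns (SE, t, df)."""
    se = math.sqrt(var_a / n_a + var_b / n_b)
    return se, (mean_a - mean_b) / se, (n_a - 1) + (n_b - 1)


# Assumption: 5 pairs / 5 scores per group, inferred from df = 4 and df = 8.
N = 5
# Critical values as listed in the worked solutions.
T_CRIT_RELATED, T_CRIT_UNRELATED = 2.776, 2.306

# related: (mean difference, SS_diff); unrelated: (mean A, mean B, Var A, Var B)
practice_sets = {
    "Set 1": {"related": (2.00, 2.00), "unrelated": (5.00, 3.00, 4.00, 2.50)},
    "Set 2": {"related": (2.80, 10.80), "unrelated": (15.80, 13.00, 6.20, 2.50)},
}

results = {}
for name, s in practice_sets.items():
    var_d, se_rel, t_rel, df_rel = related_from_summary(*s["related"], N)
    mean_a, mean_b, var_a, var_b = s["unrelated"]
    se_un, t_un, df_un = unrelated_from_summary(mean_a, mean_b, var_a, var_b, N, N)
    results[name] = (t_rel, t_un)
    print(f"{name} related:   Var={var_d:.2f} SE={se_rel:.3f} "
          f"t({df_rel})={t_rel:.2f} significant={abs(t_rel) > T_CRIT_RELATED}")
    print(f"{name} unrelated: SE={se_un:.3f} "
          f"t({df_un})={t_un:.3f} significant={abs(t_un) > T_CRIT_UNRELATED}")
```

The related t-values computed this way (about 6.32 and 3.81) differ slightly from the listed 6.25 and 3.84 because the listed values divide by the SE after rounding it to two decimal places. The decision against the critical value is the same either way.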