Variability, Z-Tests, and T-Tests

# Introduction to Variability and Statistical Thinking

- Course Context: Practical sessions in Weeks 2 and 3 refresh skills in using the standard deviation ($SD$) and standard error ($SE$) to conduct $z$-tests and $t$-tests. These tests are the foundation for the more complex statistical procedures in later modules (Topics 2 and 3).
- The Necessity of Variability Knowledge: Before beginning statistical testing, it is essential to understand why we need to measure variability.

# Standard Deviation (SD): Foundation of Statistics

- Definition: The standard deviation is a number that tells us by how much, on average, scores in a set differ from the mean.
- Key Functions of SD:
  - It quantifies the amount of variability, "error," or "noise" in a dataset.
  - It predicts how much any single score is expected to differ from the dataset mean because of chance or noise.
  - It provides a benchmark for deciding whether an observed difference between a single score and the group mean is meaningful or just due to chance.
- Interpretation:
  - A low SD indicates very little variability; the mean is a good representation of the data.
  - A high SD indicates "noisy" or "spread out" data in which scores differ substantially from one another.

# Detailed Steps for Calculating Standard Deviation

1. Deviation Calculation: Work out the deviation of each score from the mean (subtract the mean from the score).
   - Formula: $\text{Deviation} = \text{Score} - \text{Mean}$
2. Squaring Deviations: Square each deviation. This makes every value non-negative; without squaring, the deviations would always sum to zero.
3. Sum of Squares (SS): Add the squared deviations together. The result is known as the Sum of Squares.
4. Variance Calculation:
   - Step 4a (Population): If you have all scores from an entire population, divide the $SS$ by the total number of scores ($N$). This is the population variance.
   - Step 4b (Sample): If the data come from a representative sample, divide the $SS$ by the total number of scores minus one ($N - 1$). This is the sample variance.
5. Square Root: Take the square root of the variance to obtain the standard deviation ($SD$). This undoes the squaring from step 2.

# SD for a Sample of Scores and Degrees of Freedom

- The Complication: A sample is a subset of a population, and its deviations are measured from the sample's own mean, so it tends to show slightly less variability than the population. Calculating the sample SD with $N$ would therefore tend to underestimate population variability.
- The Adjustment: To compensate, statisticians divide the $SS$ by $N - 1$ instead of $N$. This makes the computed $SD$ slightly larger, giving a better estimate of the true population variability.
- Degrees of Freedom ($df$): The value $N - 1$ is formally known as the degrees of freedom.

# Application Example: Impulsivity Test

- Dataset: $4, 1, 6, 4, 5$
- Calculation Breakdown:
  - Mean: $(4 + 1 + 6 + 4 + 5) / 5 = 4$
  - Deviations from Mean: $0, -3, 2, 0, 1$
  - Squared Deviations: $0, 9, 4, 0, 1$
  - Sum of Squares (SS): $14$
  - Variance (Sample): $14 / (5 - 1) = 3.5$
  - Standard Deviation (SD): $\text{SD} = \sqrt{3.5} = 1.87$
- Interpretation: The impulsivity scores in the sample differ from the mean by an average of $1.87$ points. This amount reflects "error variability" or "noise" (chance effects, experimental error, individual differences).
- Practice Sets:
  - Set 1: $12, 14, 11, 10, 13$ (Ans: $SD = 1.58$)
  - Set 2: $3, 5, 4, 1, 6$ (Ans: $SD = 1.92$)
  - Set 3: $54, 61, 53, 57, 50$ (Ans: $SD = 4.18$)

# Z-Scores and Standardization

- Definition: A $z$-score tells us how far a particular score lies from the mean relative to the variability in the sample ($SD$). It is the ratio of the score's difference from the mean to the $SD$.
- Purpose: We standardise scores to make them comparable and so that standardised tables can be used to determine whether a score differs significantly from the others.
- Z-score Formula: $z = \frac{\text{score} - \text{mean}}{\text{SD}}$
- Probability ($p$-value): $z$-score tables give the probability ($p$) of obtaining a score at least that far from the mean by chance.
  - Significance threshold: If $p < .05$, the score is significantly different from the mean (unlikely to be due to chance).

# One-Tailed vs. Two-Tailed Tests

- One-Tailed Test: Gives the probability of a score falling in one specified side (tail) of the distribution.
  - Example: A $z$-score of $1.26$ has a one-tailed $p = .1038$.
- Two-Tailed Test: Gives the probability of a score falling that far out in either tail (above or below the mean).
  - The $p$-value for a two-tailed test is double that of a one-tailed test.
  - Example: For $z = 1.26$, the two-tailed $p = .2076$. This makes significance harder to achieve than in a one-tailed test.

# Standard Error (SE) vs. Standard Deviation (SD)

- SD (Standard Deviation): Tells us how much, on average, scores in a set differ from the mean of that set. It measures random error variability within a set.
- SE (Standard Error): Tells us how much, on average, the means ($M$) of samples of a given size differ from the mean of the larger population ($\mu$). It measures error variability when comparing a sample to a population.
- SE Formula: $SE = \sqrt{\frac{\text{Variance}}{N}}$ or, equivalently, $SE = \frac{\text{SD}}{\sqrt{N}}$
- Relationships: $SE$ grows with variability (it is proportional to the $SD$, the square root of the variance) and shrinks as the sample gets larger (it is inversely proportional to $\sqrt{N}$). Increasing $N$ decreases $SE$.

# The Z-Test for a Sample of Scores

- Purpose: Used to determine whether a sample mean is significantly different from a population mean.
- Formula: $z = \frac{\text{sample mean} - \text{population mean}}{SE}$
- Critical Values:
  - Two-tailed critical value: $z = 1.96$ (corresponds to $p = .05$).
  - One-tailed critical value: $z = 1.64$ (more precisely $1.645$; corresponds to $p = .05$).

# Logic of Statistical Difference Tests

- Null Hypothesis ($H_0$): Always states that any obtained difference is due solely to error variability (chance).
- General formula: $\text{Statistical Value} = \frac{\text{Obtained Difference}}{\text{Difference expected due to error variability}}$
- Interpretation: The resulting value (e.g., $z$, $t$, $F$) expresses how many times larger the obtained difference is than the difference expected from error alone under the null hypothesis.
- Significance level: We accept $p < .05$ as significant. This means that, when the null hypothesis is actually true, there is a 5% chance of a Type I error (rejecting it anyway).

# Introduction to T-Tests

- Why $t$-tests: $z$-tests require knowing the population variance. In real-life research we rarely have this information and must use sample variances as estimates, which necessitates the move to $t$-tests.
- Related $t$-test: Used when the data come from the same participants (within-subjects) or from matched pairs.
- Unrelated $t$-test: Used when the data come from two separate groups of participants (between-subjects).

# Related T-Test Calculation

- Step 1: Find the difference score ($D$) for each participant (e.g., $\text{Score A} - \text{Score B}$).
- Step 2: Calculate the mean of the difference scores ($\bar{D}$).
- Step 3: Calculate the variance of the difference scores ($Var_{Diff}$) using $SS / (N - 1)$.
- Step 4: Calculate the standard error ($SE$): $SE = \sqrt{\frac{Var_{Diff}}{N}}$
- Step 5: Calculate the $t$-statistic: $t = \frac{\bar{D}}{SE}$
- Degrees of freedom: $df = N - 1$ (where $N$ is the number of pairs).

# Unrelated T-Test Calculation

- Step 1: Calculate the means ($M_A$, $M_B$) of the two separate groups.
- Step 2: Calculate the variance of each group ($Var_A$, $Var_B$) using $SS / (N - 1)$, with that group's own $N$.
- Step 3: Calculate the pooled standard error ($SE$): $SE = \sqrt{\frac{Var_A}{N_A} + \frac{Var_B}{N_B}}$
- Step 4: Calculate the observed difference: $M_A - M_B$.
- Step 5: Calculate the $t$-statistic: $t = \frac{M_A - M_B}{SE}$
- Degrees of freedom: $df = (N_A - 1) + (N_B - 1)$

# Statistical Power and Experimental Design

- Statistical power: Reflects the sensitivity of a test, that is, its ability to detect when the null hypothesis is untrue.
- Related designs: Generally more powerful than unrelated designs because they eliminate individual differences. In related designs, participants act as their own controls.
- Standard error: $SE$ for unrelated tests reflects individual differences plus random error; $SE$ for related tests reflects only random error.
- Drawbacks of related designs: Susceptible to carryover effects, fatigue, practice effects, and participants guessing the hypothesis. These are managed through counterbalancing.
- Unrelated designs: Not subject to carryover effects; higher degrees of freedom.

# Reporting T-Test Results

- What to report: Include the test type, means ($M$), standard deviations ($SD$), degrees of freedom ($df$), the $t$-value, and the $p$-value.
- Unrelated Example: "…revealed no significant difference… (M = 65, SD = 13.02) compared to the group… (M = 71.60, SD = 11.19), t(8) = 0.86, p > .05."
- Related Example: "…revealed that money spent when not hungry (M = 6.05, SD = 1.30) was significantly less than when hungry (M = 7.16, SD = 1.12), t(4) = 4.39, p < .05."
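
As a rough illustration of the related and unrelated $t$-test steps described in these notes, the following Python sketch computes both statistics from raw scores. The function names (`sample_variance`, `related_t`, `unrelated_t`) and the two score lists are hypothetical, invented only for this example; they are not taken from the course materials. It is a sketch of the hand calculation, not a replacement for a statistics package.

```python
import math


def sample_variance(scores):
    """Sample variance: sum of squared deviations divided by N - 1."""
    mean = sum(scores) / len(scores)
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (len(scores) - 1)


def related_t(scores_a, scores_b):
    """Related (paired) t-test; returns (t, df)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]  # difference scores D
    n = len(diffs)                                       # number of pairs
    mean_diff = sum(diffs) / n
    se = math.sqrt(sample_variance(diffs) / n)
    return mean_diff / se, n - 1


def unrelated_t(group_a, group_b):
    """Unrelated (independent-groups) t-test; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    se = math.sqrt(sample_variance(group_a) / n_a
                   + sample_variance(group_b) / n_b)
    return (mean_a - mean_b) / se, (n_a - 1) + (n_b - 1)


# Hypothetical scores, invented for this example.
condition_a = [3, 4, 4, 6, 8]
condition_b = [1, 2, 3, 4, 5]

t_rel, df_rel = related_t(condition_a, condition_b)
t_unrel, df_unrel = unrelated_t(condition_a, condition_b)
print(f"related:   t({df_rel}) = {t_rel:.2f}")
print(f"unrelated: t({df_unrel}) = {t_unrel:.2f}")
```

With these invented scores the related test gives $t(4) \approx 6.32$ and the unrelated test gives $t(8) \approx 1.75$. The mean difference of 2 points is the same in both cases, but the related test divides it by a much smaller $SE$ because individual differences have been removed.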
# Worked Solutions for Practice Set 1

- Related T-test Results:
  - Mean difference score = $2.00$
  - $SS_{Diff} = 2.00$
  - $Var_{Diff} = 0.50$
  - $SE = 0.32$
  - $t$-value = $6.25$
  - $df = 4$
  - Critical $t$-value ($t_{crit}$) = $2.776$
- Unrelated T-test Results:
  - Mean A = $5.00$; Mean B = $3.00$ (Difference = $2.00$)
  - $Var_A = 4.00$; $Var_B = 2.50$
  - Pooled $SE = 1.14$
  - $t$-value = $1.75$
  - $df = 8$
  - Critical $t$-value ($t_{crit}$) = $2.306$

# Worked Solutions for Practice Set 2

- Related T-test Results:
  - Mean difference score = $2.80$
  - $SS_{Diff} = 10.80$
  - $Var_{Diff} = 2.70$
  - $SE = 0.73$
  - $t$-value = $3.84$
  - $df = 4$
  - Critical $t$-value ($t_{crit}$) = $2.776$
- Unrelated T-test Results:
  - Mean A = $15.80$; Mean B = $13.00$ (Difference = $2.80$)
  - $Var_A = 6.20$; $Var_B = 2.50$
  - Pooled $SE = 1.319$
  - $t$-value = $2.123$
  - $df = 8$
  - Critical $t$-value ($t_{crit}$) = $2.306$
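
The worked solutions above can be re-derived from their summary statistics alone. The sketch below is a minimal Python illustration, assuming five scores per condition; that figure is inferred from $df = 4$ for the related tests and $df = 8$ for the unrelated tests, since the group sizes are not stated explicitly. The function names are hypothetical.

```python
import math


def related_t_from_summary(mean_diff, var_diff, n_pairs):
    """Return (SE, t) for a related t-test from summary statistics."""
    se = math.sqrt(var_diff / n_pairs)
    return se, mean_diff / se


def unrelated_t_from_summary(mean_a, mean_b, var_a, var_b, n_a, n_b):
    """Return (SE, t) for an unrelated t-test from summary statistics."""
    se = math.sqrt(var_a / n_a + var_b / n_b)
    return se, (mean_a - mean_b) / se


def is_significant(t, t_crit):
    """Significant when |t| exceeds the critical value from the tables."""
    return abs(t) > t_crit


N = 5  # ASSUMPTION: scores per condition, inferred from the reported df

# (label, SE, t, critical t); critical values are those listed in the notes.
results = []
se, t = related_t_from_summary(2.00, 0.50, N)
results.append(("Set 1 related", se, t, 2.776))
se, t = unrelated_t_from_summary(5.00, 3.00, 4.00, 2.50, N, N)
results.append(("Set 1 unrelated", se, t, 2.306))
se, t = related_t_from_summary(2.80, 2.70, N)
results.append(("Set 2 related", se, t, 2.776))
se, t = unrelated_t_from_summary(15.80, 13.00, 6.20, 2.50, N, N)
results.append(("Set 2 unrelated", se, t, 2.306))

for label, se, t, t_crit in results:
    verdict = "significant" if is_significant(t, t_crit) else "not significant"
    print(f"{label}: SE = {se:.3f}, t = {t:.2f}, {verdict}")
```

Carrying the unrounded $SE$ through gives $t \approx 6.32$ and $t \approx 3.81$ for the two related tests, slightly different from the listed 6.25 and 3.84, which appear to come from dividing by the rounded $SE$ (0.32 and 0.73). The decisions are the same either way: both related tests exceed $t_{crit} = 2.776$, and both unrelated tests fall short of $t_{crit} = 2.306$.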