An Introduction to Psychological Statistics
Open Educational Resources Collection: This textbook, "An Introduction to Psychological Statistics," was published on November 13, 2018, by the University of Missouri-St. Louis as part of the IRL @ UMSL Open Educational Resources Collection.
Authors and Affiliations
- Garett C. Foster: University of Missouri-St. Louis (fostergc@umsl.edu)
- David Lane: Rice University (lane@rice.edu)
- David Scott: Rice University
- Mikki Hebl: Rice University
- Rudy Guerra: Rice University
- Dan Osherson: (Further affiliation details not provided)
- Heidi Zimmer: (Further affiliation details not provided)
The textbook is available online at: https://irl.umsl.edu/oer/4
Acknowledgments and Licensing
This work was developed as part of the University of Missouri's Affordable and Open Access Educational Resources Initiative (https://www.umsystem.edu/ums/aa/oer).
- Adaptation Source: The contents are adapted from "Online Statistics Education: A Multimedia Course of Study" (http://onlinestatbook.com/), led by David M. Lane, Rice University.
- Adaptation Details: Dr. Garett C. Foster of the Department of Psychological Sciences made changes to the original open-access resources to tailor the text for an introductory statistics course for psychology majors at the University of Missouri – St. Louis. These changes involved combining, reorganizing, and adding new material.
- Responsibility for Errors: Dr. Foster assumes responsibility for any conceptual, mathematical, or typographical errors in this adapted work.
- Licensing: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Prologue: A Letter to Students from Garett C. Foster, Ph.D.
- Empathy for Students: Dr. Foster acknowledges that statistics can be a challenging and dreaded subject, based on his personal experience as a student.
- Personal Transformation: Initially disliked, statistics became deeply valued by Dr. Foster after understanding its underlying logic.
- Relevance of Statistics:
- Objective Information Filtering: Statistics provides a crucial method for objectively filtering the constant flood of information encountered in daily life.
- Signal vs. Noise: It helps discern genuine patterns or trends (signal) from mere randomness (noise).
- Logic in Everyday Life: The logical framework of statistics is applicable not only to numerical data but also to general life situations.
- Encouragement: Students are encouraged to persist through initial difficulties, with the promise that understanding will eventually lead to a new perspective.
- Supportive Stance: Dr. Foster assures students that he is on their side, having experienced similar struggles with the concepts.
- Textbook Design: The text is designed to emphasize connections between chapters, topics, and methods, making the material useful and important beyond just formulas.
Table of Contents Overview
This textbook is structured into multiple chapters, covering fundamental statistical concepts and various hypothesis testing methods. The main units are:
- Unit 1: Fundamentals of Statistics
- Introduces basic principles, terminology, notation, and the role of statistics in behavioral sciences.
- Concludes with an introduction to foundational probability concepts.
- Serves as building blocks for hypothesis testing.
- Unit 2: Hypothesis Testing
- Focuses on applying principles to formal hypothesis testing, particularly regarding means.
- Unit 3: Additional Hypothesis Tests
- Continues hypothesis testing with more complex data types, including multiple groups (ANOVA), continuous data relations (correlation/regression), and categorical data (Chi-square).
Chapter 1: Introduction
What are Statistics?
- Definition of Statistics: Statistics encompass numerical facts and figures used to analyze, interpret, display, and make decisions based on data.
- Examples of numerical facts:
- Largest earthquake: 9.2 on the Richter scale.
- Men are at least 10 times more likely to commit murder than women.
- 1 in 8 South Africans is HIV positive.
- By 2020, there will be 15 people aged 65+ for every new baby born.
- Beyond Facts and Figures: The study of statistics involves not just calculations but also critical evaluation of how numbers are chosen and interpreted.
- Example 1: Ice Cream Sales and Advertising:
- Claim: New ad led to a 30% increase in sales for three months (June, July, August).
- Flaw (History Effect): Ice cream consumption naturally increases in summer months, regardless of ads. This falsely attributes outcomes to one variable when another (time-related) is responsible.
- Example 2: Churches and Crime:
- Claim: More churches in a city, more crime; therefore, churches lead to crime.
- Flaw (Third-Variable Problem): Larger populations explain both increased churches and increased crime rates. A third variable (population size) causes both observed trends.
- Example 3: Interracial Marriages:
- Claim: 75% more interracial marriages this year than 25 years ago; therefore, society accepts interracial marriages.
- Flaw (Insufficient Information): Lacks context on baseline rates or historical fluctuations. If 1% of marriages were interracial 25 years ago, a 75% increase means only 1.75% (still a small proportion) are interracial now, which doesn't necessarily indicate broad acceptance. Also, historical fluctuations are unknown.
- Conclusion on Statistics: Statistics is the language of science and data, providing an objective, precise, and powerful tool for communication among researchers and in everyday life.
What Statistics are Not
- Misconception: Statistics is purely a math class.
- Reality: While math is a central component, statistics is a broader discipline focused on organizing, interpreting, and communicating information objectively. It offers a unique lens for viewing reality, not just manipulating numbers.
Why Do We Study Statistics?
- Scientific Communication: Statistics is the fundamental language for communicating findings in behavioral sciences and other scientific fields. It bridges research ideas and actionable conclusions.
- Data Interpretation: Essential for making sense of large datasets (hundreds to thousands of observations) that would be unintelligible otherwise.
- Personal Empowerment:
- Evaluating Claims: Provides tools to critically evaluate claims and data encountered daily, enabling informed decision-making and preventing manipulation.
- Critical Thinking: Encourages questioning the source, procedures, and context of statistical claims.
- Examples of Statistical Claims (to be critically evaluated):
- 4 out of 5 dentists recommend Dentine.
- Almost 85% of lung cancers in men and 45% in women are tobacco-related.
- Condoms are effective 94% of the time.
- People are more persuasive when they look others directly in the eye and speak loudly and quickly.
- Women make 75 cents to every dollar a man makes for the same job.
- A new study shows eating egg whites can increase life span.
- It is very unlikely there will ever be another baseball player with a batting average over .400.
- There is an 80% chance that in a room of 30 people, at least two will share the same birthday.
- 79.48% of all statistics are made up on the spot.
- Distinguishing Proper vs. Deceptive Use: Learning statistics helps identify both fraudulent claims and valid evidence, making it highly applicable to daily life regardless of career path.
- Benjamin Disraeli Quote (via Mark Twain): "There are three kinds of lies -- lies, damned lies, and statistics." Underscores the importance of understanding statistics to avoid deception.
Types of Data and How to Collect Them
- Definition of Data: Measured values of variables.
- Definition of Variable: A characteristic or feature of interest (e.g., stress, anxiety, physical health in psychology).
- Necessity of Understanding Data: Before statistical analysis, it's crucial to understand the nature of data (what it represents and its origin).
Types of Variables
- Independent Variable (IV): A variable manipulated by an experimenter to determine its effect.
- Example 1: Blueberries and Aging in Rats
- IV: Dietary supplement (none, blueberry, strawberry, spinach powder).
- Dependent Variables (DV): Memory test, motor skills test.
- Example 2: Beta-Carotene and Cancer
- IV: Supplements (beta-carotene or placebo).
- DV: Occurrence of cancer.
- Example 3: Brake Light Brightness and Reaction Time
- IV: Brightness of brake lights.
- DV: Time to hit brakes.
- Dependent Variable (DV): The variable measured to observe the effect of the independent variable.
- Levels of an Independent Variable: The number of experimental conditions for an IV.
- e.g., Treatment (experimental, control) has 2 levels.
- e.g., Types of diets (5 types) has 5 levels.
Qualitative and Quantitative Variables
- Qualitative Variables (Categorical Variables): Express a non-numerical attribute (e.g., hair color, gender, religion, favorite movie).
- Values do not imply numerical ordering.
- Example: "Type of supplement" in blueberry study (nominal).
- Quantitative Variables: Measured in terms of numbers (e.g., height, weight, shoe size, memory performance score).
Discrete and Continuous Variables
- Discrete Variables: Possible scores are distinct points on a scale (e.g., number of children: 3 or 6, not 4.53).
- Continuous Variables: Scale is continuous, not discrete steps (e.g., time to respond: 1.64 sec or 1.64237123922121 sec).
- Practical measurements may limit true continuity.
Levels of Measurement (Scale Types)
- Nominal Scales:
- Simply name or categorize responses.
- Examples: Gender, handedness, favorite color, religion.
- Do not imply any ordering among responses.
- Lowest level of measurement.
- Ordinal Scales:
- Items in the scale are ordered (e.g., "very dissatisfied" to "very satisfied").
- Allow comparisons of degree (one person is "more satisfied" than another).
- Limitation: Differences between adjacent levels cannot be assumed to be equal intervals.
- Example: Difference between "very dissatisfied" and "somewhat dissatisfied" may not equal difference between "somewhat dissatisfied" and "somewhat satisfied."
- Assigning numbers (1-4) to categories does not change this fundamental property.
- Interval Scales:
- Numerical scales where intervals have the same interpretation throughout.
- Example: Fahrenheit temperature scale (the difference between 30° and 40° is the same as between 80° and 90°).
- Limitation: Do not have a true zero point.
- 0° Fahrenheit does not mean the absence of temperature.
- Ratios are not meaningful (e.g., 80° is not "twice as hot" as 40° because of the arbitrary zero point).
- Ratio Scales:
- Most informative scale.
- An interval scale with the additional property that its zero position indicates the absence of the quantity being measured.
- Possesses properties of nominal (labels), ordinal (ordering), and interval (equal intervals) scales.
- Ratios are meaningful.
- Example: Kelvin temperature scale (absolute zero means no kinetic energy).
- Example: Amount of money (zero means absence of money; 50 cents is twice 25 cents).
- Level of Measurement for Psychological Variables:
- Rating Scales: Frequently used (e.g., 5-point, 7-point scales for pain, liking, attitudes).
- Typically considered ordinal scales due to lack of assurance of equal intervals.
- Example: Reducing pain from 3 to 2 might not equal relief from 7 to 6.
- Number of Correctly Recalled Items:
- Can be argued as a ratio scale (true zero, difference of one item is consistent).
- Complication: If items vary in difficulty (e.g., 5 easy, 5 difficult from 10 items list), a difference of one easy item may not represent the same memory difference as one difficult item.
- General Point: It's often inappropriate to consider psychological measurement scales as strictly interval or ratio due to subtle underlying qualitative differences.
- Consequences of Level of Measurement:
- The chosen level of measurement dictates which statistics can be meaningfully computed.
- Example: Favorite Color (Nominal Scale)
- If colors are coded (Blue=1, Red=2, Yellow=3, Green=4, Purple=5), calculating an average code (e.g., 3) and concluding the "average favorite color is yellow" is senseless.
- This is because nominal scales lack inherent order or numerical meaning.
- Mean of Ordinal Scale: Debated among statisticians.
- Prevailing Opinion: Meaningful in almost all practical situations.
- Caution: Can be very misleading in extreme situations.
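As a quick illustration of why level of measurement matters, the mode is the appropriate summary for nominal data, while the mean of arbitrary category codes is computable but meaningless. A minimal sketch, using hypothetical favorite-color responses (not data from the text):

```python
from collections import Counter
from statistics import mean

# Hypothetical favorite-color responses, coded Blue=1, Red=2, Yellow=3, Green=4, Purple=5
codes = {"Blue": 1, "Red": 2, "Yellow": 3, "Green": 4, "Purple": 5}
responses = ["Blue", "Purple", "Blue", "Green", "Purple", "Purple", "Red"]

# The arithmetic mean of the codes depends entirely on the arbitrary
# numbering, not on the colors, so it carries no meaning.
mean_code = mean(codes[r] for r in responses)   # ~3.29, but "3 = Yellow" is senseless

# The mode (most frequent category) is the appropriate summary for nominal data.
mode_color = Counter(responses).most_common(1)[0][0]
print(mode_color)  # Purple
```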
Collecting Data
- Population of Interest (Population): The entire collection of people sharing a common characteristic that researchers aim to understand.
- Can be broad (e.g., "all people") or narrow (e.g., "all freshmen psychology majors at Midwestern public universities").
- Sample: A small subset of data drawn from a larger set (the population).
- Inferences: Conclusions drawn about the population based on observations from the sample.
Populations and Samples
- Example 1: American Voting Attitudes
- Goal: Examine how Americans feel about voting fairness.
- Impracticality: Cannot query every American.
- Strategy: Query a relatively small sample and infer to the entire U.S. population.
- Crucial Point: Sample must not over-represent one kind of citizen.
- Sampling Bias: Occurs if a sample is not representative (e.g., only Floridians, only Republicans). Conclusions apply only to the biased sample, not generalizable to the population.
- Example 2: Math Classes Taken by Graduating Seniors
- Population: Graduating seniors at American colleges/universities.
- Strategy: Sample colleges/universities, then sample students from those institutions.
- Risk: Bias if selection favors math majors or technical institutions.
- Example 3: Substitute Teacher's Test Score Assessment
- Population: All students in the class.
- Sample: 10 students in the front row.
- Problem: Front-row students tend to be more engaged and perform better, leading to a biased sample not representative of the whole class.
- Example 4: Coach's Cartwheel Assessment
- Population: All freshmen at the university.
- Sample: 8 volunteers.
- Problem: Volunteers are likely better at cartwheels, biasing the sample. Gender of volunteers also a potential confound.
Simple Random Sampling
- Definition: Every member of the population has an equal chance of being selected.
- Independence: Selection of one member does not affect the probability of selecting any other member.
- Mechanism: Chooses samples by pure chance.
- Example 5: Twin Study from National Twin Registry
- Population: All twins recorded in the National Twin Registry (generalizations limited to this list).
- Sampling Procedure: Selecting only last names starting with 'Z', then only every other name starting with 'B'.
- Problems:
- Not every individual has an equal chance (e.g., 'Z' names selected, other names not).
- Risks over-representing ethnic groups or creating other biases (e.g., patient people with 'Z' names).
- "Every-other-one" for 'B' names violates independence of selection.
Sample Size Matters
- Random Sample Definition: Based on the procedure (equal chance, independence), not the results.
- Representativeness: Small random samples are not guaranteed to be representative.
- Example: A random sample of 20 subjects from a 50%-male, 50%-female population has a nontrivial probability (0.06) of being 70% or more female.
- Inferential Statistics and Sample Size: Larger sample sizes increase the likelihood of representativeness and are accounted for in inferential statistics.
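The 0.06 figure can be checked directly: the number of females in such a sample follows a Binomial(20, 0.5) distribution, so the probability of 14 or more females is a simple sum. A sketch in Python:

```python
from math import comb

# Probability that a simple random sample of n=20 from a 50% male / 50% female
# population is 70% or more female, i.e. contains 14 or more females.
# The female count follows a Binomial(n=20, p=0.5) distribution.
n, p = 20, 0.5
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(14, n + 1))
print(round(prob, 3))  # 0.058, about 0.06 as stated in the text
```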
More Complex Sampling
- Challenge of Simple Random Sampling: Often not feasible in real-world scenarios (e.g., surveying all Texans about Olympic host preference).
- Stratified Sampling:
- Purpose: Ensures a more representative sample when the population has distinct subgroups (strata).
- Procedure: Identify members of each subgroup, then randomly sample from each subgroup in proportion to its size in the population.
- Example: Studying views on capital punishment at a university with 70% day students (average age 19) and 30% night students (average age 39).
- To interview 200 students: 140 day students, 60 night students.
- Ensures proportions in sample match population, making inferences more secure.
- Convenience Sampling:
- Definition: Easily accessible individuals are chosen.
- Justification: Acceptable for exploratory research in completely unstudied areas, providing quick initial data before investing in more rigorous research with representative samples.
- Risk: Often used due to convenience alone, without intent for future improvement, leading to non-generalizable results.
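The proportional allocation in the stratified-sampling example above (140 day students and 60 night students out of 200) follows mechanically from the strata shares; a minimal sketch:

```python
# Proportional allocation for stratified sampling, using the university
# example from the text: 70% day students, 30% night students, sample of 200.
strata = {"day": 0.70, "night": 0.30}
total_n = 200

# Each stratum receives a share of the sample equal to its share of the population.
allocation = {name: round(total_n * share) for name, share in strata.items()}
print(allocation)  # {'day': 140, 'night': 60}
```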
Type of Research Designs
- Determinants of Design Choice: Research question and logistical constraints.
- Three Types Discussed: Experimental, quasi-experimental, and non-experimental.
Experimental Designs
- Purpose: To determine if a change in one variable causes a change in another.
- Key Characteristics:
- Random Assignment to Treatment Conditions: Participants are randomly assigned to different groups.
- Manipulation of the Independent Variable: The experimenter actively changes the IV.
- Example: New Flu Drug
- Design: 40 flu patients randomly assigned to Group A (new drug) or Group B (placebo).
- IV Manipulation: Researchers administer either the new drug or placebo.
- Random Assignment: Ensures no systematic differences between groups other than the drug.
- DV: Flu symptoms after 1 week.
- Causality: If a difference in symptoms is found, it can be confidently attributed to the new drug.
- Random Sampling vs. Random Assignment:
- Random Sampling: Makes research representative of the population.
- Random Assignment: Allows researchers to infer causality by eliminating systematic differences between groups.
- Both together minimize sampling bias and allow for strong causal inferences.
Quasi-Experimental Designs
- Definition: Attempts to approximate conditions of a true experiment when random assignment is not possible.
- Key Characteristic: Manipulates the independent variable but without random assignment.
- Reasons for Use:
- Ethical concerns (e.g., denying potential treatment).
- Impossibility of random assignment.
- Example: New Teaching Method
- Design: Professor uses a new method in one section and a traditional method in another section of the same course.
- IV Manipulation: Teaching method is manipulated.
- Lack of Random Assignment: Students enroll in courses, so they cannot be randomly assigned to sections.
- Causality Limitation: Cannot establish causality because systematic differences between classes (other than teaching style) cannot be ruled out.
Non-Experimental Designs (Correlational Research)
- Definition: Involves observing variables as they occur naturally and recording observations as data.
- Key Characteristic: No manipulation of independent variables; no random assignment.
- Example: Conscientiousness and Job Performance
- Design: Data scientist measures conscientiousness and job performance ratings from volunteer employees.
- Variables: Cannot manipulate conscientiousness.
- Purpose: To find a relation between variables for prediction.
- Causality Limitation: Cannot establish causality.
- Usefulness: Can be valuable for prediction even without causality.
- If conscientiousness consistently predicts job performance, it can still be used for screening applicants.
- Benefit: Reflects reality as it actually exists, as researchers do not alter anything.
Types of Statistical Analyses
Two main types of statistics are used to interpret data: descriptive and inferential.
Descriptive Statistics
- Definition: Numbers used to summarize and describe data.
- Data: Information collected from experiments, surveys, historical records, etc.
- "Data" is plural; a single piece of information is a "datum."
- Purpose: Provides a full picture of the data at hand; does not involve generalizing beyond the current dataset.
- Examples:
- Percentage of birth certificates issued in New York State.
- Average age of mothers.
- Table 1: Average Salaries for Various Occupations (1999, U.S.)
- Pediatricians: 112,760
- Dentists: 106,130
- Podiatrists: 100,090
- Physicists: 76,140
- Architects: 53,410
- School, Clinical, Counseling Psychologists: 49,720
- Flight Attendants: 47,910
- Elementary School Teachers: 39,560
- Police Officers: 38,710
- Floral Designers: 18,980
- Insight: Highlights societal values (e.g., lower pay for educators/protectors compared to some medical professions).
- Table 2: Unmarried Men per 100 Unmarried Women in U.S. Metro Areas (1990)
- Cities with mostly men: Jacksonville, NC (224); Killeen-Temple, TX (123); Fayetteville, NC (118); Brazoria, TX (117); Lawton, OK (116); State College, PA (113); Clarksville-Hopkinsville, TN-KY (113); Anchorage, Alaska (112); Salinas-Seaside-Monterey, CA (112); Bryan-College Station, TX (111).
- Cities with mostly women: Sarasota, FL (66); Bradenton, FL (68); Altoona, PA (69); Springfield, IL (70); Jacksonville, TN (70); Gadsden, AL (70); Wheeling, WV (70); Charleston, WV (71); St. Joseph, MO (71); Lynchburg, VA (71).
- Note: "Unmarried" includes never-married, widowed, and divorced persons aged 15 or older.
- Speculation: Higher female population in places like Sarasota, FL, might be due to elderly individuals moving there and women generally outliving men. This is speculation without further data.
- Table 3: Winning Olympic Marathon Times (Men & Women)
- Men (1896-2004) & Women (1984-2004) Data: Records winners, country, and time.
- Insight into Speed Improvement: Comparing mean winning times for men (first 13 races vs. second 13 races) shows a significant decrease (e.g., 2:44:22 vs. 2:13:18, a difference of over half an hour).
- Observation: Female winner Naoko Takahashi (2000) would have beaten all male winners up to 1956. This raises questions about closing gender gaps and future limits.
- Population Parameter vs. Sample Statistic:
- Parameter: The true value of a descriptive measure in the population; can never be known for sure.
- Example: Bureau of Labor Statistics' average hourly wage of chefs ($23.87) as a hypothetical parameter.
- Statistic: An estimate of the true population parameter, computed from sample data.
- The term "statistic" here refers to the specific calculated number (e.g., the average), not the field of statistics.
- Good Estimator: A sample statistic is a good estimator if the sample is representative of the population.
- Sampling Error: The discrepancy between a population parameter and its sample statistic, caused by random chance in sampling.
- Understanding sampling error is crucial for statistics.
- Every observed value or behavior has sampling error; the challenge is to distinguish unusual observations from true differences.
Inferential Statistics
- Purpose: To understand how data behave and generalize from a sample to a broader population.
- Questions Addressed:
- Which variables are related?
- Under what conditions does a variable's value change?
- Are two groups different, and are individuals within groups different/similar?
- Process: Formal analyses and tests run to describe data.
- Examples of Tests:
- t-statistic: To determine if people change over time (e.g., in an intervention).
- F-statistic: To predict future values of a variable based on current known values.
- Scope of Course: This course covers a subset (sample) of inferential statistics, but the principles learned are widely applicable to new tests.
Mathematical Notation
- Statistics and Math: Math is a tool used in statistics, but statistics is a broader field.
- Summation Notation (\Sigma): Convenient for expressing the sum of numbers.
- Example: Weights of 4 Grapes (Variable X)
- Grape 1: X_1 = 4.6
- Grape 2: X_2 = 5.1
- Grape 3: X_3 = 4.9
- Grape 4: X_4 = 4.4
- Sum of all values: \sum_{i=1}^{4} X_i = X_1 + X_2 + X_3 + X_4 = 4.6 + 5.1 + 4.9 + 4.4 = 19.0
- \Sigma: Greek letter indicating summation.
- i=1 at bottom: Summation starts with X_1.
- 4 at top: Summation ends with X_4.
- X_i: Variable to be summed as i goes from 1 to 4.
- Sum of first 3 scores: \sum_{i=1}^{3} X_i = X_1 + X_2 + X_3 = 4.6 + 5.1 + 4.9 = 14.6
- Abbreviated Notation: \sum X means sum all values of X when no values of i are shown.
- Squaring Numbers before Summation:
- \sum X^2 = X_1^2 + X_2^2 + X_3^2 + X_4^2
- = 4.6^2 + 5.1^2 + 4.9^2 + 4.4^2 = 21.16 + 26.01 + 24.01 + 19.36 = 90.54
- Important Distinction: \sum X^2 \neq (\sum X)^2
- \sum X^2: Square numbers first, then sum (e.g., 90.54).
- (\sum X)^2: Sum numbers first, then square the sum (e.g., (19)^2 = 361).
- Sum of Cross Products (XY):
- Calculated by multiplying corresponding X and Y values for each observation, then summing the products.
- Example Data:
- X=1, Y=3; XY=3
- X=2, Y=2; XY=4
- X=3, Y=7; XY=21
- \sum XY = 3 + 4 + 21 = 28
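The notation above maps directly onto code. A Python sketch using the grape weights and the small X/Y cross-product table from the text:

```python
# Grape-weight example: X holds the four weights from the text.
X = [4.6, 5.1, 4.9, 4.4]

# Separate small example from the text for the sum of cross products.
X_small = [1, 2, 3]
Y = [3, 2, 7]

sum_x = sum(X)                                  # ΣX = 19.0
sum_x_sq = sum(x**2 for x in X)                 # ΣX² = 90.54 (square first, then sum)
sq_sum_x = sum(X)**2                            # (ΣX)² = 361 (sum first, then square)
sum_xy = sum(x * y for x, y in zip(X_small, Y)) # ΣXY = 1·3 + 2·2 + 3·7 = 28

# Note that ΣX² (90.54) and (ΣX)² (361) are very different quantities.
```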
Chapter 2: Describing Data using Distributions and Graphs
Graphing Qualitative Variables
- Context: Displaying data that fall into a small number of categories without a pre-established ordering (e.g., previous computer ownership for iMac study).
- Contrast: Quantitative data (e.g., weight) has a natural ordering.
Frequency Tables
- Purpose: Basis for all graphical methods for qualitative data.
- Components:
- Category: The different qualitative attributes.
- Frequency: The count of responses in each category.
- Relative Frequency: The proportion of responses in each category.
- Calculated as Frequency / Total Number of Observations.
- Example: iMac study, "none" = 85/500 = 0.17
- Example: iMac Data Frequency Table
- Previous Ownership | Frequency | Relative Frequency
- ---|---|---
- None | 85 | 0.17
- Windows | 60 | 0.12
- Macintosh | 355 | 0.71
- Total | 500 | 1
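The relative-frequency column can be computed directly from the counts in the table above; a minimal Python sketch using the iMac data:

```python
# iMac purchaser counts from the frequency table in the text.
freq = {"None": 85, "Windows": 60, "Macintosh": 355}
total = sum(freq.values())  # 500

# Relative frequency = frequency / total number of observations.
rel_freq = {category: count / total for category, count in freq.items()}
print(rel_freq)  # {'None': 0.17, 'Windows': 0.12, 'Macintosh': 0.71}
```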
Pie Charts
- Representation: Each category is a slice of the pie.
- Area Proportionality: Area of slice is proportional to the percentage of responses in the category (Relative Frequency × 100).
- Example: iMac Purchases (Figure 1)
- Most purchasers (71%) were Macintosh owners.
- Apple was encouraged by 12% former Windows users and 17% first-time buyers.
- Effectiveness: Effective for displaying relative frequencies of a small number of categories.
- Limitations:
- Not recommended for a large number of categories.
- Can be confusing when comparing outcomes of two different surveys.
- Edward Tufte's assertion: "The only worse design than a pie chart is several of them."
- Misleading with Small N: If based on a small number of observations, labeling slices with percentages can be misleading due to chance variations. Better to label with actual frequencies.
Bar Charts
- Representation: Bars represent frequencies of different categories.
- Axes: Frequencies on Y-axis, categories on X-axis.
- Example: iMac Purchases (Figure 2)
- Illustrates the same data as the pie chart.
- Key Distinction from Histograms: Bars do not touch in a bar chart (for qualitative data).
Comparing Distributions
- Usefulness: Bar charts are excellent for illustrating differences between two distributions.
- Example: Card Game Players on Yahoo (Sunday vs. Wednesday, Figure 3)
- Overall more players on Wednesday.
- Pinochle players: same on both days.
- Hearts players: twice as many on Wednesday.
- Horizontal Format: Useful when many categories exist, providing more room for labels.
Some Graphical Mistakes to Avoid
- Don't Get Fancy!
- 3-D Bar Charts (Figure 4): Usually less effective and more distracting than 2-D counterparts.
- Substituting Images for Bars (Figure 5): While heights may be accurate, areas can exaggerate size differences, leading to distortion.
- Lie Factor (Edward Tufte): Ratio of effect size in graph to effect size in data. Values outside 0.95-1.05 indicate unacceptable distortion.
- Example: Mac vs. Windows owners (6:1 ratio in actual data, 35:1 ratio by area in Figure 5).
- Non-Zero Baseline (Figure 6): Setting the bottom of the Y-axis to a value other than zero disproportionately distorts perceived differences.
- Example: iMac data with baseline of 50 makes Windows-switchers appear minuscule.
- Line Graphs for Qualitative Variables (Figure 7): Inappropriate, as they imply a numerical ordering that does not exist.
Summary
- Pie charts and bar charts are effective for qualitative data.
- Bar charts are better for more categories and for comparing multiple distributions.
- Always be careful to avoid misleading graphs.
Graphing Quantitative Variables
- Definition: Variables measured on a numeric scale (e.g., height, weight, test scores).
- Distinction: Different from categorical (qualitative) variables.
- Graph Types Covered: Stem and leaf displays, histograms, frequency polygons, box plots, bar charts, line graphs, dot plots, scatter plots (later chapter).
- Selection Criteria: Amount of data (small/moderate vs. large), purpose (depicting differences, showing relationships).
Stem and Leaf Displays
- Purpose: Graphical display useful for small to moderate datasets.
- Construction:
- Stems: Left portion (e.g., 10's digits), arranged as a column.
- Leaves: Right portion (e.g., 1's digits), extending from each stem.
- Each leaf represents its value added to (10 × its stem).
- Example: NFL Touchdown Passes (Table 2, Figure 7). Data: 37, 33, 33, 32, …, 6
- Stem 3: Leaves 2, 3, 3, 7 represent 32, 33, 33, 37.
- Stem 0: Leaves 6, 9 (representing 06, 09) are 6, 9.
- Interpretation: Clarifies the shape of distribution, making it easy to see data clustering (e.g., most teams had 10-29 passes).
- Splitting Stems (Figure 8):
- Can make the display more revealing if rows are too long.
- Example: Stem 3 split into two parts: one for 35-39, one for 30-34.
- Back-to-Back Stem and Leaf Display (Figure 9):
- Useful for comparing two distributions.
- Common column of stems in the middle, leaves for one dataset to the left, leaves for the other to the right.
- Example: Comparing NFL TD passes in 1998 vs. 2000 seasons.
- Reveals similarities but also that only in 1998 were >40 TD passes thrown.
- Handling Diverse Data:
- Three+ Digits or Decimals: Round to two-digit accuracy.
- Negative Values: Use negative stems (e.g., -0 for 0 to -9).
- Example: Weapons and Aggression Study (Table 3, Figure 10)
- Data range: 43.2 to -27.4 milliseconds (priming effect).
- Rounded values are plotted (e.g., 43.2 rounds to 43).
- Rows for "0" (for 0 to 9) and "-0" (for 0 to -9) are used.
- Limitations: Unwieldy for very large datasets (>200 observations).
- Judgment: Requires good judgment to decide if data can be suitably represented without losing important information.
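The stem-and-leaf construction described above can be sketched in a few lines of Python. The data here are hypothetical counts in the spirit of the touchdown-pass example, since the full table is not reproduced above:

```python
# Minimal stem-and-leaf display: stems are the tens digits, leaves the ones digits.
# Hypothetical data in the spirit of the NFL touchdown-pass example.
data = [37, 33, 33, 32, 29, 28, 28, 23, 22, 22, 21, 21, 21, 20,
        19, 19, 18, 18, 16, 14, 14, 12, 9, 6]

stems = {}
for value in sorted(data):
    # value // 10 is the stem; value % 10 is the leaf.
    stems.setdefault(value // 10, []).append(value % 10)

# Print highest stem first, as in the displays described in the text.
for stem in sorted(stems, reverse=True):
    print(f"{stem}|{''.join(str(leaf) for leaf in stems[stem])}")
# 3|2337
# 2|0111223889
# 1|24468899
# 0|69
```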
Histograms
- Purpose: Graphical method for displaying the shape of a distribution, especially useful for large numbers of observations.
- Construction Steps:
- Create a frequency table: Group scores into class intervals (e.g., 39.5-49.5).
- Class Intervals (Bins): Ranges of scores.
- Width of 10 is often a good start.
- Limits midway between two numbers (e.g., 49.5) ensure all scores fall within an interval.
- Count Class Frequencies: Number of scores in each interval.
- Draw Bars: Height of each bar corresponds to its class frequency.
- Example: Psychology Test Scores (Table 4, Figure 12)
- 642 students, scores 46-167.
- Histogram clearly shows most scores in the middle, fewer at extremes.
- Skew: Distribution is not symmetric (scores extend farther right), indicating a skewed distribution.
- Continuous Data: Histograms work for continuous data (e.g., response time).
- Whole numbers can be used as boundaries (e.g., 4000-4999 ms).
- Some programs label the middle of each interval.
- Relative Frequencies: Can be based on proportions instead of raw counts.
- Divide each class frequency by total observations.
- Y-axis scale changes to 0-1.
- Bin Width (Class Interval Width):
- Choice affects the histogram's shape.
- Experiment with different widths to best communicate the distribution's shape.
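The binning steps above (intervals of width 10 with limits midway between whole numbers) can be sketched in Python. The scores here are hypothetical, chosen only to illustrate the grouping:

```python
# Group scores into class intervals of width 10 with limits at .5 values,
# so every whole-number score falls unambiguously inside one interval.
# Hypothetical test scores for illustration.
scores = [46, 52, 55, 61, 63, 64, 70, 71, 72, 74, 78, 81, 83, 85, 90, 97, 102, 110]

low, width = 39.5, 10
bins = {}
for s in scores:
    # Index of the interval [low + k*width, low + (k+1)*width) containing s.
    k = int((s - low) // width)
    lower = low + k * width
    bins[(lower, lower + width)] = bins.get((lower, lower + width), 0) + 1

# Print the frequency table; bar heights in a histogram would equal these counts.
for (lo, hi), count in sorted(bins.items()):
    print(f"{lo}-{hi}: {count}")
```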
Frequency Polygons
- Purpose: Understanding distribution shapes, especially good for comparing data sets or displaying cumulative frequency distributions.
- Construction:
- Choose class intervals (same as histograms).
- Draw X-axis: Represents score values, tick marks at the middle of each interval.
- Draw Y-axis: Indicates frequency.
- Place Points: At the middle of each class interval, at the height of its frequency.
- Connect Points: Include one interval below the lowest and one above the highest value (with frequency 0) so the graph touches the X-axis.
- Example: Psychology Test Scores (Table 5, Figure 13)
- Clear view of distribution shape: most scores between 65 and 115 .
- Shows positive skew (good scores trail off more gradually).
- Cumulative Frequency Polygon (Figure 14):
- Y-value for each point is the cumulative count of scores in that interval and all lower intervals.
- Last interval's cumulative frequency equals total participants.
- Comparing Distributions (Figure 15 - cursor task data):
- Overlaying multiple frequency polygons clearly shows differences.
- Example: Moving cursor to small target took longer than to large target.
- Overlaid Cumulative Frequency Polygons (Figure 16): Similarly effective for comparison.
Box Plots
- Purpose: Identifying outliers and comparing distributions.
- Construction (using 25th, 50th, 75th percentiles/hinges):
- Draw Box: Extends from the 25th percentile (lower hinge) to the 75th percentile (upper hinge).
- Draw Median Line: The 50th percentile is drawn inside the box.
- Table 6: Women's times data example. 25^{th}\%=17, 50^{th}\%=19, 75^{th}\%=20 .
- Terminology (Table 7 - for women's times):
- Upper Hinge: 75th Percentile (20)
- Lower Hinge: 25th Percentile (17)
- H-Spread: Upper Hinge - Lower Hinge (3)
- Step: 1.5 \times H-Spread (4.5)
- Upper Inner Fence: Upper Hinge + 1 Step (24.5)
- Lower Inner Fence: Lower Hinge - 1 Step (12.5)
- Upper Outer Fence: Upper Hinge + 2 Steps (29)
- Lower Outer Fence: Lower Hinge - 2 Steps (8)
- Upper Adjacent: Largest value below Upper Inner Fence (24)
- Lower Adjacent: Smallest value above Lower Inner Fence (14)
- Outside Value: Beyond an Inner Fence but not beyond an Outer Fence (e.g., 29 for women)
- Far Out Value: Beyond an Outer Fence (None in example)
- Whiskers (Figure 18): Vertical lines from hinges to adjacent values, ending in horizontal strokes.
- Outliers (Figure 19): Outside values shown as small "o's"; far out values as asterisks (*).
- Mean (Figure 20): Often indicated by a plus sign (+).
- Interpretation (Figure 20, comparison of men's and women's times):
- Half of women's times: 17-20s.
- Half of men's times: 19-25.5s.
- Women generally faster, though one woman was slower than almost all men (an outlier).
- Shape Information (Figure 21 showing labels):
- Positive skew: Longer whisker in positive direction, mean > median.
- Benefits: Good at portraying extreme values and showing differences between distributions.
- Limitations: Hides many distribution details; use histograms/stem-and-leaf for more detail.
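The fence calculations from Table 7 follow mechanically from the two hinges, so they are easy to sketch in Python (values below match the women's-times example):

```python
# Box-plot fences computed from the 25th and 75th percentiles (the hinges).
def box_plot_fences(lower_hinge, upper_hinge):
    h_spread = upper_hinge - lower_hinge   # H-spread (IQR)
    step = 1.5 * h_spread                  # one "step"
    return {
        "H-spread": h_spread,
        "step": step,
        "upper inner fence": upper_hinge + step,
        "lower inner fence": lower_hinge - step,
        "upper outer fence": upper_hinge + 2 * step,
        "lower outer fence": lower_hinge - 2 * step,
    }

print(box_plot_fences(17, 20))  # women's times: hinges 17 and 20
```

Values beyond an inner fence are "outside values"; beyond an outer fence, "far out values."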
Bar Charts for Quantitative Variables
- Initial Use: Introduced for qualitative variable frequencies (Figure 22 - iMac buyers).
- New Use: Can present other quantitative information beyond frequencies.
- Example: Stock Index Percent Increases (Figure 23): Y-axis shows percent increase.
- S&P and Nasdaq had "negative increases" (decreased).
- Showing Change Over Time (Figure 24 - CPI percent increase): Effective for demonstrating fluctuations.
- Comparing Means (Figure 25 - cursor movement times): Can show mean differences.
- Recommendation: Box plots (Figure 26) are preferred for comparing means as they offer more information (spread, outliers) without taking more space.
- Graphical Mistakes: Previous discussion on qualitative bar charts (3D, images, non-zero baseline) applies here too.
Line Graphs
- Definition: A bar graph where bar tops are points joined by lines (bars suppressed).
- Example: CPI Percent Change (Figure 27 - bar chart vs. Figure 28 - line graph): Line graph emphasizes change over time.
- Appropriate Use: Only when both X- and Y-axes display ordered (quantitative) variables.
- Better than bar charts for comparing changes over time (Figure 29 - CPI components).
- Misleading Use: Inappropriate when X-axis contains qualitative variables (Figure 30 - card game data), falsely implying numerical order.
The Shape of a Distribution
- Primary Concern: Symmetrical vs. skewed.
- Symmetrical Distribution: Can be cut down the center to form two mirror images.
- In practice, rarely perfectly symmetrical; seek closeness.
- Normal Distribution (Figure 31): Common and pertinent.
- Single peak (center) and two tails extending equally.
- "Bell shape" or "bell curve."
- Bimodal Distribution (Figure 32): Symmetrical with multiple peaks (two).
- Not desirable; difficult to detect numerically without visualization.
- Skewed Distributions (Not Symmetrical):
- One tail is disproportionately longer than the other.
- Affects averages, making them inaccurate representations.
- Positive (Right) Skew (Figure 33): Longer tail extends to the right.
- Negative (Left) Skew (Figure 34): Longer tail extends to the left.
- Identifying Skew: Based on the longer tail, not the bulk of the data (body).
Chapter 3: Measures of Central Tendency and Spread
What is Central Tendency?
- Intuitive Understanding: Where an individual score compares to a distribution of scores (e.g., your quiz score vs. class scores).
- Example (Table 1 - Pop Quiz Scores):
- Dataset A (all 3s): Your score is at the exact center; you did as well as everyone.
- Dataset B (3,4,4,4,5): Your score of 3 is below the center; depressing.
- Dataset C (3,2,2,2,1): Your score of 3 is above the center; impressive.
- Formal Definition: The center of a distribution can be defined in multiple ways, all referred to as measures of central tendency.
- Balance Scale (Figure 2-5): The point at which the distribution would be perfectly balanced if scores had equal weight.
- For symmetric distributions, it's the geometric middle.
- For asymmetric distributions, the balance point shifts towards the longer tail.
- Smallest Absolute Deviation: The score for which the sum of the absolute differences from all other scores is minimized.
- Example: For 2,3,4,9,16, the sum of absolute deviations from 10 is 28, from 5 is 21 (closer to center). The true minimum for these data is achieved at score 4 (|2-4|+|3-4|+|4-4|+|9-4|+|16-4| = 2+1+0+5+12=20).
- Smallest Squared Deviation: The score for which the sum of the squared differences from all other scores is minimized.
- Example: For 2,3,4,9,16, sum of squared deviations from 10 is 186, from 5 is 151. The true minimum for these data is achieved at the mean (6.8), summing to 134.8.
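The two minimization claims above can be checked directly, using the example data 2, 3, 4, 9, 16 from the text:

```python
# Sum of absolute and squared deviations from a candidate "center" value,
# for the example data used in the text.
data = [2, 3, 4, 9, 16]

def abs_dev(center):
    return sum(abs(x - center) for x in data)

def sq_dev(center):
    return sum((x - center) ** 2 for x in data)

# Absolute deviations are minimized at the median (4):
print(abs_dev(10), abs_dev(5), abs_dev(4))            # 28 21 20
# Squared deviations are minimized at the mean (6.8):
print(sq_dev(10), sq_dev(5), round(sq_dev(6.8), 1))   # 186 151 134.8
```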
Measures of Central Tendency
Arithmetic Mean (Average)
- Most Common Measure: Simple sum of numbers divided by the count of numbers.
- Symbols:
- \mu ("mu"): Population mean.
- \bar{X} (X-bar): Sample mean.
- Formulas:
- Population: \mu = \sum X / N
- Sample: \bar{X} = \sum X / N
- Distinction: Refers to population parameter vs. sample statistic.
- Example: Mean of 1,2,3,6,8 is 20/5 = 4 .
- Example: NFL Touchdown Passes (Table 4): 31 teams, sum of passes is 634.
- \mu = 634 / 31 = 20.45
- General Terminology: Unless specified, "mean" refers to the arithmetic mean.
Median
- Definition: Midpoint of a distribution; same number of scores above and below it (50th percentile).
- Calculation:
- Odd Number of Scores: Middle number (e.g., median of 2,4,7 is 4).
- Even Number of Scores: Mean of the two middle numbers (e.g., median of 2,4,7,12 is (4+7)/2 = 5.5).
- Duplicate Values: Each instance of a value is counted (e.g., median of 1,3,4,4,5,8,9 is 4). Write out all numbers in order and cross off from ends to find it.
- Example: NFL Touchdown Passes (Table 4):
- 31 scores; the 16^{th} highest score is 20, so the median is 20.
Mode
- Definition: The most frequently occurring value in the dataset.
- Example: NFL Touchdown Passes (Table 4):
- Mode is 18 (occurs 4 times).
- Continuous Data:
- Often, no two scores are exactly the same.
- Mode is typically computed from a grouped frequency distribution.
- Example (Table 5): For response time data, interval 600-700 has highest frequency; mode is middle of interval (650).
- Application: Only measure of central tendency usable for qualitative or categorical data.
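All three measures can be computed with Python's `statistics` module; the scores below are a small hypothetical set, not the Table 4 data:

```python
# Mean, median, and mode of a small hypothetical score set.
import statistics

scores = [18, 18, 18, 18, 20, 21, 22, 25, 29, 33]
print(statistics.mean(scores))    # sum / N = 22.2
print(statistics.median(scores))  # even N: mean of two middle values = 20.5
print(statistics.mode(scores))    # most frequent value = 18
```

Note that `statistics.mode` also works on qualitative data (e.g., a list of category labels), matching the point above.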
More on the Mean and Median
- Relating Definitions to Measures:
- Mean: The point on which a distribution would balance.
- Median: The value that minimizes the sum of absolute deviations.
- Mean: The value that minimizes the sum of squared deviations.
- Example (Table 6): Deviations from Median (4) and Mean (6.8) for 2,3,4,9,16
- Sum of Absolute Deviations: Median (20) is smaller than Mean (22.8).
- Sum of Squared Deviations: Mean (134.8) is smaller than Median (174).
- Symmetric Distributions: Mean, median, and mode are the same.
- Example: 1,3,4,5,6,7,9 is symmetric around 5 (mean = median = 5). In a bell-shaped normal distribution, mean, median, and mode are identical.
Comparing Measures of Central Tendency
- Symmetric Distributions: Mean = Median = Mode (except in bimodal cases).
- Skewed Distributions: Measures differ due to the pull of the tail.
- Positive Skew (Figure 7: Psychology Test Scores)
- Mean (91.58) > Median (90.00) > Mode (84.00).
- Mode is at the peak.
- Median is pulled slightly into the longer (right) tail.
- Mean is pulled farthest into the longer (right) tail.
- Sensitivity: Mean is most sensitive to skew.
- Extreme Skew (Figure 8: Baseball Salaries)
- Pronounced positive skew.
- Mode (250,000), Median (500,000), Mean (1,183,000).
- No single measure is sufficient; means are misleading alone.
- Recommendation: Report both mean and median for skewed distributions, sometimes the mode too.
- Media typically reports median for skewed data (e.g., median salaries).
Spread and Variability
- Definition: How "spread out" a group of scores is (synonyms: variability, dispersion).
- Example (Figure 9.1 & 9.2 - Quiz Scores):
- Both quizzes have mean 7.0 .
- Quiz 1 scores are densely packed; Quiz 2 scores are more spread out.
- Measures of Variability: Range, variance, standard deviation.
Range
- Simplest Measure: Highest score minus the lowest score.
- Example 1: 10,2,5,6,7,3,4; Range = 10 - 2 = 8 .
- Example 2: 99,45,23,67,45,91,82,78,62,51; Range = 99 - 23 = 76 .
- Quiz 1 vs. Quiz 2 (Figure 9):
- Quiz 1: Range = 9 - 5 = 4 .
- Quiz 2: Range = 10 - 4 = 6 (larger spread).
- Problem: Extremely sensitive to outliers.
- Example: In 1,3,4,4,5,8,9, range is 8. Add 20, range becomes 19.
Interquartile Range (IQR)
- Definition: Range of the middle 50\% of scores.
- Calculation: IQR = 75^{th} \text{ percentile} - 25^{th} \text{ percentile}.
- Often called H-spread in box plots (upper hinge - lower hinge).
- Example:
- Quiz 1: 75^{th}\%=8, 25^{th}\%=6; IQR = 2 .
- Quiz 2: 75^{th}\%=9, 25^{th}\%=5; IQR = 4 (greater spread).
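The range examples above, and the IQR computed from the quiz percentiles given in the text, can be sketched as:

```python
# Range: highest score minus lowest score.
def value_range(xs):
    return max(xs) - min(xs)

print(value_range([10, 2, 5, 6, 7, 3, 4]))                    # 8
print(value_range([99, 45, 23, 67, 45, 91, 82, 78, 62, 51]))  # 76

# IQR: 75th percentile minus 25th percentile (values from the quiz example).
def iqr(p25, p75):
    return p75 - p25

print(iqr(6, 8))  # Quiz 1: 2
print(iqr(5, 9))  # Quiz 2: 4
```

Adding an outlier (e.g., appending 20 to 1,3,4,4,5,8,9) inflates the range from 8 to 19 while leaving the IQR untouched, illustrating the range's sensitivity.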
Sum of Squares (SS)
- Definition of Variability: How close scores are to the middle (mean) of the distribution.
- Calculation Steps (Table 9 - Quiz 1 scores):
- X Column: Raw data scores.
- \bar{X} Calculation: Sum X, divide by N.
- (X - \bar{X}) Column (Deviations): Each score minus the mean. Sum will be 0 .
- (X - \bar{X})^2 Column (Squared Deviations): Square each deviation.
- Sum of Squares (SS): Sum of the squared deviations, \sum (X - \bar{X})^2 .
- Importance: SS is a fundamental value appearing in many formulas.
- Example: Quiz 1 data (N=20, mean=7.0)
- SS = 30 .
Variance
- Definition: Average squared difference of scores from the mean.
- Population Parameter (\sigma^2): \sigma^2 = \sum (X - \mu)^2 / N
- Numerator is Sum of Squares. Divide by N.
- Example (assuming Quiz 1 data is population): \sigma^2 = 30 / 20 = 1.5
- Sample Statistic (s^2): s^2 = \sum (X - \bar{X})^2 / (N - 1)
- Key Difference: Divides by N – 1 (degrees of freedom, df) instead of N.
- df = N - 1
- Formula can be shorthand as SS/df .
- Example (assuming Quiz 1 data is sample): s^2 = 30 / (20-1) = 30 / 19 = 1.58
- Sample variance (1.58) is slightly larger than population variance (1.5) because of the smaller denominator; dividing by N - 1 makes s^2 an unbiased estimate of \sigma^2.
- Law of Large Numbers Link: As sample size N grows, N - 1 approaches N, bringing s^2 closer to \sigma^2; larger N also yields a more representative sample.
- Robustness: More robust to outliers than range.
- Importance: Plays a central role in inferential statistics.
Standard Deviation
- Definition: Simply the square root of the variance.
- Interpretability: Puts the measure of spread back into the original units of the variable.
- Reporting: Almost always reported with the mean in descriptive statistics.
- Population Parameter (\sigma): \sigma = \sqrt{\sum (X - \mu)^2 / N} or \sqrt{\sigma^2}.
- Sample Statistic (s): s = \sqrt{\sum (X - \bar{X})^2 / (N - 1)} or \sqrt{SS/df}.
- Example:
- Population SD (from Quiz 1): \sqrt{1.5} = 1.22
- Sample SD (from Quiz 1): \sqrt{1.58} = 1.26
- Usefulness (Normal Distributions): Especially useful when distribution is normal.
- 68\% of distribution within 1 standard deviation of the mean.
- \sim95\% of distribution within 2 standard deviations of the mean.
- Example: Normal distribution with mean 50, SD 10
- 68\% between 40 and 60 .
- 95\% between 30 and 70 .
- Visualizing SD (Figure 11, 12): Smaller standard deviation means a narrower distribution, regardless of mean.
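Once SS is known, the variance and standard deviation follow from SS and N alone. A sketch using the Quiz 1 summary values from the text (SS = 30, N = 20):

```python
# Variance and standard deviation from a known sum of squares (SS) and N.
import math

SS, N = 30, 20
pop_var = SS / N            # sigma^2 = SS / N
sample_var = SS / (N - 1)   # s^2 = SS / df, where df = N - 1
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

print(round(pop_var, 2), round(sample_var, 2))  # 1.5 1.58
print(round(pop_sd, 2), round(sample_sd, 2))    # 1.22 1.26
```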
Chapter 4: z-scores and the Standard Normal Distribution
Normal Distributions
- Importance: Most important and widely used distribution in statistics.
- Names: "Bell curve," "Gaussian curve" (after Carl Friedrich Gauss).
- Characteristics:
- Symmetric around their mean.
- Mean, median, and mode are equal.
- Area under the normal curve equals 1.0 .
- Denser in the center, less dense in the tails.
- Defined by two parameters: mean (\mu) and standard deviation (\sigma).
- 68\% of area within one standard deviation of the mean.
- Approximately 95\% of area within two standard deviations of the mean.
- Variations (Figure 1): Normal distributions can differ in mean and standard deviation.
- Example: Green (\mu=-3, \sigma=0.5), Red (\mu=0, \sigma=1), Black (\mu=2, \sigma=3).
- Consistency: All normal distributions share the same shape and proportion of scores within a given distance along the x-axis.
- Focus: Standard Normal Distribution (Unit Normal Distribution): Mean of 0, standard deviation of 1 (red distribution in Figure 1).
z-scores
- Definition: A standardized version of a raw score (x) that indicates its relative location within a distribution.
- Formulas:
- Population: z = (x - \mu) / \sigma
- Sample: z = (x - \bar{X}) / s
- Information Conveyed by z-score:
- Sign (positive/negative): Indicates which half of the distribution the score falls in.
- Positive: Above the mean, right-hand side/upper end.
- Negative: Below the mean, left-hand side/lower end.
- Magnitude (actual number): Tells how many standard deviations the score is away from the mean.
- Can range from negative to positive infinity, typically between -3 and 3.
- Interpretation Examples:
- z = -1.0: 1 standard deviation below the mean.
- z = 1.0: 1 standard deviation above the mean.
- z = -2.5: 2.5 standard deviations below the mean (more extreme).
- z = 0.25: 0.25 standard deviations above the mean (closer to center).
- Rough Cut-off for Extremity: |z| > 1.5 indicates an "extreme" score; |z| \le 1.5 indicates a "close" score.
- Converting Raw Scores to z-scores:
- Purpose: To understand a score's relative location.
- Example: Exam score of 68. Class mean (\mu) = 54, SD (\sigma) = 8 .
- z = (68 - 54) / 8 = 1.75
- Interpretation: Score is 1.75 standard deviations above average, indicating a good performance (above the 1.5 cutoff).
- Visual Representation (Figure 2): Raw score and z-score on their respective distributions maintain the same relative spot.
- Transformation does not change relative location.
- Comparing Scores from Different Distributions:
- Example: SAT Math (\mu=511, \sigma=120) vs. Critical Reading (\mu=495, \sigma=116). Score 501 on both.
- Math z_{math} = (501 - 511) / 120 = -0.08 (slightly below average).
- Critical Reading z_{CR} = (501 - 495) / 116 = 0.05 (slightly above average).
- Conclusion: Better performance on Critical Reading relative to others.
- Combining Information from Different Scales (Table 1 & 2):
- Problem: Averaging raw scores from different scales (e.g., Job Knowledge 0-100, Personality 1-5, Leadership 1-5) leads to one scale overpowering others, reducing variability in the average.
- Solution (Table 2): Standardizing to z-scores (\mu=0, \sigma=1) makes scales comparable.
- Averages of z-scores retain more variability, allowing for better assessment of differences between employees.
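The z-score formula is a one-liner; here it is applied to the exam and SAT examples from the text:

```python
# z = (x - mu) / sigma: how many standard deviations x lies from the mean.
def z_score(x, mu, sigma):
    return (x - mu) / sigma

print(round(z_score(68, 54, 8), 2))      # exam: 1.75 (well above average)
print(round(z_score(501, 511, 120), 2))  # SAT Math: -0.08 (slightly below)
print(round(z_score(501, 495, 116), 2))  # SAT Critical Reading: 0.05 (slightly above)
```

The same raw score (501) yields different z-scores on the two SAT sections, which is exactly why standardizing makes scales comparable.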
Setting the Scale of a Distribution
- Purpose: Convert z-scores back to a raw score scale with a desired mean and standard deviation (e.g., to avoid negative numbers or fit a specific range).
- Formulas (rearranged from z-score formula):
- Population: x = z\sigma + \mu
- Sample: x = zs + \bar{X}
- Example: Converting z-scores to IQ scores (\mu=100, \sigma=16)
- New intelligence measure: mean 40, SD 7 .
- Raw scores: 52, 43, 34
- Converted z-scores: 1.71, 0.43, -0.86
- Converted IQ scores:
- 1.71 \times 16 + 100 = 127.36 \approx 127
- 0.43 \times 16 + 100 = 106.88 \approx 107
- -0.86 \times 16 + 100 = 86.24 \approx 86
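The two-step rescaling (standardize on the old scale, then apply x = z\sigma + \mu on the new one) can be sketched as follows; z is rounded to two places before converting, as a worked example typically does:

```python
# Convert a raw score from one scale (old_mu, old_sd) to another (new_mu, new_sd)
# by passing through a z-score. z is rounded to 2 places to match hand calculation.
def rescale(x, old_mu, old_sd, new_mu, new_sd):
    z = round((x - old_mu) / old_sd, 2)  # standardize on the original scale
    return z * new_sd + new_mu           # x = z * sigma + mu on the new scale

# New measure (mean 40, SD 7) converted to the IQ scale (mean 100, SD 16):
for raw in (52, 43, 34):
    print(round(rescale(raw, 40, 7, 100, 16), 2))
```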
Z-scores and the Area under the Curve
- Relationship: z-scores precisely locate values within the standard normal distribution.
- Any normal distribution can be standardized (all scores converted to z-scores) into a standard normal distribution.
- Proportion of Area and Probability:
- Recall: 68\% of scores fall between \text{z}=-1.0 and \text{z}=1.0 .
- Interpretation: This 68\% represents the proportion of the area under the curve.
- Probability Link: Areas under the curve can be interpreted as probabilities.
- Properties of Areas:
- Total area under the curve is 1.0 .
- Areas can be added or subtracted to find proportions in other regions.
- Example: Area outside \text{z}=\pm 1.0 is 1.0 - 0.6800 = 0.3200 (32\%) (Figure 3).
- Each tail beyond \text{z}=\pm 1.0 has 0.3200/2 = 0.1600 (16\%) of the area.
- Standard Normal Distribution Table (z-table):
- Provides exact values for areas under the curve for specific z-scores.
- Format: Presents area in the body (left) for positive z-scores from 0.00-3.09 .
- Finding Area: Locate row for main z-value, then column for hundredths place.
- Example (Figure 4, Area for \text{z}=1.62): Find row 1.60, column 0.02 leads to 0.9474 .
- Interpretation: 94.74\% chance of randomly selecting a z-score less than (to the left of) 1.62 .
- Leveraging Symmetry:
- Area in body for positive z is equal to area in body for negative z (just on the opposite side).
- Area in tail for positive z: 1.00 - \text{area in body for } z .
- Example: Area More Extreme than \text{z}=\pm 1.96 (Figure 5)
- Body for \text{z}=1.96 is 0.9750 .
- Tail for \text{z}=1.96 is 1.00 - 0.9750 = 0.0250 .
- Tail for \text{z}=-1.96 is also 0.0250 .
- Total area in both tails: 0.0250 + 0.0250 = 0.0500 (5\%).
- This 5\% region is crucial for Unit 2.
- Area Between Two z-scores (Figure 6 - between \text{z}=0.50 and \text{z}=1.50):
- Find body of larger z (\text{z}=1.50 \to 0.9332).
- Subtract body of smaller z (\text{z}=0.50 \to 0.6915).
- Result: 0.9332 - 0.6915 = 0.2417 .
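The z-table body values used above can be reproduced without a table, since the standard normal CDF is expressible through `math.erf` in Python's standard library:

```python
# Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2.
# Phi(z) is the "area in the body to the left of z" from the z-table.
import math

def phi(z):
    return (1 + math.erf(z / math.sqrt(2))) / 2

print(round(phi(1.62), 4))              # body for z = 1.62: 0.9474
print(round(1 - phi(1.96), 4))          # one tail beyond z = 1.96: 0.025
print(round(2 * (1 - phi(1.96)), 4))    # both tails beyond z = +/-1.96: 0.05
print(round(phi(1.50) - phi(0.50), 4))  # area between z = 0.50 and 1.50: 0.2417
```

Symmetry comes for free: `phi(-z)` equals `1 - phi(z)`.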
Chapter 5: Probability
What is Probability?
Definition: How likely an event is to happen under specific conditions.
Conditional Nature: Probability changes as conditions change.
- Example: Probability of rain, given it's sunny (low) vs. cloudy/windy (high).
- "Given" states the conditions.
Event: A specific thing happening (e.g., rain, rolling a 1).
Precision: Use numbers instead of vague terms like "low" or "high."
Calculation Formula:
P(A) = \frac{\text{number of outcomes that count as A}}{\text{total number of possible outcomes}}
Example: Rolling a Six-Sided Die
- Probability of rolling a 1:
- 1 outcome (rolling a 1) satisfies criteria.
- 6 total possible outcomes.
- P(1) = 1/6 \approx 0.167
- Probability of rolling an even number:
- 3 outcomes (2, 4, 6) satisfy criteria.
- P(\text{Even Number}) = 3/6 = 1/2 = 0.50
Underlying Principles: These calculations assume fair (unbiased) conditions.
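The counting formula P(A) = favorable / total can be sketched directly for the fair-die examples above:

```python
# Probability as (outcomes counting as A) / (total possible outcomes),
# assuming a fair six-sided die so all outcomes are equally likely.
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]

def p(event):
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

print(p(lambda o: o == 1))      # 1/6
print(p(lambda o: o % 2 == 0))  # even number: 3/6 = 1/2
```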
Probability in Graphs and Distributions
Probability in Pie Charts
- Principle: The probability of randomly selecting an observation from data used to create a pie chart is equal to the proportion of that category's slice.
- All slices sum to 100\% or 1 .
- Example (Figure 1: Favorite Sports for 100 people):
- Baseball: 36\% slice \to P(\text{Baseball}) = 0.36 .
- Combining Categories: Probabilities of mutually exclusive events (separate slices) can be added.
- Example: Favorite sport usually played on grass (Baseball, Football, Soccer) \to 36\% + 25\% + 20\% = 81\% .
- Example: Favorite sport not called football (Baseball, Hockey) \to 36\% + 20\% = 56\% .
- Reason: Slice size corresponds to area, percentages convert to decimals, and total area is 1.0 .
Probability in Normal Distributions
- Key Connection: Normal distributions have an area under the curve equal to 1, divisible into sections by z-scores. Thus, areas under the normal curve can be interpreted as probabilities.
- Area between \text{z}=-1.00 and \text{z}=1.00 (Figure 2):
- Contains 68\% of the area.
- P(-1.00 \le z \le 1.00) = 0.68
- Body and Tail (Figure 3): A line drawn at a z-score divides the distribution into a smaller "tail" and a larger "body."
- Differentiating depends on relative size: larger piece is always body.
- Using the z-table (Standard Normal Distribution Table):
- Provides precise areas for bodies and tails of the standard normal distribution.
- Area in Body: For \text{z}=1.62 (Figure 4), body to the left is 0.9474. So, P(z < 1.62) = 0.9474 .
Probability: The Bigger Picture
- Core Idea: Probability of an event is the ratio of qualifying outcomes to total possible outcomes.
- Graphical Extension: Larger regions in graphs (pie charts, normal distributions) correspond to higher probabilities.
- Normal Distribution & Z-scores: Regions bounded by z-scores are directly linked to probabilities via the z-table.
- Extremes: Tails of the distribution represent smaller regions, so the probability of finding extreme results (far from the mean) is small.
- This concept is fundamental to inferential statistics and hypothesis testing.
Chapter 6: Sampling Distributions
People, Samples, and Populations
- Focus so far: Individual scores grouped into samples, drawn from typical populations.
- Individual Score Variability: Individual scores differ from the mean (quantified by z-scores, variance, SD). This is natural.
- Variability is Key: Measures of spread and the idea of variability are crucial in inferential statistics.
- Sampling Error (Revisited): Just as individual scores differ from their mean, an individual sample mean will differ from the true population mean (\mu).
- This is natural and expected.
- However, an extreme deviation might indicate something significant is occurring.
The Sampling Distribution of Sample Means
- Concept: A theoretical distribution formed by taking many samples (all of the same size) from a population, calculating the mean of each sample, and then arranging these sample means into a new distribution.
- Generic Term: This is a type of sampling distribution; can be formed from any statistic.
- Characteristics (like any distribution):
- Shape: Normal (bell-shaped, single peak, symmetric tails).
- Center: The true population mean (\mu). This is sometimes called \mu_{\bar{X}} (mean of the sample means).
- Spread: The standard error (\sigma_{\bar{X}}), which is the quantification of sampling error.
- Formula: \sigma_{\bar{X}} = \sigma / \sqrt{n} (where \sigma is population standard deviation, n is sample size).
- Important: This refers to samples of a specific size n. It's the sample size within each individual sample, not the number of samples used to form the theoretical distribution.
- Visualization (Figure 1): Illustrates these principles.
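A small simulation can illustrate the standard error formula: draw many samples of size n from a known population and check that the spread of the sample means is close to \sigma / \sqrt{n}. The population parameters below (\mu = 50, \sigma = 10, n = 25) are hypothetical:

```python
# Simulate a sampling distribution of the mean: 5000 samples of size n
# from a normal population, then compare the SD of the sample means
# (the observed standard error) to the formula sigma / sqrt(n).
import math
import random

random.seed(0)  # reproducible
mu, sigma, n = 50, 10, 25

means = [
    sum(random.gauss(mu, sigma) for _ in range(n)) / n
    for _ in range(5000)
]

observed_se = math.sqrt(sum((m - mu) ** 2 for m in means) / len(means))
predicted_se = sigma / math.sqrt(n)
print(round(observed_se, 2), round(predicted_se, 2))  # both near 2.0
```

Note that n is the size of each individual sample; the 5000 is just how many samples we draw to approximate the theoretical distribution.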
Two Important Axioms
- Theoretical Nature: Sampling distributions are theoretical and not directly observed.
- Justification: Relies on two mathematical facts.
Central Limit Theorem (CLT)
- Statement: For samples of a single size n, drawn from a population with a given mean \mu and variance \sigma^2, the sampling distribution of sample means will have:
- A mean (\mu_{\bar{X}} ) equal to the population mean (\mu).
- A variance (\sigma_{\bar{X}}^2) equal to the population variance divided by n: \sigma^2 / n.
- Standard Error: From the variance, standard error is \sigma_{\bar{X}} = \sqrt{\sigma^2 / n} = \sigma / \sqrt{n}.
- Normality Condition: The sampling distribution will approach normality as n increases.
- Practical Rule: A sampling distribution will be normal if either:
- The population from which samples are drawn is normally distributed.
- The sample size (n) is equal to or greater than 30 .
- Significance: The second criterion allows use of normal distribution methods even if the true