PSYC*1010 Week 2
Frequency Distributions
Types of Statistics
Descriptive Statistics
Characterize attributes of samples and populations.
Inferential Statistics
Generalize from a sample to an unknown population.
Importance of Frequency Distributions
Goal: Organize data to communicate the number of observations at each category on the measurement scale.
Data can be represented in either table or graph forms.
Types of Frequency Distributions
Three types:
Simple
Relative
Cumulative
Applications may vary based on the type of data:
Numerical data
Categorical data
Measurements in Frequency Distributions
Categorical Measurements:
Nominal, Ordinal
Quantitative Measurements:
Interval, Ratio
Simple Frequency Distribution
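Example:
| Quiz Score (X) | Frequency (f) |
|----------------|---------------|
| 10 | 1 |
| 9 | 2 |
| 8 | 3 |
| 7 | 4 |
| 6 | 5 |
| 5 | 5 |
| 4 | 4 |
| 3 | 3 |
| 2 | 2 |
| 1 | 1 |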
The scores should be listed in order, here from highest to lowest.
Include every score in the range, even if its frequency is 0.
The frequencies must sum to the sample size ($\Sigma f = n = 30$).
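The construction rules can be sketched in Python. This is a minimal illustration, not a prescribed method: the helper name `simple_frequency_table` and the `raw_scores` list are hypothetical and are not the quiz data from the table.

```python
from collections import Counter

def simple_frequency_table(scores, low=None, high=None):
    """Return (X, f) rows from highest to lowest score, keeping f = 0 rows."""
    counts = Counter(scores)
    low = min(scores) if low is None else low
    high = max(scores) if high is None else high
    # List every value in the range, even those nobody obtained.
    return [(x, counts[x]) for x in range(high, low - 1, -1)]

# Hypothetical raw quiz scores (illustration only).
raw_scores = [5, 4, 4, 2, 2, 2, 1]
table = simple_frequency_table(raw_scores)
for x, f in table:
    print(f"{x:>2} | {f}")

# The frequencies must add up to the sample size.
total_f = sum(f for _, f in table)
```

Note that the score 3 appears in the output with a frequency of 0, and `total_f` equals the number of raw scores.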
Learning Check
Question: How many people are in this sample?
Answer Options: a) 10 b) 15 c) 25 d) 32 e) Not enough info
Answer: a) 10, because the frequencies sum to 10 ($\Sigma f = 10$).
Question: Over 50% of individuals scored above 3. True/False
Relative Frequency Distribution
Each score is expressed as a proportion or percentage of the total sample.
New Column for Proportion (p):
Formula: $p = f / n$
All proportions should sum to 1.0.
New Column for Percent (%):
Formula: $\% = p \times 100 = (f / n) \times 100$
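Example Table
| Quiz Score (X) | Frequency (f) | Proportion (p) | Percent (%) |
|----------------|---------------|----------------|-------------|
| 10 | 1 | | |
| 9 | 2 | | |
| 8 | 3 | | |
| … | … | … | … |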
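As a rough sketch of the proportion and percent columns, the snippet below computes $p = f / n$ and the matching percentage for each score. The function name `relative_frequencies` and the small frequency table ($n = 10$) are assumptions made for illustration.

```python
def relative_frequencies(freqs):
    """Map each score to (f, p, percent), where p = f / n and percent = 100 * f / n."""
    n = sum(freqs.values())
    return {x: (f, f / n, 100 * f / n) for x, f in freqs.items()}

# Illustrative frequency table (score -> frequency), n = 10.
freqs = {5: 1, 4: 2, 3: 3, 2: 3, 1: 1}
rel = relative_frequencies(freqs)
for x, (f, p, pct) in rel.items():
    print(f"{x} | {f} | {p:.2f} | {pct:.0f}%")

# The proportions should sum to 1.0 (up to floating-point rounding).
total_p = sum(p for _, p, _ in rel.values())
```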
Cumulative Frequency Distribution
Shows total frequencies (or proportions or percentages) at each value and all lower-ranked values.
Starting from the bottom, frequencies are added upwards to find cumulative frequencies (cf).
Cumulative Percentages (c%):
Formula: $c\% = (cf / n) \times 100$
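Example Table
| Quiz Score (X) | Frequency (f) | Proportion (p) | Percent (%) | Cumulative Frequency (cf) | Cumulative Percent (c%) |
|----------------|---------------|----------------|-------------|---------------------------|-------------------------|
| 10 | 1 | … | … | 30 | … |
| 9 | 2 | … | … | 29 | … |
| … | … | … | … | … | … |
Because frequencies are accumulated from the lowest score upward, the highest score has cf = n (30 for the quiz data), and the next score down has cf = 30 − 1 = 29.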
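A minimal sketch of the bottom-up accumulation, using the quiz frequencies from the simple frequency distribution ($n = 30$). The helper name `cumulative_table` is hypothetical.

```python
from itertools import accumulate

def cumulative_table(freqs):
    """Return rows (X, f, cf, c%) with the highest score first.

    cf counts observations at or below each score, so it is
    accumulated from the lowest score upward.
    """
    n = sum(freqs.values())
    scores = sorted(freqs)  # lowest -> highest
    cfs = list(accumulate(freqs[x] for x in scores))
    rows = [(x, freqs[x], cf, 100 * cf / n) for x, cf in zip(scores, cfs)]
    return rows[::-1]  # display highest score first

# Quiz frequencies (score -> frequency), n = 30.
freqs = {10: 1, 9: 2, 8: 3, 7: 4, 6: 5, 5: 5, 4: 4, 3: 3, 2: 2, 1: 1}
rows = cumulative_table(freqs)
for x, f, cf, cpct in rows:
    print(f"{x:>2} | {f} | {cf:>2} | {cpct:.1f}%")
```

The top row always ends at cf = n and c% = 100, which is a quick check that the accumulation ran in the right direction.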
Grouped Frequency Distribution
When data spans a wide range, grouping into intervals can simplify presentation.
Rules:
Use a consistent interval width (e.g., 5, 10, 15).
The starting point of each class should be a multiple of the interval width.
Example for Weight Distribution
Grouping weights of 194 individuals:
Class intervals: 15 lbs wide
| Weight (X) | Frequency (f) |
|----------------|---------------|
| 255 − 269 | 1 |
| 240 − 254 | 4 |
| 225 − 239 | 2 |
| 210 − 224 | 6 |
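The grouping rules (equal widths, lower bounds that are multiples of the width) can be sketched as follows. The function name `grouped_frequency_table` and the `weights` list are hypothetical, and the sketch assumes whole-number measurements.

```python
from collections import Counter

def grouped_frequency_table(values, width):
    """Group integer values into equal-width intervals.

    Each interval's lower apparent limit is a multiple of `width`.
    Returns (lower, upper, f) rows with the highest interval first.
    """
    # Lower apparent limit of the interval containing each value.
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    rows = []
    for lower in range(highest, lowest - 1, -width):
        # Counter returns 0 for intervals with no observations.
        rows.append((lower, lower + width - 1, counts[lower]))
    return rows

# Hypothetical weights in lbs (illustration only).
weights = [212, 218, 231, 240, 247, 251, 256]
rows = grouped_frequency_table(weights, 15)
for lower, upper, f in rows:
    print(f"{lower} - {upper} | {f}")
```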
Categorical Frequency Distribution
Arranges categories meaningfully and records frequencies.
Types include simple frequency and relative frequency, plus cumulative frequency and percentile ranks when the categories are ordinal.
Example of Primary Languages Spoken at Home:
| Language | Frequency (f) | Percent (%) |
|---------------|----------------|--------------|
| English | 81 | 45 |
| French | 34 | 19 |
| Mandarin | 22 | 12 |
Visualizing Distributions
Choose visualization method based on scale of measurement and data type (discrete/continuous).
Common types include:
Histograms: X-values on the x-axis, with bars representing frequencies.
Frequency Polygons: Data points are connected by lines.
Common Distribution Shapes
Normal Distribution: Bell-shaped curve, one peak (unimodal), symmetrical.
Bimodal Distribution: Two distinct peaks.
Positively Skewed: Most scores are low, with a few high scores. Common in variables like clinical depression.
Negatively Skewed: Most scores are high, with a few low scores. Common in variables like life satisfaction.
Considerations for Data Visualizations
Use accurate scales and avoid misleading representations.
Key Tips:
Know your audience.
Identify the main message.
Avoid "chartjunk" (unnecessary visual features).
Make sure to label axes clearly and include legends where necessary.
Ensure that color choices are accessible and informative.
Misleading Visualizations
Watch for: Misleading scales on axes, contradictory presentations of information, and geometry misrepresentations.
Conclusion and Further Reading
Strong data visualizations are essential for clarity and effectiveness in communicating research findings.
Always critically evaluate visual data presentations for integrity and clarity.
Importance of Frequency Distributions
Definition: An organized tabulation showing the number of individuals ($f$) in each category on the measurement scale.
Goal: Organize raw scores into patterns (high/low, clustered/spread) to simplify communication.
Purpose: Allows researchers to see data "at a glance" (e.g., seeing which quiz scores were most common even though few students earned perfect scores).
Psychology Application: Organizing study scores (e.g., anxiety ratings) to spot trends before conducting inferential statistics.
Types of Frequency Distributions
Three main types:
Simple: Raw counts ($f$) per score ($X$).
Relative: Proportions or percentages ($p$, %).
Cumulative: Running totals ($cf$ or $c\%$).
Applications vary based on data type:
Numerical data: Ordinal, Interval, Ratio.
Categorical data: Nominal, Ordinal.
Measurements in Frequency Distributions
Categorical Measurements: Nominal, Ordinal
Quantitative Measurements: Interval, Ratio
Simple Frequency Distribution
Rules for Construction:
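X Column: List scores from highest to lowest (though software may use ascending order).
Include all values: All scores in the range must be listed, even if $f = 0$.
Total Frequencies: Sum of frequencies must equal the sample size ($\Sigma f = n$).
Calculations from Tables:
Sum of Scores ($\Sigma X$): Calculated as $\Sigma fX$ (multiply each score by its frequency, then add the products).
Example: If a score $X$ occurs $f$ times, then it contributes $fX$ to the sum.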
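Example (Quiz Scores, $n = 20$):
| Quiz Score (X) | Frequency (f) |
|----------------|---------------|
| 10 | 2 |
| 9 | 5 |
| 8 | 7 |
| 7 | 3 |
| 6 | 2 |
| 5 | 0 |
| 4 | 1 |
| Total | $\Sigma f = 20$ |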
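As a worked check on the quiz-score table, the sketch below computes the sample size as $\Sigma f$ and the sum of scores as $\Sigma fX$. The variable names are illustrative only.

```python
# Quiz-score table (score X -> frequency f).
freqs = {10: 2, 9: 5, 8: 7, 7: 3, 6: 2, 5: 0, 4: 1}

# Sample size: add up the frequencies.
n = sum(freqs.values())

# Sum of scores: multiply each score by its frequency, then add the products.
sum_x = sum(x * f for x, f in freqs.items())

print(f"n = {n}, sum of X = {sum_x}")
```

Adding only the X column (10 + 9 + ... + 4) would give the wrong total, because it ignores how many people obtained each score.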
Learning Check
Question: How many people are in this sample?
Answer: Sum the frequencies ($\Sigma f = n$).
Question: Raw scores: . What is for ?
Answer: .
Question: Over 50% of individuals scored above 3. True/False
Relative Frequency Distribution
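Expresses each score as a proportion ($p$) or percentage (%) of the total sample.
Psychology Use: Stating what percentage of the sample is clinically depressed.
Proportion ($p$):
Formula: $p = f / n$
All proportions must sum to 1.0.
Percentage (%):
Formula: $\% = p \times 100 = (f / n) \times 100$
Example Table ($n = 10$)
| Score (X) | Frequency (f) | Proportion (p) | Percent (%) |
|-----------|---------------|----------------|-------------|
| 5 | 1 | 0.10 | 10 |
| 4 | 2 | 0.20 | 20 |
| 3 | 3 | 0.30 | 30 |
| 2 | 3 | 0.30 | 30 |
| 1 | 1 | 0.10 | 10 |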
Cumulative Frequency Distribution
Cumulative Frequency ($cf$): Shows the number of observations at or below a specific score. Frequencies are added starting from the bottom (lowest score to highest score).
Cumulative Percentages ($c\%$): Also known as Percentile Rank.
Formula: $c\% = (cf / n) \times 100$
Percentiles: To find the $k$th percentile, scan the $c\%$ column upward from the lowest score for the first value $\geq k$.
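The percentile lookup can be sketched as a scan of cumulative percentages from the lowest score upward. The function name `percentile_score` is hypothetical, the frequencies are the five-score example table ($n = 10$), and the sketch works with whole scores only (it does not interpolate within real limits).

```python
def percentile_score(freqs, k):
    """Return the lowest score whose cumulative percentage is at least k."""
    n = sum(freqs.values())
    cf = 0
    for x in sorted(freqs):  # accumulate from the lowest score up
        cf += freqs[x]
        if 100 * cf / n >= k:
            return x
    return None  # k is above 100, so no score qualifies

# Example table (score -> frequency), n = 10.
freqs = {5: 1, 4: 2, 3: 3, 2: 3, 1: 1}
p50 = percentile_score(freqs, 50)
p90 = percentile_score(freqs, 90)
print(p50, p90)
```

Here the cumulative percentages run 10, 40, 70, 90, 100 from the lowest score up, so the first value at or above 50 belongs to a score of 3.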
Grouped Frequency Distribution
When to Use: When data spans a wide range, so that listing every individual score would produce too many rows.
General Guidelines:
Keep the number of intervals manageable.
Choose a simple interval width (e.g., 5, 10, 15).
The bottom score of each interval should be a multiple of the width.
Intervals must be equal in width with no gaps or overlaps.
Real Limits: For continuous variables, intervals have real limits (e.g., an apparent interval of 210 − 224 has real limits of 209.5 and 224.5).
Trade-off: Grouping results in information loss because exact scores are no longer visible.
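A minimal sketch of real limits, assuming scores are recorded to the nearest whole unit so that each real limit sits half a unit beyond the apparent limit. The helper name `real_limits` and its `unit` parameter are hypothetical.

```python
def real_limits(lower, upper, unit=1):
    """Return (lower real limit, upper real limit) for an apparent interval.

    Assumes measurements are rounded to the nearest `unit`.
    """
    half = unit / 2
    return (lower - half, upper + half)

# Two adjacent apparent intervals, each 15 units wide.
first = real_limits(210, 224)
second = real_limits(225, 239)
print(first, second)
```

The upper real limit of one interval equals the lower real limit of the next, which is why intervals for continuous variables leave no gaps.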
Categorical Frequency Distribution
Arranges non-numerical categories (Nominal/Ordinal) and records frequencies.
Order: Can be arbitrary for nominal data, but should be meaningful for ordinal data (e.g., Gold, Silver, Bronze).
Example (Primary Languages):
English: 81 (45%)
French: 34 (19%)
Mandarin: 22 (12%)
Visualizing Distributions
Histograms: Used for numerical/continuous data. Bars represent frequency/proportion and should touch (no gaps).
Bar Graphs: Used for categorical data. Gaps are placed between bars to show distinct categories.
Frequency Polygons: Data points are placed above midpoints and connected by lines; the ends are "anchored" to the x-axis at zero frequency.
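To illustrate how a frequency polygon is laid out, the sketch below computes the plotted points: one above each interval midpoint, plus a zero-frequency anchor one interval width beyond each end. The function name `polygon_points` is hypothetical, and the intervals are the four weight classes shown in the grouped example; no plotting library is assumed.

```python
def polygon_points(intervals):
    """Return (x, f) points for a frequency polygon.

    `intervals` is a list of (lower, upper, f) tuples of equal width,
    sorted from the lowest interval to the highest.
    """
    width = intervals[0][1] - intervals[0][0] + 1
    mids = [((lo + hi) / 2, f) for lo, hi, f in intervals]
    # Anchor both ends to the x-axis at zero frequency.
    left_anchor = (mids[0][0] - width, 0)
    right_anchor = (mids[-1][0] + width, 0)
    return [left_anchor] + mids + [right_anchor]

intervals = [(210, 224, 6), (225, 239, 2), (240, 254, 4), (255, 269, 1)]
points = polygon_points(intervals)
for x, f in points:
    print(x, f)
```

Connecting these points in order with straight lines gives the polygon; the same midpoints would be the bar centers in a histogram.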
Common Distribution Shapes and Psychology Examples
Normal Distribution: Bell-shaped, unimodal, and symmetrical.
Example: IQ scores.
Bimodal Distribution: Two distinct peaks.
Example: Heights of a group containing both men and women.
Positively Skewed: Tail points to the right (few high scores, many low scores).
Example: Clinical depression scores in a general population.
Negatively Skewed: Tail points to the left (few low scores, many high scores).
Example: Life satisfaction scores in stable environments.
Data Visualization Best Practices
10 Rules for Effectiveness:
Know your audience.
Identify the main message.
Avoid "chartjunk" (unnecessary decorative features).
Optimize the data-ink ratio (focus on the data).
Label axes clearly and include legends.
Ensure color choices are accessible (e.g., colorblind-friendly).
Use error bars where appropriate.
Choose the right graph type (Histogram for distributions, Scatter for patterns, Line for trends).
Message should take priority over beauty.
Critically evaluate for integrity.
Misleading Visualizations
Watch for:
Misleading Scales: Starting the y-axis at a value other than zero to exaggerate differences.
Dual Axes: Can imply correlations that don't exist.
Geometry Misrepresentations: Using 3D effects or distorted areas (e.g., in pie charts) that make segments look larger or smaller than they are.