PSYC*1010 Week 2

Frequency Distributions

Types of Statistics

Descriptive Statistics
- Characterize attributes of samples and populations.
Inferential Statistics
- Generalize from a sample to an unknown population.

Importance of Frequency Distributions

Goal: Organize data to show the number of observations in each category of the measurement scale.
Data can be presented as either a table or a graph.

Types of Frequency Distributions

Three types:
1. Simple
2. Relative
3. Cumulative
Applications may vary based on the type of data:
- Numerical data
- Categorical data

Measurements in Frequency Distributions

Categorical Measurements:
- Nominal, Ordinal
Quantitative Measurements:
- Interval, Ratio

Simple Frequency Distribution

Example:

Quiz Score (X)	Frequency (f)
10	1
9	2
8	3
7	4
6	5
5	5
4	4
3	3
2	2
1	1

The scores should be arranged in descending order (highest score at the top).
Include every score in the range, even if its frequency is 0.
The frequencies must sum to the sample size ($\sum f = N = 30$).
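
As a rough illustration of how such a table is built, the sketch below tallies a small set of raw scores into a simple frequency distribution, listed from highest to lowest as in the example table. The raw scores and the helper name `simple_frequency` are hypothetical (they are not from the lecture), and the sketch assumes whole-number scores.

```python
from collections import Counter


def simple_frequency(scores):
    """Return (X, f) rows from the highest to the lowest score, keeping f = 0 rows."""
    counts = Counter(scores)
    # Walk every whole-number score in the observed range so empty scores still appear.
    return [(x, counts[x]) for x in range(max(scores), min(scores) - 1, -1)]


# Hypothetical raw quiz scores (not the lecture's data set).
raw_scores = [5, 4, 4, 2, 2, 2, 1]
table = simple_frequency(raw_scores)

for x, f in table:
    print(f"{x}\t{f}")

# The frequencies must add up to the sample size.
assert sum(f for _, f in table) == len(raw_scores)
```

Note that the score 3 appears in the output with a frequency of 0, because no one in this made-up sample earned it.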

Learning Check

Question: How many people are in this sample?
- Answer options: a) 10 b) 15 c) 25 d) 32 e) Not enough info
- Answer: a) 10, because the frequencies total 10 ($\sum f = 10$).
Question: Over 50% of individuals scored above 3. True or false?

Relative Frequency Distribution

Each score is expressed as a proportion or percentage of the total sample.
New Column for Proportion (p):
- Formula: $p = \frac{f}{N}$
- All proportions should sum to 1.0.
New Column for Percent (%):
- Formula: $\% = p \times 100$

Example Table

Quiz Score (X)	Frequency (f)	Proportion (p)	Percent (%)
10	1	$\frac{1}{30} \approx 0.03$	$0.03 \times 100 = 3$
9	2	$\frac{2}{30} \approx 0.07$	$0.07 \times 100 = 7$
8	3	$\frac{3}{30} = 0.10$	$0.10 \times 100 = 10$
…	…	…	…
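
To make the two formulas concrete, the sketch below recomputes p and % for the quiz-score frequencies from the simple frequency example (N = 30). The function name `relative_frequency` is only illustrative. Percentages are computed here from the unrounded proportions, so they may differ slightly from values obtained by rounding p first.

```python
# Quiz-score frequencies from the simple frequency example (X -> f).
frequencies = {10: 1, 9: 2, 8: 3, 7: 4, 6: 5, 5: 5, 4: 4, 3: 3, 2: 2, 1: 1}


def relative_frequency(freqs):
    """Map each score to (proportion, percent) using p = f / N and % = p * 100."""
    total = sum(freqs.values())
    return {x: (f / total, f / total * 100) for x, f in freqs.items()}


n = sum(frequencies.values())
rel = relative_frequency(frequencies)

print(f"N = {n}")
for x, (p, pct) in rel.items():
    print(f"{x}\t{frequencies[x]}\t{p:.2f}\t{pct:.0f}")

# All proportions should sum to 1.0 (allowing for floating-point rounding).
assert abs(sum(p for p, _ in rel.values()) - 1.0) < 1e-9
```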

Cumulative Frequency Distribution

Shows the total frequency (or proportion or percentage) of observations at each value and all lower-ranked values.
Starting from the lowest score at the bottom of the table, frequencies are added upward to give the cumulative frequency (cf) for each row.
Cumulative Percentages (c%):
- Formula: $c\% = \left(\frac{cf}{N}\right) \times 100$

Example Table

Quiz Score (X)	Frequency (f)	Proportion (p)	Percent (%)	Cumulative Frequency (cf)	Cumulative Percent (c%)
10	1	…	…	30	…
9	2	…	…	29	…
…	…	…	…	…	…
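
The bottom-up accumulation can be sketched as follows, again using the quiz-score frequencies from the simple frequency example (N = 30). This is only one way to do it; the variable names are illustrative.

```python
from itertools import accumulate

# Quiz-score frequencies from the simple frequency example (X -> f).
frequencies = {10: 1, 9: 2, 8: 3, 7: 4, 6: 5, 5: 5, 4: 4, 3: 3, 2: 2, 1: 1}

# Accumulate from the lowest score upward, as the definition requires.
scores_low_to_high = sorted(frequencies)
cf_values = list(accumulate(frequencies[x] for x in scores_low_to_high))
n = cf_values[-1]  # the top row's cf equals N

# Map each score to (cf, c%) using c% = (cf / N) * 100.
cumulative = {x: (cf, cf / n * 100) for x, cf in zip(scores_low_to_high, cf_values)}

# Print in the usual table order, highest score first.
for x in sorted(cumulative, reverse=True):
    cf, c_pct = cumulative[x]
    print(f"{x}\t{frequencies[x]}\t{cf}\t{c_pct:.0f}")
```

Because cf counts everyone at or below a score, the highest score always has cf = N and c% = 100.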

Grouped Frequency Distribution

- When data spans a wide range, grouping into intervals can simplify presentation.
- Rules:
  - Use a consistent interval width (e.g., 5, 10, 15).
  - The starting point of each class should be a multiple of the interval width.

Example for Weight Distribution

- Grouping weights of 194 individuals:
  - Class intervals: 15 lbs wide

| Weight (X) | Frequency (f) |
|------------|---------------|
| 255-269 | 1 |
| 240-254 | 4 |
| 225-239 | 2 |
| 210-224 | 6 |
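
A minimal sketch of the grouping rules, assuming whole-number measurements: each value is assigned to the interval whose lower limit is the nearest multiple of the width at or below it. The weights and the helper name `grouped_frequency` are hypothetical; they are not the 194 observations from the lecture.

```python
from collections import Counter


def grouped_frequency(values, width):
    """Return (lower, upper, f) rows for equal-width intervals, highest interval first."""
    # Floor each value to a multiple of the width to get its interval's lower limit.
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    rows = []
    for lower in range(highest, lowest - 1, -width):
        # Empty intervals inside the range are kept with f = 0.
        rows.append((lower, lower + width - 1, counts[lower]))
    return rows


# Hypothetical weights in lbs (illustration only).
weights = [212, 218, 226, 241, 244, 250, 256]
rows = grouped_frequency(weights, 15)

for lower, upper, f in rows:
    print(f"{lower}-{upper}\t{f}")
```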

Categorical Frequency Distribution

- Arranges categories meaningfully and records frequencies.
- Types include simple frequency and relative frequency, plus cumulative frequency and percentile ranks if the categories are ordinal.
- Example of Primary Languages Spoken at Home:

| Language | Frequency (f) | Percent (%) |
|----------|---------------|-------------|
| English | 81 | 45 |
| French | 34 | 19 |
| Mandarin | 22 | 12 |

Visualizing Distributions

- Choose the visualization method based on the scale of measurement and the data type (discrete or continuous).
- Common types include:
  - Histograms: X-values on the x-axis, with bars representing frequencies.
  - Frequency Polygons: Data points are connected by lines.

Common Distribution Shapes

- Normal Distribution: Bell-shaped curve, one peak (unimodal), symmetrical.
- Bimodal Distribution: Two distinct peaks.
- Positively Skewed: Most scores are low, with a few high scores. Common in variables such as clinical depression scores.
- Negatively Skewed: Most scores are high, with a few low scores. Common in variables such as life satisfaction.

Considerations for Data Visualizations

- Use accurate scales and avoid misleading representations.
- Key Tips:
  1. Know your audience.
  2. Identify the main message.
  3. Avoid "chartjunk" (unnecessary visual features).
  4. Label axes clearly and include legends where necessary.
  5. Ensure that color choices are accessible and informative.

Misleading Visualizations

- Watch for: Misleading scales on axes, contradictory presentations of information, and geometry misrepresentations.

Conclusion and Further Reading

- Strong data visualizations are essential for clarity and effectiveness in communicating research findings.
- Always critically evaluate visual data presentations for integrity and clarity.

Frequency Distributions

Importance of Frequency Distributions

Definition: An organized tabulation showing the number of individuals ( $f$ ) in each category on the measurement scale.
Goal: Organize raw scores into patterns (high/low, clustered/spread) to simplify communication.
Purpose: Allows researchers to see data "at a glance" (e.g., identifying that most students scored $8$ or $9$ on a quiz despite few perfect scores).
Psychology Application: Organizing study scores (e.g., anxiety ratings) to spot trends before conducting inferential statistics.

Types of Frequency Distributions

Three main types:
1. Simple: Raw counts ( $f$ ) per score ( $X$ ).
2. Relative: Proportions or percentages ( $p = \frac{f}{N}$ , $\% = p \times 100$ ).
3. Cumulative: Running totals ( $cf$ or $c\%$ ).
Applications vary based on data type:
- Numerical data: Interval, Ratio.
- Categorical data: Nominal, Ordinal.

Measurements in Frequency Distributions

Categorical Measurements: Nominal, Ordinal
Quantitative Measurements: Interval, Ratio

Simple Frequency Distribution

Rules for Construction:
- $X$ Column: Highest to lowest (though software may use ascending).
- Include all values: All scores in the range must be listed, even if $f = 0$ .
- Total Frequencies: Sum of frequencies must equal the sample size ( $\sum f = N$ ).
Calculations from Tables:
- Sum of Scores ( $\sum X$ ): Calculated as $\sum (X \times f)$ .
- Example: If $X=5, f=1$ and $X=4, f=2$ , then $\sum X = (5 \times 1) + (4 \times 2) = 13$ .
Example (Quiz Scores, $N=20$ ):

$X$	$f$
10	2
9	5
8	7
7	3
6	2
5	0
4	1
Total	$N=20$
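
As a quick check on the $\sum (X \times f)$ rule, this sketch applies it to the quiz table above ($N=20$) and to the two-score example. The variable names are illustrative only.

```python
# Quiz-score table from the notes (X -> f), N = 20.
quiz = {10: 2, 9: 5, 8: 7, 7: 3, 6: 2, 5: 0, 4: 1}

# Sample size: sum of the frequencies.
n = sum(quiz.values())

# Sum of scores: each score counts once per person who earned it.
sum_x = sum(x * f for x, f in quiz.items())

print(f"N = {n}, sum of X = {sum_x}")

# The two-score example: X=5 with f=1 and X=4 with f=2.
small = {5: 1, 4: 2}
small_sum_x = sum(x * f for x, f in small.items())
print(f"small example: sum of X = {small_sum_x}")
```

Simply adding the numbers in the $X$ column would give the wrong total, because it ignores how many people earned each score.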

Learning Check

Question: How many people are in this sample?
- Answer: Sum the frequencies ( $\sum f$ ).
Question: Raw scores: $10, 10, 9, 9, 9, 9, 9, 8, 8$ . What is $f$ for $X=9$ ?
- Answer: $5$ .
Question: Over $50\%$ of individuals scored above $3$ . True/False

Relative Frequency Distribution

Expresses each score as a proportion ( $p$ ) or percentage ( $\%$ ) of the total sample.
Psychology Use: Stating " $30\%$ of the sample is clinically depressed."
Proportion ( $p$ ):
- Formula: $p = \frac{f}{N}$
- All proportions must sum to $1.0$ .
Percentage ( $\%$ ):
- Formula: $\% = p \times 100$

Example Table ( $N=10$ )

$X$	$f$	$p$	$\%$
5	1	0.10	10
4	2	0.20	20
3	3	0.30	30
2	3	0.30	30
1	1	0.10	10

Cumulative Frequency Distribution

Cumulative Frequency ( $cf$ ): Shows the number of observations at or below a specific score. Frequencies are added starting from the bottom ( $f_{low}$ to $f_{high}$ ).
Cumulative Percentages ( $c\%$ ): Also known as Percentile Rank.
- Formula: $c\% = \left(\frac{cf}{N}\right) \times 100$
Percentiles: To find the $95$th percentile, read up the $c\%$ column from the lowest score and take the score at the first value $\geq 95$ .
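
The percentile lookup can be sketched as a scan from the lowest score upward. The example reuses the $N=10$ table from the Relative Frequency section; the function name `score_at_percentile` is hypothetical, and the comparison is done with whole numbers (cf × 100 against target × N) to avoid floating-point rounding.

```python
def score_at_percentile(freqs, target_pct):
    """Return the lowest score whose cumulative percent is >= target_pct."""
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in (0, 100]")
    n = sum(freqs.values())
    cf = 0
    for x in sorted(freqs):  # lowest score first
        cf += freqs[x]
        # Equivalent to (cf / n) * 100 >= target_pct, without division.
        if cf * 100 >= target_pct * n:
            return x
    raise ValueError("frequency table is empty")


# N = 10 table from the Relative Frequency section (X -> f).
table = {5: 1, 4: 2, 3: 3, 2: 3, 1: 1}

p95 = score_at_percentile(table, 95)
p50 = score_at_percentile(table, 50)
print(f"95th percentile: X = {p95}")
print(f"50th percentile: X = {p50}")
```

For this table the $c\%$ column reads 10, 40, 70, 90, 100 from the lowest score up, so the first value at or above 95 belongs to $X=5$ and the first at or above 50 belongs to $X=3$.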

Grouped Frequency Distribution

When to Use: When the data span a wide range, typically when listing every score would take more than about $20$ rows.
General Guidelines:
- Use approximately $10$ intervals.
- Choose a simple interval width ( $2, 5, 10, 15$ ).
- The bottom score of each interval should be a multiple of the width.
- Intervals must be equal in width with no gaps or overlaps.
Real Limits: For continuous variables, intervals have real limits (e.g., an apparent interval of $90-94$ has real limits of $89.5-94.5$ ).
Trade-off: Grouping results in information loss because exact scores are no longer visible.
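
The real-limits rule can be written as a one-line calculation, assuming scores are recorded to the nearest whole unit (so each limit extends half a unit beyond the apparent limit). The helper name `real_limits` and its `unit` parameter are illustrative.

```python
def real_limits(lower, upper, unit=1):
    """Return the real (lower, upper) limits of an apparent interval."""
    half = unit / 2
    return (lower - half, upper + half)


limits = real_limits(90, 94)
width = limits[1] - limits[0]

print(limits)  # (89.5, 94.5)
print(width)   # 5.0
```

The width computed from the real limits is 5, which matches the five scores (90, 91, 92, 93, 94) that the apparent interval contains.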

Categorical Frequency Distribution

Arranges non-numerical categories (Nominal/Ordinal) and records frequencies.
Order: Can be arbitrary for nominal data, but should be meaningful for ordinal data (e.g., Gold, Silver, Bronze).
Example (Primary Languages):
- English: $f=81$ ( $45\%$ )
- French: $f=34$ ( $19\%$ )
- Mandarin: $f=22$ ( $12\%$ )

Visualizing Distributions

Histograms: Used for numerical/continuous data. Bars represent frequency/proportion and should touch (no gaps).
Bar Graphs: Used for categorical data. Gaps are placed between bars to show distinct categories.
Frequency Polygons: Data points are placed above midpoints and connected by lines; the ends are "anchored" to the x-axis at zero frequency.

Common Distribution Shapes and Psychology Examples

Normal Distribution: Bell-shaped, unimodal, and symmetrical.
- Example: IQ scores.
Bimodal Distribution: Two distinct peaks.
- Example: Heights of a group containing both men and women.
Positively Skewed: Tail points to the right (few high scores, many low scores).
- Example: Clinical depression scores in a general population.
Negatively Skewed: Tail points to the left (few low scores, many high scores).
- Example: Life satisfaction scores in stable environments.

Data Visualization Best Practices

10 Rules for Effectiveness:
1. Know your audience.
2. Identify the main message.
3. Avoid "chartjunk" (unnecessary decorative features).
4. Optimize the data-ink ratio (focus on the data).
5. Label axes clearly and include legends.
6. Ensure color choices are accessible (e.g., colorblind-friendly).
7. Use error bars where appropriate.
8. Choose the right graph type (Histogram for distributions, Scatter for patterns, Line for trends).
9. Message should take priority over beauty.
10. Critically evaluate for integrity.

Misleading Visualizations

Watch for:
- Misleading Scales: Starting the y-axis at a value other than zero to exaggerate differences.
- Dual Axes: Can imply correlations that don't exist.
- Geometry Misrepresentations: Using 3D effects or distorted areas (e.g., in pie charts) that make segments look larger or smaller than they are.
