Basic Statistics for the Behavioral Sciences - Chapter Three: Frequency Distributions and Percentiles
Introduction
Before analyzing the relationship between two variables, it is essential to summarize each variable independently.
Key Questions:
Which scores occurred in the data?
How often did each score occur?
This information is organized into tables and graphs using frequency distributions.
New Terms to Know
Frequency (f): The number of occurrences of a particular score in a dataset.
Distribution: A general term for any organized set of data.
Sample Size (N): Indicates the total number of scores in the dataset.
Simple Frequency Distributions
A simple frequency distribution displays the frequency of each score in a dataset.
Example: Count the number of times “Male” appears in class responses.
The symbol used for a score’s simple frequency is f.
Constructing a Simple Frequency Distribution Table
Typically involves:
Listing the scores in order from the highest to the lowest.
Including every possible score in that range, with f = 0 for any score that did not occur.
Important Note:
N is not the sum of the scores. It is the number of individual scores in the data, which equals the sum of the frequencies (the sum of the f column).
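A minimal sketch of how such a table could be built, assuming integer scores. The function name simple_frequency_table and the sample scores are hypothetical, chosen only for illustration. It lists every score from the highest to the lowest, gives f = 0 to scores that never occurred, and confirms that N equals the sum of the f column rather than the sum of the scores.

```python
from collections import Counter


def simple_frequency_table(scores):
    """Return (score, f) rows from the highest score down to the lowest.

    Scores inside the range that never occurred get f = 0.
    Assumes integer scores (an assumption of this sketch).
    """
    counts = Counter(scores)
    highest, lowest = max(scores), min(scores)
    return [(score, counts[score]) for score in range(highest, lowest - 1, -1)]


# Hypothetical sample data, not taken from the chapter.
scores = [7, 5, 5, 4, 7, 2, 5, 4]
table = simple_frequency_table(scores)

print("Score   f")
for score, f in table:
    print(f"{score:>5} {f:>3}")

# N is the sum of the frequencies, not the sum of the scores.
n = sum(f for _, f in table)
print("N =", n)
```

Here the scores sum to 39, but N is 8 because eight individual scores were recorded.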
Graphing a Simple Frequency Distribution
A frequency distribution graph presents scores on the X-axis and their frequencies on the Y-axis.
The type of measurement scale (nominal, ordinal, interval, or ratio) influences the graph type to use:
Bar Graph: Used for nominal and ordinal data (discrete categories, bars do not touch).
Histogram: Used for small ranges of interval or ratio scores (continuous data, bars touch).
Frequency Polygon: Used when dealing with large ranges of scores (points connected by lines).
Distribution Types
Frequency distributions commonly take one of the following shapes:
Normal Distribution:
Identified by a bell-shaped curve.
Symmetrical, with tails on either side containing low-frequency scores.
Skewed Distributions:
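Negatively Skewed: Most scores are middle to high, with a few low-frequency extreme low scores; the tail points leftward.
Positively Skewed: Most scores are middle to low, with a few low-frequency extreme high scores; the tail points rightward.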
Bimodal Distribution: A symmetrical distribution with two distinct frequency peaks (humps).
Rectangular Distribution: A symmetrical shape without tails, featuring uniform frequency across all values.
Relative Frequency and Normal Curve
Relative Frequency (rel. f): The proportion of the time a score occurs, that is, its frequency expressed as a fraction of N.
The formula for calculating relative frequency:
rel. f = f / N
Where:
f = frequency of the score
N = total number of scores
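As a small illustration, relative frequency is each score's frequency f divided by N. The sketch below uses a hypothetical helper, relative_frequencies, and made-up scores; neither comes from the chapter.

```python
from collections import Counter


def relative_frequencies(scores):
    """Map each score that occurred to its relative frequency, f / N."""
    n = len(scores)            # N: total number of scores
    counts = Counter(scores)   # f for each score
    return {score: f / n for score, f in counts.items()}


# Hypothetical sample data, not taken from the chapter.
scores = [7, 5, 5, 4, 7, 2, 5, 4]
rel_f = relative_frequencies(scores)

for score in sorted(rel_f, reverse=True):
    print(f"score {score}: rel. f = {rel_f[score]:.3f}")
```

Because every score is counted exactly once, the relative frequencies of all scores add up to 1. Multiplying a relative frequency by 100 expresses it as a percentage.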
The proportion of the total area under the normal curve that lies above a group of scores equals their combined relative frequency.
Cumulative Frequency and Percentiles
Cumulative Frequency (cf): The total frequency of all scores at or below a particular score.
To compute cumulative frequency:
Sum all frequencies for scores at and below the target score.
Percentile: The percentage of all scores that are at or below a given score.
For instance, a score at the 90th percentile means that 90% of the scores are at or below it.
The formula to find a score's percentile is:
Percentile = (cf / N) × 100
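The two definitions above can be sketched in a few lines, assuming the percentile is computed as cf divided by N, multiplied by 100 (the at-or-below definition given in this section). The function names and the sample scores are hypothetical.

```python
def cumulative_frequency(scores, target):
    """cf: how many scores are at or below the target score."""
    return sum(1 for s in scores if s <= target)


def percentile(scores, target):
    """Percent of all scores at or below the target: (cf / N) * 100."""
    return cumulative_frequency(scores, target) / len(scores) * 100


# Hypothetical sample data, not taken from the chapter.
scores = [7, 5, 5, 4, 7, 2, 5, 4]

cf_5 = cumulative_frequency(scores, 5)
pct_5 = percentile(scores, 5)
print(f"cf for a score of 5: {cf_5}")
print(f"percentile for a score of 5: {pct_5:.1f}")
```

With these eight scores, six are at or below 5, so a score of 5 falls at the 75th percentile. The highest score always has cf = N and therefore sits at the 100th percentile under this definition.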
Grouped Frequency Distributions
When dealing with extensive datasets, it might be necessary to combine scores into small groups, thus creating a grouped frequency distribution.
This technique reports total frequencies, relative frequencies, or cumulative frequencies for each group, saving space.
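One way to sketch a grouped distribution, assuming integer scores and equal-width groups whose lower limits are multiples of the chosen width. That grouping rule, the function name grouped_frequency_table, and the sample scores are assumptions made for illustration; the chapter does not specify them.

```python
from collections import Counter


def grouped_frequency_table(scores, interval_width):
    """Return rows of (lower, upper, f, rel_f, cf), lowest group first.

    Each score is assigned to the group whose lower limit is the nearest
    multiple of interval_width at or below it (an assumption of this sketch).
    """
    n = len(scores)
    counts = Counter((s // interval_width) * interval_width for s in scores)
    lowest, highest = min(counts), max(counts)

    rows = []
    cf = 0
    for lower in range(lowest, highest + 1, interval_width):
        f = counts[lower]      # 0 for a group that contains no scores
        cf += f                # running total of scores at or below this group
        rows.append((lower, lower + interval_width - 1, f, f / n, cf))
    return rows


# Hypothetical sample data, not taken from the chapter.
scores = [3, 7, 12, 14, 15, 21, 22, 22, 28, 9]
rows = grouped_frequency_table(scores, 5)

print("Group     f  rel. f  cf")
for lower, upper, f, rel_f, cf in reversed(rows):  # highest group on top
    print(f"{lower:>2}-{upper:<2} {f:>5} {rel_f:>7.2f} {cf:>3}")
```

The rows are built from the lowest group upward so that cf can accumulate, then printed in reverse so the highest group appears at the top, matching the highest-to-lowest ordering used for simple frequency tables.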
Examples and SPSS Applications
Worked examples show how to find relative frequencies, cumulative frequencies, and percentiles from datasets.
SPSS: A statistical software program that can perform detailed analyses on datasets, similar to Excel but focused on statistics.
Scenario examples:
Analyzing college students' perceptions of how easy it is to make friends, taking gender into account.
Analyzing family incomes of incoming freshmen to determine the income distribution.
Example outputs can include:
The total number of respondents who answered a specific question, the percentage of cases with missing data, and what the demographic breakdown of responses implies for the findings.