Frequency Distributions and Central Tendency
Frequency Distributions and Histograms
Frequency Distribution: A summary of a data set showing how often each value, or range of values, occurs.
Created from raw data.
Composed of bins, each representing a range of values.
Shows the number of observations in each bin and the percentage of the total data that each bin represents.
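A minimal Python sketch of building a frequency distribution from raw data. The function name frequency_distribution, the bin width, and the height values are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

def frequency_distribution(values, bin_width, start):
    """Group raw values into equal-width bins; report count and percentage per bin."""
    counts = Counter()
    for v in values:
        # Bin i covers the range [start + i*bin_width, start + (i+1)*bin_width)
        index = int((v - start) // bin_width)
        counts[index] += 1
    total = len(values)
    table = []
    for index in sorted(counts):
        low = start + index * bin_width
        high = low + bin_width
        table.append((low, high, counts[index], 100 * counts[index] / total))
    return table

# Made-up raw data: heights in centimetres
heights_cm = [152, 158, 161, 163, 165, 166, 168, 170, 171, 174, 177, 183]

for low, high, count, pct in frequency_distribution(heights_cm, bin_width=10, start=150):
    print(f"{low}-{high} cm: {count} observations ({pct:.1f}% of the data)")
```

Bins with no observations are simply omitted from the table in this sketch; the percentages of the listed bins always sum to 100.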
Histogram: A graphical depiction of a frequency distribution.
X-axis: Represents the bins (for example, ranges of heights).
Y-axis: Represents the percentage of data within each bin.
The area of each bar corresponds to the proportion of data in its bin.
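A small sketch of why area, rather than height alone, carries the proportion. It assumes bar height is computed as proportion divided by bin width, so bins of unequal width remain comparable. The function name histogram_bars, the bin edges, and the data are hypothetical.

```python
def histogram_bars(values, edges):
    """Return (low, high, proportion, height) per bin, with height = proportion / width."""
    total = len(values)
    bars = []
    for i, (low, high) in enumerate(zip(edges, edges[1:])):
        is_last = i == len(edges) - 2
        # Each bin is [low, high); the final bin also includes its upper edge
        count = sum(1 for v in values if low <= v < high or (is_last and v == high))
        proportion = count / total
        bars.append((low, high, proportion, proportion / (high - low)))
    return bars

# Made-up heights in centimetres, with a deliberately wider last bin
heights_cm = [152, 158, 161, 163, 165, 166, 168, 170, 171, 174, 177, 183]
bars = histogram_bars(heights_cm, edges=[150, 160, 170, 190])

for low, high, proportion, height in bars:
    area = height * (high - low)  # width times height recovers the proportion
    print(f"{low}-{high}: height={height:.4f}, area={area:.3f}")
```

The second and third bins hold the same share of the data, but the third is twice as wide, so its bar is half as tall. When all bins have equal width, height and area are proportional, and plotting percentages on the Y-axis gives the same picture.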
Characteristics of Distributions
The shape of a distribution reveals important information about the data.
Skewed Distributions: Distributions with tails extending towards one end.
Positively Skewed: Tail points towards higher numbers.
Negatively Skewed: Tail points towards lower numbers.
Normal Distribution: Symmetrical, bell-shaped distribution in which the data points are spread evenly around a central value.
50% of data points lie on each side of the midpoint.
Measures of Central Tendency
Mean, Median, Mode: Key measures that summarize a data set by indicating its central value.
Mode: The most frequent value in the data set.
Located at the peak (highest point) of the distribution.
Mean: The arithmetic average (sum of the values divided by their count); sensitive to outliers.
In positively skewed distributions, the mean is pulled towards the tail, lying to the right of the mode.
In negatively skewed distributions, the tail points left, pulling the mean to the left of the mode.
Median: The middle value when the data are ordered.
Typically falls between the mean and the mode in both positively and negatively skewed distributions.
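A short illustration of the three measures using Python's standard statistics module. The income figures are made up, and the single large value is added only to show how an outlier pulls the mean while leaving the median and mode nearly unchanged.

```python
import statistics

# Made-up values (in thousands); one extreme value is appended as an outlier
incomes = [30, 32, 32, 35, 38, 40, 41]
with_outlier = incomes + [250]

for label, data in [("without outlier", incomes), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)      # arithmetic average
    median = statistics.median(data)  # middle value of the ordered data
    mode = statistics.mode(data)      # most frequent value
    print(f"{label}: mean={mean:.2f}, median={median}, mode={mode}")
```

In this sample the outlier moves the mean from about 35.4 to 62.25, while the median shifts only from 35 to 36.5 and the mode stays at 32.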
Relationship Between Mean, Median, and Mode by Distribution Shape
Positively Skewed:
Order: Mode < Median < Mean
Negatively Skewed:
Order: Mean < Median < Mode
Normal Distribution:
Mean = Median = Mode (All at the center).
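The orderings above can be checked on small samples. This is a sketch with hand-picked data chosen to show the typical pattern; the sample values and the helper name central_tendency are assumptions, and real data sets can occasionally depart from these orderings.

```python
import statistics

def central_tendency(data):
    """Return (mode, median, mean) for a data set."""
    return statistics.mode(data), statistics.median(data), statistics.mean(data)

# Hand-picked samples: a long right tail, its mirror image, and a symmetric set
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]
negatively_skewed = [12 - x for x in positively_skewed]
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

for name, data in [
    ("positively skewed", positively_skewed),
    ("negatively skewed", negatively_skewed),
    ("symmetric", symmetric),
]:
    mode, median, mean = central_tendency(data)
    print(f"{name}: mode={mode}, median={median}, mean={mean}")
```

For these samples the results are mode 2, median 3, mean 4 (positive skew); mean 8, median 9, mode 10 (negative skew); and all three equal to 3 (symmetric).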
Recap of Key Points
For positively skewed distributions, the mean is the highest of the three measures, followed by the median, and then the mode (the lowest).
For negatively skewed distributions, the mean is the lowest of the three measures, followed by the median, and then the mode (the highest).
In normal distributions, all three measures (mean, median, mode) are equal, reflecting the symmetry of the data.