Section 2: Exploring Data with Tables and Graphs
Frequency Distributions and Histograms
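A frequency distribution summarizes data by listing classes of values along with the number of data values (the frequency) in each class, which makes the distribution of the data easier to see.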
Histogram:
A graphical representation consisting of bars of equal width.
The horizontal scale represents classes of quantitative data values.
The vertical scale represents frequencies; the height of each bar corresponds to the frequency of its class.
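The description above can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the function name frequency_distribution, its parameters, and the sample values are all hypothetical, and the class labels assume whole-number data.

```python
def frequency_distribution(values, lower, class_width, num_classes):
    """Count how many values fall in each equal-width class."""
    counts = [0] * num_classes
    for v in values:
        # Index of the class whose interval contains v
        index = int((v - lower) // class_width)
        if 0 <= index < num_classes:
            counts[index] += 1
    return counts

# Hypothetical sample data, invented for illustration
sample = [52, 55, 61, 63, 64, 67, 70, 71, 72, 75, 78, 84]
counts = frequency_distribution(sample, lower=50, class_width=10, num_classes=4)

# Text histogram: one row per class, bar length equal to the frequency
for i, count in enumerate(counts):
    start = 50 + i * 10
    print(f"{start}-{start + 9}: {'#' * count} ({count})")
```

Each row plays the role of one bar: the class limits sit on one axis and the bar length shows the frequency.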
Important Uses of a Histogram
Visual Representation: Displays the shape of the data distribution.
Center Identification: Indicates the location of the data's center.
Spread Analysis: Reflects the variability or spread of the data.
Outlier Detection: Helps in identifying outliers in the data.
Relative Frequency Histogram
Defined as:
A variant of a histogram with the same shape and horizontal scale.
The vertical scale shows relative frequencies (proportions or percentages of the total) instead of actual frequencies.
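As a small sketch of the conversion, each relative frequency is the class frequency divided by the total number of values. The counts below are hypothetical, chosen only for illustration.

```python
# Hypothetical class frequencies for four classes
counts = [2, 4, 5, 1]
total = sum(counts)

# Relative frequency = class frequency / total number of values
relative = [c / total for c in counts]

for c, r in zip(counts, relative):
    print(f"frequency {c:2d} -> relative frequency {r:.3f} ({r:.1%})")
```

Because every frequency is divided by the same total, the bars keep the same proportions to one another, which is why the shape of the histogram does not change.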
Critical Thinking: Interpreting Histograms
Analysis of histograms for insights:
Determine the Center of the data.
Assess the Variation in the dataset.
Identify the shape of the Distribution.
Look for any Outliers present in the data.
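Consider Time: ask whether the characteristics of the data change over time, since a histogram alone does not show the order in which values were collected.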
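The first two checks in the list above can also be made numerically. The sketch below uses Python's statistics module; the sample values are hypothetical, and the choice of mean, median, range, and standard deviation as the summaries is an assumption, not something these notes prescribe.

```python
from statistics import mean, median, stdev

# Hypothetical sample data, invented for illustration
sample = [52, 55, 61, 63, 64, 67, 70, 71, 72, 75, 78, 84]

# Center: two common measures
center_mean = mean(sample)
center_median = median(sample)

# Variation: range and sample standard deviation
data_range = max(sample) - min(sample)
spread_sd = stdev(sample)

print(f"mean = {center_mean:.2f}, median = {center_median}")
print(f"range = {data_range}, standard deviation = {spread_sd:.2f}")
```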
Common Distribution Shapes
Bell-Shaped (Normal) Distribution:
Represented by a histogram that is symmetrically bell-shaped.
Uniform Distribution:
Features a flat histogram: the bars have approximately equal heights because data values are spread evenly across the classes.
Skewness:
Skewed to the Right: Data has a longer tail on the right side (positive skew).
Skewed to the Left: Data has a longer tail on the left side (negative skew).
Normal Distribution
Characteristics of normal distribution:
Data follows a bell-shaped curve.
Most observations cluster around the central peak, and frequencies fall off symmetrically toward both tails.
Skewness
Definition:
A distribution is skewed if it is not symmetric and stretches more to one side.
Types of skewness:
Right (Positive): Longer right tail.
Left (Negative): Longer left tail.
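The sign convention can be checked with a numeric skewness coefficient. The notes do not give a formula, so the sketch below assumes the moment-based coefficient (third central moment divided by the second central moment raised to the power 3/2); the function name and all three data sets are hypothetical.

```python
from statistics import mean

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (assumed formula, not from the notes).

    Raises ZeroDivisionError if every value is identical.
    """
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, invented for illustration
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]         # one large value stretches the right tail
left_skewed = [11 - v for v in right_skewed]     # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(f"{name}: skewness = {skewness(data):+.3f}")
```

A positive result matches a longer right tail, a negative result matches a longer left tail, and a value near zero is consistent with symmetry.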
Assessing Normality with Normal Quantile Plots
Criteria for Normal Distribution:
Points should align closely around a straight line.
Absence of systematic patterns that deviate from a straight line.
Indicators of Non-Normal Distribution:
Points do not lie close to a straight-line pattern.
Presence of systematic patterns that deviate from linearity.
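The points of a normal quantile plot can be computed without a plotting library. The sketch below pairs each sorted data value with a standard normal z score, then uses the linear correlation between the two as a rough numeric stand-in for the visual check. Several details are assumptions rather than part of these notes: the plotting positions (i + 0.5)/n are one common convention among several, the correlation is not a formal test, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_quantile_points(values):
    """Pair each sorted value with the z score of its plotting position."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (standard_normal.inv_cdf((i + 0.5) / n), x)  # assumed convention: (i + 0.5) / n
        for i, x in enumerate(ordered)
    ]

def straightness(points):
    """Linear correlation between z scores and data values (rough guide only)."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    z_bar, x_bar = mean(zs), mean(xs)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in points)
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_zx / sqrt(s_zz * s_xx)

# Hypothetical data sets, invented for illustration
roughly_normal = [61, 64, 66, 68, 69, 71, 72, 74, 76, 79]
skewed = [1, 1, 1, 2, 2, 3, 4, 6, 10, 30]

r_normal = straightness(normal_quantile_points(roughly_normal))
r_skewed = straightness(normal_quantile_points(skewed))
print(f"roughly normal sample: r = {r_normal:.3f}")
print(f"skewed sample:         r = {r_skewed:.3f}")
```

A correlation close to 1 is consistent with points lying near a straight line, and a clearly lower value suggests a departure from it. The plot itself should still be inspected, because a curved systematic pattern can coexist with a fairly high correlation.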