Exploring Data with Tables and Graphs
Focus on organizing and summarizing data through various methods including graphs and frequency distributions.
A frequency distribution (or frequency table) displays how data values are distributed among categories.
It aids in understanding the distribution's nature in large data sets.
Histograms: graphs of frequency distributions that use bars to show the frequency of data in each class interval.
Importance of visual representation and potential for misleading interpretations.
Understanding how to discern accurate data representations.
Scatterplots: a visual method for understanding the relationship between two quantitative variables.
Correlation indicates the strength and direction of a relationship; regression provides a way to model the relationship.
Frequency Distribution (Frequency Table):
Organizes data into classes (categories) and lists the frequency, or count, of data values in each class.
Useful for summarizing large data sets to observe patterns and trends.
Lower Class Limits: Smallest numbers in each class.
Upper Class Limits: Largest numbers in each class.
Class Boundaries: Numbers that separate classes without the gaps left by class limits (e.g., 14.5 lies between an upper limit of 14 and the next lower limit of 15).
Class Midpoints: Calculated as (Lower Limit + Upper Limit) / 2.
Class Width: Difference between two consecutive lower class limits.
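As a quick check of these definitions, the sketch below applies them to the class limits from the commute-time example in these notes. The variable names are illustrative only, and the boundary calculation assumes the gap between adjacent classes is the same throughout (1 unit here).

```python
# Class limits taken from the commute-time example in these notes.
lower_limits = [0, 15, 30, 45, 60, 75, 90]
upper_limits = [14, 29, 44, 59, 74, 89, 104]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoints: (lower limit + upper limit) / 2 for each class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Class boundaries: split the gap between one class's upper limit and the
# next class's lower limit (assumed to be the same for every pair of classes).
gap = lower_limits[1] - upper_limits[0]
boundaries = [lower_limits[0] - gap / 2] + [hi + gap / 2 for hi in upper_limits]

print("Class width:", class_width)
print("Midpoints:", midpoints)
print("Boundaries:", boundaries)
```

With these limits the class width is 15, the first midpoint is 7.0, and the boundaries run from -0.5 to 104.5 in steps of 15.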
Procedure (Part 1):
Select the number of classes (typically between 5 and 20).
Calculate class width: (Maximum Data Value - Minimum Data Value) / Number of Classes.
Round the result up to a convenient number.
Procedure (Part 2):
Choose the first lower class limit (minimum value or convenient value).
List the remaining lower class limits by repeatedly adding the class width.
Tally each data value into its class, then count the tallies to find each class frequency.
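The two-part procedure above can be sketched in code. This is a minimal illustration, not a definitive implementation: the function names `suggested_class_width` and `frequency_distribution` are hypothetical, the data are assumed to be whole numbers, and the sample values are made up for illustration (they are not the Los Angeles data).

```python
import math

def suggested_class_width(values, num_classes):
    """Range divided by the number of classes, rounded up to a whole number."""
    return math.ceil((max(values) - min(values)) / num_classes)

def frequency_distribution(values, first_lower, width, num_classes):
    """Tally whole-number values into classes; return (lower, upper, count) rows."""
    counts = [0] * num_classes
    for v in values:
        index = (v - first_lower) // width  # which class the value falls in
        if not 0 <= index < num_classes:
            raise ValueError(f"{v} falls outside the chosen classes")
        counts[index] += 1
    return [
        (first_lower + i * width, first_lower + (i + 1) * width - 1, counts[i])
        for i in range(num_classes)
    ]

# Made-up sample values for illustration only.
sample = [5, 12, 18, 20, 22, 25, 31, 33, 40, 47, 62, 95]

width = suggested_class_width(sample, 7)  # (95 - 5) / 7 is about 12.86, rounded up
# A more convenient width of 15 and a first lower limit of 0 are chosen by hand.
table = frequency_distribution(sample, first_lower=0, width=15, num_classes=7)
for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```

The width and first lower limit are passed in explicitly because the procedure leaves both to judgment ("round up for convenience", "minimum value or convenient value"). The function raises an error if the chosen classes fail to cover a data value.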
Based on data from Los Angeles' daily commute times.
Data points to consider for frequency distribution.
Select 7 classes and calculate class width rounded to 15 for convenience.
Establish lower class limits: 0, 15, 30, 45, 60, 75, 90.
Corresponding upper class limits identified: 14, 29, 44, 59, 74, 89, 104.
Document tally marks for each class to find frequencies.
Frequencies recorded: 0-14 (6), 15-29 (18), 30-44 (14), 45-59 (5), 60-74 (5), 75-89 (1), 90-104 (1).
Relative Frequency Distribution:
Expresses each class frequency as a proportion (or percentage) of the sum of all frequencies.
Important for understanding data in percentage terms.
The relative frequencies should sum to 100%, allowing for small rounding errors.
Example relative frequencies for Los Angeles commute times: 0-14 (12%), 15-29 (36%), etc.
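The percentages above follow directly from the frequencies recorded earlier in these notes. A short sketch of the arithmetic (variable names are illustrative):

```python
# Frequencies from the Los Angeles commute-time frequency distribution.
classes = ["0-14", "15-29", "30-44", "45-59", "60-74", "75-89", "90-104"]
frequencies = [6, 18, 14, 5, 5, 1, 1]

total = sum(frequencies)  # sum of all frequencies

# Relative frequency (as a percentage) = class frequency / total * 100.
relative_percent = [round(100 * f / total) for f in frequencies]

for label, pct in zip(classes, relative_percent):
    print(f"{label}: {pct}%")
print("Total:", sum(relative_percent), "%")
```

Here the total frequency is 50, so 6 values give 12% and 18 values give 36%, matching the example. With other data the rounded percentages may sum to slightly more or less than 100.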
Combining relative frequency distributions for different data sets facilitates comparisons of trends.
Displays relative frequencies for both locations (Los Angeles and Boise) side by side.
Helps highlight differences in commute times influenced by city size and density.
Notable differences in commute times between cities; Boise shows lower times.
The cumulative frequency for a class is the sum of the frequencies for that class and all previous classes.
Example data for commute times outlined, showing cumulative sums.
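As an illustration, the running totals can be computed from the Los Angeles frequencies listed earlier. The "Less than ..." labels are an assumed presentation that relies on the data being whole numbers.

```python
from itertools import accumulate

# Frequencies and upper class limits from the commute-time example.
frequencies = [6, 18, 14, 5, 5, 1, 1]
upper_limits = [14, 29, 44, 59, 74, 89, 104]

# Each cumulative frequency adds the current class to all previous classes.
cumulative = list(accumulate(frequencies))

for upper, running_total in zip(upper_limits, cumulative):
    print(f"Less than {upper + 1}: {running_total}")
```

The last cumulative frequency always equals the total number of data values (50 here), which is a useful check on the arithmetic.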
In a normal distribution, frequencies start low, increase to a peak, and then decrease, with the pattern approximately symmetric about the peak.
Normal distribution representation example provided, illustrating expected frequency trends.
Gaps can suggest data derived from different populations, although this is not universally true.
Frequency distribution of penny weights highlights gaps, indicating possibly distinct populations based on composition.
The large gap in weights is explained by the different compositions of pennies made before and after 1983.
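One way to spot gaps is to look for runs of empty classes that sit between non-empty classes. The sketch below uses a hypothetical helper, `find_gaps`, and made-up frequencies chosen only to show the pattern. They are not the actual penny-weight data.

```python
def find_gaps(frequencies):
    """Return (start, end) index ranges of empty classes between non-empty classes."""
    nonzero = [i for i, f in enumerate(frequencies) if f > 0]
    gaps = []
    for left, right in zip(nonzero, nonzero[1:]):
        if right - left > 1:  # at least one empty class in between
            gaps.append((left + 1, right - 1))
    return gaps

# Made-up frequencies for illustration only: two clusters separated by empty classes.
example = [18, 19, 0, 0, 0, 0, 25, 8]
print(find_gaps(example))

# The Los Angeles commute-time frequencies have no empty classes, so no gaps.
print(find_gaps([6, 18, 14, 5, 5, 1, 1]))
```

A detected gap is only a prompt to investigate. As noted above, gaps do not always mean the data come from different populations.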