Flashcards covering core concepts from the lecture notes on descriptive statistics, including charts, tables, and interpretations of various graphical methods.
What is Descriptive Statistics?
A set of methods that summarize and describe the properties of a data set; it is a first step toward understanding the data before more detailed analysis.
Name some graphical methods commonly used for descriptive statistics.
Frequency distribution tables, bar charts, pie charts, box plots, stem-and-leaf plots, histograms, scatter plots, and parallel coordinates plots.
What are the numerical measures of central tendency?
Mean, median, and mode.
What are the measures of dispersion?
Range, interquartile range (IQR), variance, standard deviation, and coefficient of variation.
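The measures of central tendency and dispersion listed above can be computed with Python's standard `statistics` module. This is a minimal sketch on a made-up sample. The data values are assumptions for illustration, and the quartiles use the module's default "exclusive" method, which is only one of several quartile conventions.

```python
import statistics

# Hypothetical sample; the values are illustrative only.
data = [2, 4, 4, 5, 7, 9, 11]

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Dispersion
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # default "exclusive" method
iqr = q3 - q1
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # sample standard deviation
cv = std_dev / mean                   # coefficient of variation

print(mean, median, mode)
print(data_range, iqr, variance, round(std_dev, 3), round(cv, 3))
```

The coefficient of variation is unit-free, so it is often reported as a percentage (`cv * 100`).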
What is a frequency distribution?
A table that groups data by category or class and shows the frequency (count) for each category or class.
In a frequency distribution for categorical data, what are the two main columns?
Categories (values) in the first column and their frequencies in the second column.
What is a class interval?
One of the ranges, usually of equal width, into which numeric data are divided so that the observations can be grouped and counted.
What is a class mark?
The representative value of a class interval, usually the interval midpoint.
What is relative frequency?
Frequency divided by the total number of observations.
What is cumulative frequency?
The running total of frequencies up to and including a given class.
What is cumulative relative frequency?
Cumulative frequency divided by the total number of observations.
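The four quantities defined above (frequency, relative frequency, cumulative frequency, and cumulative relative frequency) can be tabulated together. The sketch below assumes a small made-up list of letter grades. The variable names are illustrative and do not come from the lecture notes.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ordered categories (illustrative only).
grades = ["B", "A", "C", "B", "B", "A", "C", "B"]

counts = Counter(grades)
categories = sorted(counts)                 # A, B, C
freqs = [counts[c] for c in categories]     # frequency per category
n = sum(freqs)                              # total number of observations

rel_freqs = [f / n for f in freqs]              # frequency / total
cum_freqs = list(accumulate(freqs))             # running total of frequencies
cum_rel_freqs = [cf / n for cf in cum_freqs]    # cumulative frequency / total

for row in zip(categories, freqs, rel_freqs, cum_freqs, cum_rel_freqs):
    print("{:<3} {:>3} {:>6.3f} {:>3} {:>6.3f}".format(*row))
```

The last cumulative frequency always equals the total count, and the last cumulative relative frequency always equals 1.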
What is a histogram?
A graph of a frequency distribution for numeric data: adjacent bars, one per class interval, with heights showing the class frequencies.
What is the difference between a histogram and a bar chart?
Histograms are for numeric data with continuous intervals and bars that touch; bar charts are for categorical data with separated bars.
What does skewness indicate in a histogram?
The direction of the tail: right (positive skew, tail to the right) or left (negative skew, tail to the left).
What is a stem-and-leaf plot?
A data display that preserves the original values by splitting each number into a stem (the leading digits) and a leaf (the final digit), with the leaves listed in order beside each stem.
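A text-only stem-and-leaf plot can be built in a few lines. This sketch assumes two-digit integer data, with the tens digit as the stem and the units digit as the leaf. The values are made up, and stems with no observations are simply omitted here.

```python
from collections import defaultdict

# Hypothetical two-digit observations (illustrative only).
values = [23, 25, 31, 31, 38, 42, 47, 47, 49, 56]

stems = defaultdict(list)
for v in sorted(values):          # sorting first keeps the leaves in order
    stem, leaf = divmod(v, 10)    # tens digit is the stem, units digit the leaf
    stems[stem].append(leaf)

lines = [
    f"{stem} | {' '.join(map(str, leaves))}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
```

Every original value can be read back from the display, for example stem 4 with leaf 7 is 47.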
What is an ogive?
A cumulative frequency curve: a line graph of cumulative frequencies plotted against the data values (typically the upper class boundaries).
What is a five-number summary?
Minimum, first quartile (Q1), median, third quartile (Q3), and maximum.
What is a box plot?
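A graphical representation of the five-number summary: a box from Q1 to Q3, a line inside the box at the median, and whiskers extending to the minimum and maximum. In a common variant, the whiskers stop at the most extreme values within 1.5 × IQR of the box, and points beyond them are plotted individually as outliers.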
What is an outlier?
An observation that is unusually far from the rest of the data; may indicate data entry or measurement error or a true extreme value and requires careful handling.
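The five-number summary and the 1.5 × IQR rule for box plot whiskers can be sketched as follows. The sample is made up, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is an assumption. Other quartile conventions give slightly different values and may flag different points.

```python
import statistics

# Hypothetical sample with one unusually large value (illustrative only).
data = [3, 5, 7, 8, 9, 11, 12, 13, 15, 40]

q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number = (min(data), q1, median, q3, max(data))

# Fences at 1.5 * IQR beyond the quartiles
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are flagged; whiskers end at the most
# extreme observations that are still inside the fences.
outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))

print(five_number)
print(lower_fence, upper_fence, outliers, whiskers)
```

A flagged point is only a candidate for review. Whether it is an error or a true extreme value still has to be checked against the source of the data.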
What is a time series plot?
A plot of observations over time with time on the x-axis and the variable on the y-axis.
What is a scatter plot used for?
Displays the relationship between two numeric variables and indicates linearity and direction (positive or negative) of the relationship.
What is a scatter plot matrix?
A grid showing all pairwise scatter plots for three or more variables, used to explore relationships.
What is a parallel coordinates plot?
A multivariate plot in which each variable has its own parallel axis; each observation is drawn as a line connecting its values across the axes, which reveals patterns or clusters.
What is a mosaic plot?
A graphical display for contingency tables using rectangular areas proportional to category frequencies, allowing comparisons across multiple categorical variables.
What is a contingency table?
A cross-tabulation of two or more categorical variables showing their joint distribution.
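A two-way contingency table is a count of how often each pair of categories occurs together. The sketch below assumes made-up records of two hypothetical categorical variables (smoking status and activity level) and also computes the row and column totals.

```python
from collections import Counter

# Hypothetical (smoking status, activity level) records, illustrative only.
records = [
    ("smoker", "low"), ("smoker", "high"), ("non-smoker", "high"),
    ("non-smoker", "high"), ("smoker", "low"), ("non-smoker", "low"),
]

joint = Counter(records)                    # count of each (row, column) pair
rows = sorted({r for r, _ in records})      # ['non-smoker', 'smoker']
cols = sorted({c for _, c in records})      # ['high', 'low']

# Rows follow `rows`, columns follow `cols`; missing pairs count as 0.
table = [[joint[(r, c)] for c in cols] for r in rows]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

print("{:<12}".format("") + "".join("{:>6}".format(c) for c in cols))
for r, row in zip(rows, table):
    print("{:<12}".format(r) + "".join("{:>6}".format(v) for v in row))
```

The cell counts are the frequencies that a mosaic plot would draw as rectangle areas.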
What is aspect ratio in graphs?
The ratio of the x-axis length to the y-axis length; affects how the graph’s shape and relationships are visually perceived.
Give an example of a class interval for body weight data.
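50–60 kg, 60–70 kg, 70–80 kg, etc., with a stated convention for boundary values (for example, each class includes its lower limit but not its upper limit) so that a weight of exactly 60 kg falls in only one class.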
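The body weight classes can be turned into a grouped frequency distribution with class marks. This is a sketch on made-up weights. It assumes each class includes its lower bound and excludes its upper bound, which is one common convention and is not stated in the notes.

```python
# Hypothetical body weights in kg (illustrative only).
weights = [52.0, 58.5, 60.0, 63.2, 67.9, 70.0, 71.4, 79.9]

lower, width, n_classes = 50, 10, 3

# Each class is [lo, hi): the lower bound is included, the upper is not.
classes = [(lower + i * width, lower + (i + 1) * width) for i in range(n_classes)]

# Class mark = midpoint of the interval
class_marks = [(lo + hi) / 2 for lo, hi in classes]

# Frequency = number of weights falling in each class
freqs = [sum(lo <= w < hi for w in weights) for lo, hi in classes]

for (lo, hi), mark, f in zip(classes, class_marks, freqs):
    print(f"{lo}-{hi} kg  mark={mark}  freq={f}")
```

Under this convention a weight of exactly 60.0 kg is counted in the 60–70 kg class only.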
Why is data visualization important?
It helps quickly understand data characteristics, facilitates initial exploration, and supports informed decision-making.