Flashcards covering fundamental statistical concepts, data types, levels of measurement, frequency distributions, graphical representations, measures of center and variation, and basic probability concepts.
What is 'data' in statistics?
Data are collections of observations, such as measurements, genders, or survey responses.
What is 'statistics' as a field of study?
Statistics is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, analyzing, interpreting, presenting, and drawing conclusions based on the data.
Define 'population' in statistics.
A population is the complete collection of all elements (scores, measurements, people, etc.) to be studied; it is complete in that it includes every subject of interest.
What is a 'census'?
A census is the collection of data from every member of the population.
What is a 'sample' in statistics?
A sample is a subcollection of members selected from a population.
What is the three-step process of a statistical study?
A statistical study involves three steps: preparing the data, analyzing the data, and drawing conclusions.
What is a 'parameter'?
A parameter is a numerical measurement describing some characteristic of a population; it is the result that a census would produce.
What is a 'statistic' (value)?
A statistic is a numerical measurement describing some characteristic of a sample; it is computed from sample data rather than from the whole population.
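The difference between a parameter and a statistic can be illustrated with a small sketch. The population below (the integers 1 to 100) and the sample size of 10 are made-up values chosen only for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: every element we want to study.
population = list(range(1, 101))

# A sample is a subcollection drawn from the population.
sample = random.sample(population, 10)

parameter = mean(population)  # describes the population (census result)
statistic = mean(sample)      # describes only the sample

print("population mean (parameter):", parameter)
print("sample mean (statistic):", statistic)
```

The statistic generally differs from the parameter and changes from sample to sample, while the parameter is fixed for a given population.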
What are the two main types of data?
Quantitative data and Qualitative (Categorical) data.
Describe 'quantitative data'.
Quantitative data consist of numerical values representing counts or measurements (e.g., age in years, height).
Explain the difference between 'discrete data' and 'continuous data'.
Discrete data are quantitative values that can be counted (the number of possible values is finite or countably infinite), typically whole numbers (e.g., number of students). Continuous data are quantitative values that result from measurements on a continuous scale, with infinitely many possible values that cannot be counted individually (e.g., lengths, weights, temperatures).
Describe 'qualitative data', also known as 'categorical data'.
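Qualitative data consist of labels or names that represent attributes rather than counts or measurements (e.g., colors, grades as labels, identification numbers). Identification numbers are qualitative even though they contain digits, because they only label and arithmetic on them is meaningless.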
What is the 'Nominal level of measurement'?
The nominal level of measurement is characterized by names, labels, and categories only, which cannot be ordered (e.g., colors, brands of medication).
What is the 'Ordinal level of measurement'?
The ordinal level of measurement involves categorical data that can be ordered, but the differences between values are not meaningful or cannot be determined (e.g., course grades A, B, C; satisfaction levels).
What is the 'Interval level of measurement'?
The interval level of measurement involves quantitative data that can be arranged in order, and differences between values are meaningful, but there is no true zero representing a complete absence of the quantity (e.g., daily temperatures in °F or °C, years).
What is the 'Ratio level of measurement'?
The ratio level of measurement involves quantitative data that can be arranged in order, differences are meaningful, there is a true zero (meaning none of the quantity exists), and ratios are also meaningful (e.g., heights of students, class times, weights).
What is a 'Frequency Distribution' or 'Frequency Table'?
A frequency distribution (or frequency table) shows how data are partitioned among several categories or classes by listing the categories along with the number (frequency) of data values in each of them.
Define 'frequency' in a frequency distribution.
The frequency of a particular class is the number of values that fall into that class.
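As a minimal sketch of how a frequency table is built, the snippet below counts values per class. The scores and the class limits (50-69, 70-89, 90-109) are hypothetical, chosen to line up with the boundary example used elsewhere in these cards.

```python
# Hypothetical exam scores and classes given as (lower limit, upper limit).
scores = [52, 58, 61, 67, 70, 74, 75, 81, 88, 93]
classes = [(50, 69), (70, 89), (90, 109)]

# The frequency of a class is the number of values that fall into it
# (class limits are inclusive).
frequency_table = {
    (low, high): sum(low <= x <= high for x in scores)
    for low, high in classes
}

for (low, high), freq in frequency_table.items():
    print(f"{low}-{high}: {freq}")
```

The frequencies should add up to the number of data values, which is a quick check that no value was missed or counted twice.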
What are 'lower class limits'?
Lower class limits are the smallest numbers that can belong to each of the different classes.
What are 'upper class limits'?
Upper class limits are the largest numbers that can belong to each of the different classes.
What are 'class boundaries'?
Class boundaries are the numbers used to separate the classes without the gaps created by class limits (e.g., classes with limits 50-69 and 70-89 have boundaries 49.5-69.5 and 69.5-89.5).
How are 'class midpoints' calculated?
Class midpoints are the values in the middle of the classes; each is calculated as (Lower Class Limit + Upper Class Limit) / 2.
What is 'class width'?
Class width is the difference between two consecutive lower class limits (or, equivalently, two consecutive lower class boundaries) in a frequency distribution.
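The relationships among class limits, boundaries, midpoints, and width can be checked with a short sketch. The classes below are hypothetical; they are chosen so the boundaries match the 49.5-69.5 and 69.5-89.5 example.

```python
# Hypothetical classes given as (lower limit, upper limit).
classes = [(50, 69), (70, 89), (90, 109)]

# Gap between one class's upper limit and the next class's lower limit.
gap = classes[1][0] - classes[0][1]  # 70 - 69 = 1

# Boundaries split each gap in half so adjacent classes touch.
boundaries = [(low - gap / 2, high + gap / 2) for low, high in classes]

# Midpoint = (lower class limit + upper class limit) / 2.
midpoints = [(low + high) / 2 for low, high in classes]

# Width = difference between two consecutive lower class limits.
class_width = classes[1][0] - classes[0][0]

print(boundaries)   # [(49.5, 69.5), (69.5, 89.5), (89.5, 109.5)]
print(midpoints)    # [59.5, 79.5, 99.5]
print(class_width)  # 20
```

Note that the width is 20, not 69 - 50 = 19; subtracting the limits of a single class is a common mistake.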
How is 'relative frequency' calculated for a class?
Relative frequency is calculated by dividing the frequency for that class by the sum of all frequencies; it is often expressed as a percentage.
What is a 'Cumulative Frequency Distribution'?
A cumulative frequency distribution is a variation in which the frequency for each class is the sum of the frequencies for that class and all previous classes, showing the total number of values below a certain limit.
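Relative and cumulative frequencies both follow directly from the class frequencies. The sketch below uses made-up frequencies (4, 5, 1) for three consecutive classes.

```python
from itertools import accumulate

# Hypothetical class frequencies, in class order.
frequencies = [4, 5, 1]
total = sum(frequencies)

# Relative frequency = class frequency / sum of all frequencies.
relative = [f / total for f in frequencies]

# Cumulative frequency = running total through each class.
cumulative = list(accumulate(frequencies))

print(relative)    # [0.4, 0.5, 0.1]
print(cumulative)  # [4, 9, 10]
```

The relative frequencies sum to 1 (or 100%), and the last cumulative frequency equals the total number of data values.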
What is a 'histogram'?
A histogram is a graph consisting of bars of equal width drawn adjacent to each other (unless there are gaps in the data), where the horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies.
What are the four important uses of a histogram?
A histogram visually displays the shape of the data distribution, shows the location of the center of the data, shows the spread of the data, and identifies outliers.
What characterizes a 'Normal Distribution' when depicted by a histogram?
A normal distribution appears bell-shaped in a histogram, meaning most values are near the center, fewer values are at the extremes, and it is symmetrical.
What characterizes a 'Uniform Distribution' when depicted by a histogram?
A uniform distribution occurs when different possible values appear with roughly the same frequency, resulting in bars of approximately equal height in a histogram.
What does it mean for a distribution of data to be 'skewed'?
A distribution of data is skewed if it is not symmetric and extends to one side more than the other.
Describe a 'Right-Skewed' (Positively Skewed) distribution.
In a right-skewed distribution, the right tail is longer, meaning there are more exceptionally large values (e.g., annual incomes).
Describe a 'Left-Skewed' (Negatively Skewed) distribution.
In a left-skewed distribution, the tail extends to the left, meaning there are more exceptionally small values (e.g., human life span).
What is a 'dotplot'?
A dotplot is a graph of quantitative data in which each data value is plotted as a point (or dot) above a horizontal scale of values.
What is a 'stemplot' (or 'stem-and-leaf plot')?
A stemplot represents quantitative data by separating each value into two parts: a stem (e.g., the leftmost digits) and a leaf (e.g., the rightmost digit).
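A text stemplot can be sketched in a few lines. The two-digit values below are hypothetical; the tens digit is taken as the stem and the units digit as the leaf.

```python
from collections import defaultdict

# Hypothetical two-digit data values.
values = [52, 58, 61, 67, 70, 74, 75, 81, 88, 93]

# Split each value into stem (tens digit) and leaf (units digit).
stems = defaultdict(list)
for v in sorted(values):
    stem, leaf = divmod(v, 10)
    stems[stem].append(leaf)

# Print one row per stem with its leaves in increasing order.
for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

Unlike a histogram, the stemplot keeps every original data value visible while still showing the shape of the distribution.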
What is a 'time-series graph'?
A time-series graph is a graph of time-series data, which are quantitative data that have been collected at different times, such as monthly or yearly, revealing trends over time.
What is a 'bar graph'?
A bar graph uses bars of equal width to show frequencies of categories of categorical (or qualitative) data, making it easy to compare different categories.
What is a 'pie chart'?
A pie chart depicts categorical data as slices of a circle, where the size of each slice is proportional to the frequency count for the category.
What is a 'frequency polygon'?
A frequency polygon uses line segments connected to points located directly above class midpoint values, similar to a histogram but using lines instead of bars.
What is a 'measure of center'?
A measure of center is a value at the center or middle of a data set.
Define 'mean' (or average) in statistics.
The mean is a measure of center found by adding all of the data values in a set and dividing the total by the number of data values.
Is the mean resistant to outliers?
No, the mean is not resistant to outliers; one extreme value can noticeably change its value.
Define 'median' in statistics.
The median is the middle value in a data set when the original data values are arranged in order of increasing (or decreasing) magnitude. If the number of values is even, the median is the mean of the two middle values.
Is the median resistant to outliers?
Yes, the median is resistant to outliers; it is not significantly affected by a few extreme values.
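The effect of an outlier on the mean and the median can be seen in a small sketch. The salary figures (in thousands) are invented for illustration.

```python
from statistics import mean, median

# Hypothetical salaries in thousands, then the same data plus one extreme value.
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [500]

print(mean(salaries), median(salaries))          # 44.8 45
print(mean(with_outlier), median(with_outlier))  # mean jumps above 120; median is 46.0
```

One extreme value moves the mean from 44.8 to more than 120, while the median only shifts from 45 to 46.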
Define 'mode' in statistics.
The mode is the value that appears with the greatest frequency in a data set.
When is the mode particularly useful?
The mode is particularly useful for qualitative (categorical) data, because it is the only measure of center that can be used with data at the nominal level.
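Because the mode only requires counting, it works on labels as well as numbers. The sketch below uses Python's standard `statistics` module with made-up data.

```python
from statistics import mode, multimode

# Hypothetical categorical data: no mean or median is possible here.
colors = ["red", "blue", "red", "green", "blue", "red"]
print(mode(colors))  # red

# A data set can have more than one mode; multimode returns all of them.
values = [1, 1, 2, 2, 3]
print(multimode(values))  # [1, 2]
```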
How is a 'weighted mean' calculated?
A weighted mean is calculated by summing the products of each data value (x) and its corresponding weight (w), then dividing by the sum of all weights (Σ(w·x) / Σw).
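A grade point average is a common use of the weighted mean. In the sketch below the grade points and credit hours are hypothetical, with credit hours serving as the weights.

```python
# Hypothetical grade points (x) and credit hours used as weights (w).
grade_points = [4, 3, 2]
credits = [3, 3, 4]

# Weighted mean = sum(w * x) / sum(w).
weighted_mean = sum(w * x for w, x in zip(credits, grade_points)) / sum(credits)

print(weighted_mean)  # (12 + 9 + 8) / 10 = 2.9
```

The unweighted mean of the grade points would be 3.0; the weighted mean is lower because the lowest grade carries the most credits.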
What is 'variance' in statistics?
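Variance is a measure of variation based on the squared deviations of the data values from the mean; it is equal to the square of the standard deviation (s² for a sample, σ² for a population). The sample variance divides the sum of squared deviations by n − 1, while the population variance divides by N.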
What is 'standard deviation'?
Standard deviation is a measure of variation that describes the typical distance of any point from the mean; it is the square root of the variance (s for a sample, σ for a population).
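The link between variance and standard deviation, and the difference between the sample and population versions, can be checked with Python's standard `statistics` module. The data values are made up for illustration.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Sum of squared deviations from the mean.
m = mean(data)
squared_deviations = sum((x - m) ** 2 for x in data)  # 32

# Population versions divide by N; sample versions divide by n - 1.
print(pvariance(data), pstdev(data))  # population variance 4, standard deviation 2.0
print(variance(data), stdev(data))    # sample variance 32/7, and its square root

# The standard deviation is the square root of the variance.
print(sqrt(squared_deviations / len(data)))  # 2.0
```

Treating the same values as a sample gives a slightly larger variance than treating them as a population, because the divisor is n − 1 instead of N.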
What is the 'Coefficient of Variation (CV)' and when is it useful?
The Coefficient of Variation (CV) describes the standard deviation relative to the mean, expressed as a percentage. It is useful for comparing variation in data sets with different units or vastly different means.
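As a minimal sketch, the CV can be computed as the sample standard deviation divided by the mean, times 100. The function name `coefficient_of_variation` and the height and weight values are hypothetical, and the data are treated as samples.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, as a percentage."""
    return stdev(values) / mean(values) * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights: {cv_heights:.2f}%")  # about 4.65%
print(f"weights: {cv_weights:.2f}%")  # about 22.59%
```

The standard deviations (in cm and kg) cannot be compared directly, but the unit-free CV values show that the weights vary much more relative to their mean than the heights do.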
What is an 'event' in probability?
An event is any collection of results or outcomes of a procedure.
What is a 'simple event' in probability?
A simple event is an outcome or an event that cannot be broken down into simpler parts; it is a single result (e.g., getting a 'head' when tossing a coin).
What is 'sample space' in probability?
Sample space consists of all possible simple events, meaning all outcomes that cannot be broken down any further.
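The three terms can be illustrated by enumerating outcomes for a simple procedure. The sketch assumes the procedure is tossing a coin twice, which is an example chosen here and not taken from the cards.

```python
from itertools import product

# Sample space for tossing a coin twice: every simple event, listed once.
sample_space = list(product("HT", repeat=2))
print(sample_space)  # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]

# An event is any collection of outcomes, e.g. "exactly one head".
exactly_one_head = [outcome for outcome in sample_space if outcome.count("H") == 1]
print(exactly_one_head)  # [('H', 'T'), ('T', 'H')]
```

Here "exactly one head" is an event but not a simple event, because it breaks down into the two simple events HT and TH.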