Measures of Central Tendency
Values that are typical or representative of a data set; they tend to lie near the center of a distribution when data are arranged by magnitude. The three most common are the mean, median, and mode.
Arithmetic Mean
The most commonly used measure of central tendency; calculated by adding all scores in a distribution and dividing by the total number of scores. Formula: Sum of all scores ÷ number of scores (N).
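Worked example (a minimal sketch; the scores are made-up values used only to illustrate the formula above):

```python
from statistics import mean

# Hypothetical scores, chosen only for illustration.
scores = [70, 80, 85, 90, 100]

# Sum of all scores divided by the number of scores (N).
n = len(scores)
arithmetic_mean = sum(scores) / n

print(arithmetic_mean)  # 85.0
print(mean(scores))     # standard-library equivalent of the same calculation
```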
Weighted Mean
A mean in which certain scores are assigned a weight (W) based on their importance. Formula: Sum of (each score × its weight) ÷ Sum of all weights.
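Worked example (a sketch under assumed values; the function name `weighted_mean` and the scores and weights are hypothetical):

```python
def weighted_mean(scores, weights):
    """Sum of (score x weight) divided by the sum of the weights."""
    if len(scores) != len(weights):
        raise ValueError("each score needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Hypothetical scores with weights 3, 2, and 1.
scores = [80, 90, 70]
weights = [3, 2, 1]
result = weighted_mean(scores, weights)  # (240 + 180 + 70) / 6
print(round(result, 2))
```

With equal weights the result reduces to the ordinary arithmetic mean.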
Mean from Grouped Data
When data are grouped into class intervals, the midpoint of each interval represents all scores in that interval; the mean is computed by multiplying each midpoint by its frequency, summing those products, and dividing by total frequency.
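Worked example (a sketch; the class intervals and frequencies are invented for illustration, and `grouped_mean` is a hypothetical helper name):

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(60, 64, 2), (65, 69, 5), (70, 74, 3)]


def grouped_mean(classes):
    """Multiply each midpoint by its frequency, sum, divide by total frequency."""
    total_frequency = sum(f for _, _, f in classes)
    if total_frequency == 0:
        raise ValueError("total frequency must be greater than zero")
    weighted_sum = sum(((low + high) / 2) * f for low, high, f in classes)
    return weighted_sum / total_frequency


# Midpoints 62, 67, 72 give (124 + 335 + 216) / 10.
print(grouped_mean(classes))  # 67.5
```

Because every score is treated as equal to its interval's midpoint, the result is an approximation of the mean of the raw scores.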
Ungrouped Data
Individual information on each member of a population or sample; raw scores are listed separately.
Grouped Data
Data in which individual scores are aggregated within class intervals in a frequency distribution. All values within an interval are treated as equal to the interval's midpoint.
Median
The middle score of a distribution when scores are arranged in order of magnitude. If N is odd, it is the middle score; if N is even, it is the average of the two middle scores. Not influenced by outliers.
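Worked example (a minimal sketch of the odd/even rule above; `median_score` is a hypothetical name and the scores are made up):

```python
def median_score(scores):
    """Middle score if N is odd; average of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one score is required")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_score([3, 1, 2]))            # 2
print(median_score([1, 2, 3, 4]))         # 2.5
print(median_score([1, 2, 3, 4, 1000]))   # 3, unchanged by the extreme score
```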
Mode
The score that appears most frequently in a distribution. In a skewed distribution, the mode shifts toward the high point of the curve (away from the tail).
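Worked example (a sketch; `modes` is a hypothetical helper that returns every score tied for the highest frequency, so it also covers bimodal data):

```python
from collections import Counter


def modes(scores):
    """Return all scores that share the highest frequency, in ascending order."""
    if not scores:
        raise ValueError("at least one score is required")
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```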
Unimodal Distribution
A frequency distribution with only one mode (one peak).
Bilaterally Symmetrical Curve
A curve that, when folded vertically down the center, is identical on both sides. In this type of distribution, the mean, median, and mode are all equal and lie at the center.
Normal Bell-Shaped Curve
A bilaterally symmetrical, unimodal curve considered the mathematical ideal. Also called a normal distribution or normal curve. The distribution is fully defined by its mean and standard deviation.
Skewed Curve
A curve in which scores are concentrated at either the high or low end, producing a tail. The direction of skewness refers to the location of the tail, not the peak.
Positive Skewness
A distribution skewed to the RIGHT — the tail extends toward the right. The mean is pulled to the right (raised). The mode is to the left, and the median falls between the mean and mode.
Negative Skewness
A distribution skewed to the LEFT — the tail extends toward the left. The mean is pulled to the left (lowered). The mode is to the right, and the median falls between the mean and mode.
Effect of Skewness on the Mean
The mean is shifted in the direction of the tail: right for positive skewness, left for negative skewness. It is the measure most affected by outliers.
Effect of Skewness on the Mode
The mode shifts toward the peak of the curve, AWAY from the tail — right with negative skewness, left with positive skewness.
Effect of Skewness on the Median
The median typically lies between the mean and the mode in a skewed distribution. It is the preferred measure of central tendency for skewed data.
Outlier
An extreme score that is far from the rest of the data. Outliers inflate (or deflate) the mean and the range, making both less representative.
Trimmed Mean
A mean calculated after excluding extreme scores (outliers); should be reported with a notation explaining the exclusion.
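Worked example (a sketch only: it assumes the "extreme scores" are the k lowest and k highest values, which is one common convention and not something the definition above specifies; `trimmed_mean` and the data are hypothetical):

```python
def trimmed_mean(scores, k):
    """Mean after dropping the k lowest and k highest scores (assumed convention)."""
    if k < 0 or 2 * k >= len(scores):
        raise ValueError("k must leave at least one score")
    ordered = sorted(scores)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical scores with one low and one high outlier.
scores = [2, 50, 52, 54, 200]
print(sum(scores) / len(scores))  # 71.6, pulled upward by the score of 200
print(trimmed_mean(scores, 1))    # 52.0, after excluding 2 and 200
```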
J-Shaped Curve
A frequency distribution curve shaped like the letter J; scores are concentrated heavily at one end, declining toward the other.
Reverse J-Shaped Curve
A frequency distribution curve shaped like a reverse J; the mirror image of a J-shaped curve, with scores concentrated at the opposite end.
Bimodal Curve
A frequency distribution curve with two peaks (two modes), indicating two frequently occurring scores or groups in the data.
Multimodal Curve
A frequency distribution curve with more than two peaks, indicating more than two frequently occurring scores.
Measures of Dispersion (Variability)
Statistical measures that describe the spread or variation of scores in a data set. The three main measures are range, variance, and standard deviation. Also called measures of spread or variation.
Range
The simplest measure of dispersion; calculated as the highest score minus the lowest score. Only uses two values, so it is influenced by outliers and is not a reliable standalone measure of spread.
Range Formula
Range = Highest value − Lowest value
Variance
A measure of how spread out scores are around the mean; calculated as the average of the squared deviations from the mean. Population variance uses N; sample variance uses N−1 in the denominator.
Standard Deviation
The square root of the variance; the most widely used measure of spread. It is expressed in the same units as the original data. A small SD means data are clustered near the mean; a large SD means data are more spread out.
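Worked example (a minimal sketch with made-up scores, showing the N and N−1 denominators and the square-root step):

```python
from statistics import pvariance, variance

# Hypothetical scores; their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in scores)

population_variance = sum_sq_dev / n      # divides by N
sample_variance = sum_sq_dev / (n - 1)    # divides by N - 1
population_sd = population_variance ** 0.5

print(population_variance)  # 4.0
print(population_sd)        # 2.0, in the same units as the scores
print(pvariance(scores), variance(scores))  # standard-library equivalents
```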
Coefficient of Variation (CV)
The ratio of the standard deviation to the mean, expressed as a percentage. Used to compare variability between two data sets with different units of measurement. Formula: (Standard Deviation ÷ Mean) × 100.
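Worked example (a sketch; the means and standard deviations are invented, and `coefficient_of_variation` is a hypothetical helper name):

```python
def coefficient_of_variation(sd, mean):
    """(Standard deviation / mean) x 100, expressed as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


# Hypothetical summary statistics in different units.
height_cv = coefficient_of_variation(sd=8.5, mean=170)  # centimetres
weight_cv = coefficient_of_variation(sd=7, mean=70)     # kilograms

# About 5% versus about 10%: the weights vary more relative to their mean.
print(round(height_cv, 1), round(weight_cv, 1))
```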
Homogeneous Data
Data with low variability; scores are similar to one another.
Heterogeneous Data
Data with high variability; scores are dissimilar with considerable spread.
Empirical Rule
In a NORMAL (bell-shaped) distribution: approximately 68% of scores fall within ±1 SD of the mean; approximately 95% fall within ±2 SD; and approximately 99.7% fall within ±3 SD. Applies ONLY to normal distributions.
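Numerical check (a sketch: for a normal distribution, the share of scores within ±k SD of the mean equals erf(k / √2); the helper name `share_within` is hypothetical):

```python
import math


def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
# Prints roughly 68.27, 95.45, and 99.73 percent.
```

Figures quoted from printed tables can differ in the last decimal place because the table entries are rounded.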
Chebyshev's Theorem
For ANY distribution shape (not just normal), at least 1 − (1/k²) of data values lie within k standard deviations of the mean, where k > 1. Example: k = 2 → at least 75% of values; k = 3 → at least 89% of values.
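Worked example (a minimal sketch of the bound above; `chebyshev_lower_bound` is a hypothetical helper name):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k SDs of the mean, for k > 1."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 3))  # 0.889
```

These are guaranteed minimums for any shape; a normal distribution concentrates far more of its values within the same limits.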
Variance for Grouped Data (Population)
Formula: σ² = [Sum of f(x − μ)²] ÷ N, where f = frequency, x = midpoint, μ = mean, N = total frequency.
Variance for Grouped Data (Sample)
Formula: s² = [Sum of f(x − x̄)²] ÷ (N − 1)
Shortcut Formula for Variance (Grouped Data)
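Population: σ² = [Sum of (m²f) ÷ N] − μ². Sample: s² = [Sum of (m²f) − (Sum of mf)² ÷ N] ÷ (N − 1), where m = class midpoint (written as x in the definitional formulas), f = frequency, and N = total frequency. NOTE: values from grouped data are approximations only.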
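Worked example (a sketch with invented midpoints and frequencies, checking that the definitional and shortcut formulas for grouped data give the same result):

```python
# Hypothetical grouped data: class midpoints and their frequencies.
midpoints = [62, 67, 72]
frequencies = [2, 5, 3]

n = sum(frequencies)                                              # total frequency N
sum_mf = sum(m * f for m, f in zip(midpoints, frequencies))       # Sum of mf
sum_m2f = sum(m * m * f for m, f in zip(midpoints, frequencies))  # Sum of m^2 f
mean = sum_mf / n

# Definitional formulas: frequency-weighted squared deviations from the mean.
sum_sq_dev = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))
population_definitional = sum_sq_dev / n
sample_definitional = sum_sq_dev / (n - 1)

# Shortcut formulas.
population_shortcut = sum_m2f / n - mean ** 2
sample_shortcut = (sum_m2f - sum_mf ** 2 / n) / (n - 1)

print(population_definitional, population_shortcut)  # both 12.25
print(round(sample_definitional, 3), round(sample_shortcut, 3))
```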
Raw Data
Data collected and recorded in the sequence obtained, with no order, ranking, or grouping applied.
Frequency Distribution
A table that organizes data by listing each class or category along with the number of times (frequency) it appears. Makes large data sets more meaningful.
Frequency
The number of times a specific score or value appears in a data set.
Relative Frequency
The frequency of a class divided by the total number of scores in the distribution. All relative frequencies in a distribution must sum to 1.00.
Frequency Percentage
Relative frequency multiplied by 100. Expresses each class frequency as a percent of the total. All frequency percentages must sum to 100%.
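Worked example (a sketch; the blood-type observations are invented for illustration):

```python
from collections import Counter

# Hypothetical categorical observations.
blood_types = ["A", "B", "A", "O", "A", "B", "AB", "O", "A", "O"]

counts = Counter(blood_types)   # frequency of each category
total = sum(counts.values())    # total number of observations

relative = {category: f / total for category, f in counts.items()}
percent = {category: r * 100 for category, r in relative.items()}

for category in counts:
    print(category, counts[category], relative[category], round(percent[category], 1))

# Relative frequencies sum to 1.00 and percentages to 100 (up to rounding).
print(round(sum(relative.values()), 2), round(sum(percent.values()), 1))
```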
Ungrouped Frequency Distribution
A listing of raw data or individual scores arranged from high to low (or low to high), with each score shown individually.
Grouped Frequency Distribution
A frequency distribution in which two or more different scores are combined into class intervals to summarize large data sets. Detail is lost but patterns become clearer.
Qualitative Data (Frequency Distribution)
A frequency distribution for non-numeric (categorical) data; lists all categories and the count of elements in each.
Quantitative Data (Frequency Distribution)
A frequency distribution for numeric data; scores are grouped into class intervals with associated frequencies.
Basic Table Format
A table organized with a stub (series) heading in the first column and column/category headings across the top; data cells contain the entries; totals and footnotes are added as needed.
Table Number
An optional label (e.g., Table 1 or Table 11.1) used to help readers locate and reference a specific table.
Table Title
A descriptive label for a table that clearly identifies what the data represent, the source of the data, and the time period covered. Abbreviations should be avoided or explained in footnotes.
Stub (Series) Heading
The heading of the first column of a table; indicates how the data are categorized (e.g., months, age ranges, score limits).
Column/Category Headings
Subheadings across the top of a table that describe the data in each column; should be clear and free of unexplained abbreviations.
Table Footnote
An explanatory note below a table used when codes, abbreviations, acronyms, or symbols cannot be avoided in the table's headings or cells.
Cumulative Frequency (cf)
The running total of frequencies, built by adding each class frequency to the sum of all previous class frequencies (from the bottom up). The entry for the top class equals the total N.
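Worked example (a sketch with invented classes; the list starts at the lowest class, which corresponds to working from the bottom up in a table that lists the highest class first):

```python
from itertools import accumulate

# Hypothetical classes, lowest class first, with their frequencies.
classes = ["60-64", "65-69", "70-74"]
frequencies = [2, 5, 3]

# Running total: each entry adds the class frequency to all lower classes.
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(classes, frequencies, cumulative):
    print(label, f, cf)

print(cumulative)  # [2, 7, 10]; the last entry equals the total N
```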
Class
A category into which a score can be placed in a frequency distribution; can be a single score or a range of scores (class interval).
Class Interval
A grouping of scores within defined lower and upper limits in a grouped frequency distribution; each score falls into only one interval.
Class Width
The size (span) of a class interval; calculated by dividing the range by the desired number of classes. Equal class widths are preferred for all classes.
Class Limits
The lower and upper boundary values stated for each class interval (e.g., 60–64). Each score can fall into only one class.
Class Boundaries
Decimal values representing the true limits of class intervals; for scores recorded as whole numbers, they are 0.5 below the lower class limit and 0.5 above the upper class limit (e.g., for 60–64, boundaries are 59.5–64.5). Also called real, actual, or true class limits.
Class Midpoint
The middle value of a class interval; calculated as (lower limit + upper limit) ÷ 2. Used to represent all scores in that interval when computing grouped statistics.
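Worked example (a sketch covering class boundaries and midpoints; the `unit` parameter is an assumption standing for the precision at which scores are recorded, 1 for whole numbers, and both helper names are hypothetical):

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Extend each class limit by half a recording unit (0.5 for whole numbers)."""
    half = unit / 2
    return lower_limit - half, upper_limit + half


def class_midpoint(lower_limit, upper_limit):
    """(Lower limit + upper limit) / 2."""
    return (lower_limit + upper_limit) / 2


lower_boundary, upper_boundary = class_boundaries(60, 64)
print(lower_boundary, upper_boundary)   # 59.5 64.5
print(upper_boundary - lower_boundary)  # 5.0, the class width
print(class_midpoint(60, 64))           # 62.0
```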
Number of Classes Rule
A grouped frequency distribution should generally have between 5 and 20 classes (15 is often recommended as a practical target), depending on the size and nature of the data.
Percentile (Centile)
One of 100 equal divisions of a distribution. A score at the Pth percentile means P% of scores fall below that value. Divides the distribution into 100 equal segments.
Decile
One of 10 equal divisions of a distribution (each decile = 10 percentile points). The first decile = 10th percentile.
Quartile
One of 4 equal divisions of a distribution. Q1 = 25th percentile, Q2 = 50th percentile (median), Q3 = 75th percentile.
Calculating a Percentile — Ungrouped
Multiply the desired percentile (P) by the sample size (n) and divide by 100. The result gives the rank position of the percentile score in the ordered distribution.
Calculating a Percentile — Grouped
P = lower class boundary + [(desired rank position − cumulative frequency below interval) ÷ frequency of interval] × class width.
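Worked example (a sketch of the interpolation formula above; it assumes the desired rank position is P × N ÷ 100, as in the ungrouped rule, and uses invented class boundaries; `grouped_percentile` is a hypothetical helper name):

```python
def grouped_percentile(p, classes):
    """Interpolate the p-th percentile from (lower boundary, upper boundary, frequency) rows.

    Rows must be listed from the lowest class to the highest.
    """
    n = sum(f for _, _, f in classes)
    rank = p * n / 100  # desired rank position (assumed convention)
    cf_below = 0        # cumulative frequency below the current interval
    for lower, upper, f in classes:
        if f > 0 and cf_below + f >= rank:
            return lower + (rank - cf_below) * (upper - lower) / f
        cf_below += f
    raise ValueError("p must be between 0 and 100")


# Hypothetical grouped data with class width 5.
classes = [(59.5, 64.5, 2), (64.5, 69.5, 5), (69.5, 74.5, 3)]
print(grouped_percentile(50, classes))  # 67.5
print(grouped_percentile(25, classes))  # 65.0
```

For the 50th percentile the rank is 5, the interval 64.5 to 69.5 contains it, and 64.5 + (5 − 2) × 5 ÷ 5 = 67.5.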
Percentile Rank of a Score
Indicates where a specific score falls relative to others. Formula: (Number of values below the score ÷ total number of values) × 100. Result is the percentile rank.
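Worked example (a minimal sketch of the formula above with made-up scores; `percentile_rank` is a hypothetical helper name):

```python
def percentile_rank(score, scores):
    """(Number of values below the score / total number of values) x 100."""
    if not scores:
        raise ValueError("at least one score is required")
    below = sum(1 for s in scores if s < score)
    return below * 100 / len(scores)


# Hypothetical distribution of ten scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(85, scores))  # 60.0, since six of ten scores fall below 85
```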
Weakness of Percentiles
Equal percentile intervals do NOT represent equal score differences. Scores tend to cluster near the middle, so score ranges between percentile levels may be unequal.
Qualitative Data
Data that cannot assume a numerical value; classified into two or more non-numeric categories (e.g., gender, blood type, stress level). Also called categorical data.
Quantitative Data
Data that can be measured numerically; includes both discrete and continuous data.
Discrete Data
Quantitative data whose values are countable and finite with no intermediate values (no fractions or decimals). Example: number of patients admitted.
Continuous Data
Quantitative data that can assume any numerical value over an interval. Example: body temperature, blood pressure.
Nominal Data
Qualitative data where observations are organized into categories with no recognized order. Example: eye color, type of surgery.
Ordinal Data
Qualitative data ordered in a meaningful way (ranked), but the intervals between ranks are not necessarily equal. Example: severity scale from minor to fatal; strongly agree to strongly disagree.
Ranked Data
A type of ordinal data where observations are arranged according to magnitude. Example: the 10 leading causes of death listed in order.
Chart
A graphic that illustrates data using only one quantitative coordinate; most appropriate for comparing discrete categories. Common types: bar, column, line, and pie charts.
Graph
A method of relating one variable to another quantitative variable (usually frequency). Common types: histogram and frequency polygon. Used for continuous quantitative data.
X-Axis (Horizontal Axis)
The horizontal reference line on a chart or graph; typically represents the primary (independent) variable. Lowest values are on the left; highest values on the right.
Y-Axis (Vertical Axis)
The vertical reference line on a chart or graph; represents the measurement variable (frequency, cost, count, etc.). Lowest values are at the bottom; should begin at zero.
Interrupted Vertical Scale
A broken line drawn on the Y-axis to indicate that the scale does not start at zero; prevents misleading the reader when a non-zero baseline is used.
Bar Chart
A chart with horizontal bars where the length of each bar represents a quantitative value. Best for categorical data when labels are too long to fit on a horizontal axis.
Column Chart
A chart with vertical bars where the height of each bar represents a quantitative value. Effective for showing increases or decreases across categories.
Single Bar/Column Chart
The simplest bar or column chart; displays one variable across categories. Space is left between bars to show that the data are not continuous.
Comparison (Multiple) Bar/Column Chart
A chart that plots two or more data series side by side on the same axes to allow direct visual comparison. Requires a legend; best limited to three series when five or more categories are shown.
Stack Bar/Column Chart
A chart in which data series are stacked on top of one another within each category bar; emphasizes the total value and the contribution of each series to that total.
Percent Stack Bar/Column Chart
A variant of the stacked bar chart in which all bars are standardized to 100%; shows the relative percentage contribution of each series within every category rather than absolute values.
Line Chart
A chart that illustrates patterns or trends in quantitative data over time. The preferred chart type for plotting time-series data. Can display multiple lines for comparison.
Multiple Comparison Line Chart
A line chart with two or more data series plotted on the same axes using different colors or line styles; requires a legend.
Y-Axis Scale Rule
The vertical scale of any chart or graph should always begin at zero; if it does not, a broken-line interruption must be used. Starting a non-zero scale without indication distorts the visual impression of the data.
Pictograph / Pictogram
A graphic display that uses representative pictures or symbols to depict quantitative data. Eye-catching but less precise than bar or line charts.
Histogram
A graph of a frequency distribution for quantitative continuous data. Bars are contiguous (no gaps), with class boundaries on the X-axis and frequency (or relative frequency) on the Y-axis. Both dimensions of a bar carry information: height reflects the class frequency and width reflects the class interval.
Histogram vs. Bar Chart
A histogram displays continuous quantitative data (bars touch); a bar chart displays discrete or categorical data (bars have spaces between them).
Frequency Polygon
A line graph created by connecting the midpoints of histogram bars; used to display continuous quantitative data. Preferred over histograms when comparing two or more distributions on the same graph.
Guidelines for Constructing a Bar Chart
Arrange categories in natural order; use spaces between bars; limit series to three or fewer when five+ categories exist; use colors or patterns to distinguish series; label all axes clearly.
Guidelines for Constructing a Line Chart
Y-axis starts at zero; use an x:y ratio of approximately 5:3; label axes with variable names and units; distinguish multiple lines by color or style; include a legend.
Guidelines for Constructing a Histogram
Vertical scale starts at zero; use equal class interval sizes; the height of the Y-axis should be approximately three-fifths to three-fourths of the length of the X-axis; class boundaries form the X-axis base; midpoints are centered below the bars.
Chart vs. Table -- When to Use
Charts/graphs convey information more quickly and visually but lack detail. Tables display large amounts of precise text-based quantitative data but are slower to read. Choice depends on audience and purpose.
Maximum Data Elements in a Chart/Graph
No more than five data elements or series should be displayed in a single chart or graph; elements with multiple subcategories should have fewer than five.
Research
A scholarly, systematic approach to scientific investigation; used to discover solutions to problems, verify knowledge, or establish new knowledge. Carried out in all fields of study.
Basic Research
Research that seeks to answer the question "why"; focused on expanding fundamental knowledge rather than solving a specific practical problem.
Applied Research
Research conducted to discover solutions to a specific problem or to improve a practical process or outcome.
Quantitative Research Approach
Uses numerical data that can be statistically analyzed; results are expressed as numbers, percentages, or rates. Example: counting coding errors.