Statistics

Last updated 11:34 AM on 5/9/26

51 Terms

Statistics

The science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.

Variable

A characteristic or attribute that can assume different values.

Data

Facts or information collected for reference or analysis.

Descriptive Statistics

Consists of the collection, organization, summarization, and presentation of data.

Inferential Statistics

Consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.

Population

Consists of all subjects (human or otherwise) that are being studied.

Sample

A group of subjects selected from a population.

Statistic

A characteristic or measure obtained by using the data values from a sample.

Parameter

A characteristic or measure obtained by using all the data values from a specific population.

Numerical Data (Quantitative Data)

Data whose values are numbers or quantities (e.g. 3.5 years, 1.2 kg, 4 ms).

Categorical Data (Qualitative Data)

Data whose values are not numerical in nature but relate to categories (e.g. sex, eye colour, type of policy).

Discrete Data

Numerical data that can only take particular distinct numerical values.

Continuous Data

Numerical data that can take any numerical value in a specified range, e.g. (0,1) or (-∞, ∞).

Attribute (Dichotomous) Data

Categorical data values that have only two categories, e.g. claim/no claim, dead/alive, or male/female.

Nominal Data

Categorical data values that cannot be ordered in any natural way, e.g. type of insurance policy or nature of claim.

Ordinal Data

Categorical data values that can be ordered in a natural way, e.g. exam grades (A, B, C,…) or level of agreement.

Frequency Distribution

A table showing the number of times that each value in a data set has been observed; suitable for categorical or discrete numerical data.

Cumulative Frequency

The sum of all the frequencies up to and including the current point.
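
A small sketch of the running total, using a made-up frequency table (the values and counts are illustrative, not taken from any data set in these cards):

```python
from itertools import accumulate

# Hypothetical frequency table: number of claims -> number of policies.
values = [0, 1, 2, 3]
frequencies = [5, 8, 4, 3]

# Each entry is the sum of all frequencies up to and including that value.
cumulative = list(accumulate(frequencies))

for value, freq, cum in zip(values, frequencies, cumulative):
    print(f"value={value}  frequency={freq}  cumulative={cum}")
```

The final cumulative frequency always equals the total number of observations.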

Bar Chart

A diagram used for discrete numerical or categorical data where a bar is drawn for each value to show its frequency.

Histogram

A diagram used for continuous data with no spaces between bars; the vertical axis shows frequency density, not frequency.

Frequency Density

Frequency divided by class width; used as the height of bars in a histogram so that bar area equals frequency.
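
To illustrate why bar area equals frequency, here is a minimal sketch with hypothetical classes of unequal width (the class boundaries and counts are invented for the example):

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]

def frequency_density(lower, upper, frequency):
    """Frequency divided by class width."""
    return frequency / (upper - lower)

densities = [frequency_density(lower, upper, freq) for lower, upper, freq in classes]

# Bar area = height (density) x width, which recovers the frequency.
areas = [d * (upper - lower) for d, (lower, upper, _) in zip(densities, classes)]

for (lower, upper, freq), d, area in zip(classes, densities, areas):
    print(f"[{lower}, {upper}): frequency={freq}, density={d}, area={area}")
```

The widest class has the lowest bar here even though it does not have the lowest frequency, which is the reason a histogram plots density rather than raw counts.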

Cumulative Frequency Curve (Ogive)

A graph constructed by plotting cumulative frequencies against the upper limit of each class and joining the points with a smooth curve.

Skewness

A measure of how asymmetrical a data set is; the greater the asymmetry, the greater the magnitude of the skewness.

Positively Skewed Distribution

A distribution with a longer tail to the right; typically mode < median < mean.

Negatively Skewed Distribution

A distribution with a longer tail to the left; typically mode > median > mean.

Symmetric Distribution

A distribution whose two halves mirror each other about the centre; if it has a single peak, mode = median = mean.
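
The ordering of mode, median, and mean under skew can be checked on a small hypothetical sample. This is only an illustration with invented numbers; the ordering is a common pattern, not a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical sample with a long right tail (one large value).
right_skewed = [1, 2, 2, 3, 4, 5, 11]

# Mirroring the data about zero gives a long left tail instead.
left_skewed = [-x for x in right_skewed]

for label, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(label, "mode:", mode(data), "median:", median(data), "mean:", mean(data))
```

For the right-skewed sample the large value pulls the mean above the median, and the mirrored sample reverses the ordering.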

Sample Mode

The value in a data set with the highest frequency; the value that occurs most often.

Modal Group

For grouped data, the class interval with the highest frequency.

Sample Mean Formula

x̄ = (x₁ + x₂ + … + xₙ) / n = (1/n) × Σxᵢ; obtained by summing all observations and dividing by the number of observations.

Mean from Frequency Distribution

x̄ = Σ(xᵢfᵢ) / Σfᵢ; used when data are presented as a frequency table. For grouped data, xᵢ is taken as the midpoint of each class, so the result is an estimate.
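
A minimal sketch checking that the frequency-table formula agrees with the ordinary sample mean. The table is hypothetical, and the names `values` and `frequencies` are chosen for the example only.

```python
# Hypothetical frequency table: value x_i observed f_i times.
values = [0, 1, 2, 3]
frequencies = [5, 8, 4, 3]

n = sum(frequencies)  # total number of observations, n = sum of f_i
mean_from_table = sum(x * f for x, f in zip(values, frequencies)) / n

# Expand the table back into raw observations and take the plain mean.
raw = [x for x, f in zip(values, frequencies) for _ in range(f)]
mean_from_raw = sum(raw) / len(raw)

print(mean_from_table, mean_from_raw)
```

Both routes give the same answer because multiplying by fᵢ is just shorthand for adding xᵢ that many times.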

Sample Median

The value that splits the data set into two equal halves when observations are ordered from smallest to largest.

Position of the Median

For n ordered observations, the median is at position (n+1)/2; if n is even, average the two middle values.

Lower Quartile (Q1)

The point one quarter of the way through the ordered data set; position = (n+1)/4.

Upper Quartile (Q3)

The point three quarters of the way through the ordered data set; position = 3(n+1)/4.

Interquartile Range (IQR)

A measure of spread equal to Q3 − Q1; not affected by extreme values.
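
The position rules (n+1)/4, (n+1)/2, and 3(n+1)/4 can be sketched as follows. The helper `value_at_position` is hypothetical, and interpolating linearly when the position is fractional is an assumption that extends the "average the two middle values" rule for the median; other quartile conventions exist. The sketch assumes at least three observations so that every position falls inside the data.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    whole = int(position)            # whole part of the position
    fraction = position - whole      # how far towards the next value
    if fraction == 0:
        return ordered[whole - 1]
    below, above = ordered[whole - 1], ordered[whole]
    return below + fraction * (above - below)

# Hypothetical sample of n = 8 observations.
data = sorted([7, 3, 9, 4, 12, 5, 8, 10])
n = len(data)

q1 = value_at_position(data, (n + 1) / 4)          # position 2.25
sample_median = value_at_position(data, (n + 1) / 2)  # position 4.5
q3 = value_at_position(data, 3 * (n + 1) / 4)      # position 6.75
iqr = q3 - q1

print(q1, sample_median, q3, iqr)
```

With these assumptions the results match Python's `statistics.quantiles(data, n=4)`, whose default "exclusive" method uses the same (n+1)-based positions.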

Range

The difference between the largest and smallest values in a data set; easy to calculate but affected by extreme values.

Sample Variance Formula

s² = (1/(n−1)) × Σ(xᵢ − x̄)²; measures the average squared deviation from the mean.

Alternative Sample Variance Formula

s² = (1/(n−1)) × (Σxᵢ² − nx̄²); computationally more convenient than the deviation formula.

Sample Standard Deviation

s = √[(1/(n−1)) × Σ(xᵢ − x̄)²]; has the same units as the data values.
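
A short sketch, on invented data, showing that the deviation formula and the alternative formula give the same sample variance, and that the standard deviation is its square root:

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical sample.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
n = len(data)
x_bar = sum(data) / n

# Deviation form: sum of squared deviations from the mean, divided by n - 1.
s2_deviation = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Alternative form: (sum of squares - n * mean squared), divided by n - 1.
s2_alternative = (sum(x * x for x in data) - n * x_bar ** 2) / (n - 1)

s = sqrt(s2_deviation)

print(s2_deviation, s2_alternative, s)
print(variance(data), stdev(data))  # standard-library cross-check
```

The alternative form needs only Σxᵢ and Σxᵢ², which is why it is convenient by hand, though in floating-point arithmetic it can lose precision when the mean is large relative to the spread.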

Variance from Frequency Distribution

s² = (1/(n−1)) × (Σxᵢ²fᵢ − nx̄²), where n = Σfᵢ; used when data is in a frequency table.
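
As a check on the frequency-table formula, the sketch below computes the variance from a hypothetical table and compares it with the variance of the expanded raw data. Here n is the total frequency Σfᵢ, not the number of rows in the table.

```python
from statistics import variance

# Hypothetical frequency table: value x_i observed f_i times.
values = [0, 1, 2, 3]
frequencies = [5, 8, 4, 3]

n = sum(frequencies)
x_bar = sum(x * f for x, f in zip(values, frequencies)) / n
sum_x2f = sum(x * x * f for x, f in zip(values, frequencies))

s2_from_table = (sum_x2f - n * x_bar ** 2) / (n - 1)

# Cross-check against the raw observations.
raw = [x for x, f in zip(values, frequencies) for _ in range(f)]
s2_from_raw = variance(raw)

print(s2_from_table, s2_from_raw)
```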

Effect of Adding a Constant on Location

If each value is increased by a constant a, the mode, mean, and median are each increased by a; spread measures are unchanged.

Effect of Multiplying by a Constant on Location

If each value is multiplied by b, the mode, mean, and median are each multiplied by b.

Effect of Multiplying by a Constant on Spread

If each value is multiplied by b > 0: range, IQR, and standard deviation are multiplied by b (by |b| if b is negative); variance is multiplied by b²; skewness, when measured as the third central moment, is multiplied by b³.
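
These shift and scale rules can be verified numerically. The sketch uses invented data and constants, and `third_central_moment` is a hypothetical helper that takes skewness to mean the average cubed deviation from the mean, which is an assumption about the definition in use.

```python
from statistics import mean, median, mode, stdev, variance

data = [1, 2, 2, 3, 4, 5, 11]   # hypothetical sample
a, b = 10, 3                    # hypothetical constants, with b > 0

shifted = [x + a for x in data]
scaled = [b * x for x in data]

def third_central_moment(values):
    """Average cubed deviation from the mean (one way to measure skewness)."""
    centre = mean(values)
    return sum((x - centre) ** 3 for x in values) / len(values)

# Adding a constant moves the location measures and leaves spread alone.
print(mean(shifted) - mean(data), median(shifted) - median(data),
      mode(shifted) - mode(data), stdev(shifted) - stdev(data))

# Multiplying by b scales location and spread by b, variance by b squared,
# and the third central moment by b cubed.
print(mean(scaled) / mean(data), stdev(scaled) / stdev(data),
      variance(scaled) / variance(data),
      third_central_moment(scaled) / third_central_moment(data))
```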

Advantage of the Mode

Easy to calculate and not affected by extreme values.

Disadvantage of the Mode

May not be unique, may not exist, focuses on few values, and has no simple algebraic formula for further use.

Advantage of the Mean

Uses all data values and has mathematical properties useful in further calculations.

Disadvantage of the Mean

Can be distorted by extreme values (outliers).

Advantage of the Median

Not affected by extreme values.

Disadvantage of the Median

Does not use all data values and has no simple algebraic formula for further calculations.

Estimating the Median from Grouped Data

Identify the interval where cumulative frequency reaches n/2, then use linear interpolation to estimate the median value.

Estimating Quartiles from Grouped Data

For Q1, find the interval where cumulative frequency reaches n/4; for Q3, where it reaches 3n/4; then use linear interpolation.
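
The interpolation step for grouped data can be sketched as below. The function `grouped_quantile` and the class table are hypothetical, and the sketch assumes observations are spread evenly within each class, which is what linear interpolation implies.

```python
def grouped_quantile(classes, proportion):
    """Estimate a quantile from (lower, upper, frequency) classes in ascending order."""
    n = sum(freq for _, _, freq in classes)
    target = proportion * n          # n/4, n/2, or 3n/4 for the quartiles
    cumulative = 0
    for lower, upper, freq in classes:
        # First class whose cumulative frequency reaches the target.
        if freq > 0 and cumulative + freq >= target:
            share = (target - cumulative) / freq   # fraction of the way into the class
            return lower + share * (upper - lower)
        cumulative += freq
    raise ValueError("proportion must lie between 0 and 1")

# Hypothetical grouped data with n = 25 observations.
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]

q1 = grouped_quantile(classes, 0.25)
grouped_median = grouped_quantile(classes, 0.5)
q3 = grouped_quantile(classes, 0.75)

print(q1, grouped_median, q3)
```

For the median, the target 12.5 falls in the class from 10 to 20, which already has 5 observations below it, so the estimate is 10 + (7.5/12) × 10 = 16.25.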