Lecture Notes on Descriptive Statistics

Description and Tags

Flashcards for vocabulary review of descriptive statistics concepts.

24 Terms

New cards

Descriptive Statistics

Methods for summarising and presenting data so that people can interpret it easily. Includes measures of central tendency, dispersion, and association.

Measures of Central Tendency

Mean, Median, Mode

Mean

Arithmetic average: sum of values divided by number of observations. Used with interval/ratio data. Affected by outliers.

Median

Middle value when data is ordered. Best used when data is skewed or has outliers. Appropriate for ordinal and continuous data.

Mode

Most frequently occurring value. The only measure of central tendency available for nominal data. Can also be used when continuous data is grouped into categories.
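
A minimal sketch of the three measures using Python's standard `statistics` module. The sample values are hypothetical, chosen so that a single outlier (40) pulls the mean well above the median.

```python
import statistics

# Hypothetical sample: six small values plus one outlier (40).
values = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(values)      # sum of values / number of observations
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequently occurring value

print(f"mean={mean}, median={median}, mode={mode}")
# The outlier drags the mean (9) away from the median (4) and mode (3).
```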

Statistical Power of the Mean

The mean uses every data point in its calculation and is the basis of many inferential tests.

Measures of Dispersion

Range, Interquartile Range (IQR), Variance, Standard Deviation (SD), Coefficient of Variation (CV)

Range

Difference between highest and lowest values. Simple but sensitive to outliers.

Interquartile Range (IQR)

Spread of the middle 50% of data. Calculated as Q3 - Q1. Used with box plots.
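
A short sketch of the range and the IQR on a hypothetical sample. It assumes the default "exclusive" quartile method of `statistics.quantiles`; other textbooks and tools use slightly different quartile conventions and may report different values for Q1 and Q3.

```python
import statistics

# Hypothetical ordered sample.
data = [1, 2, 3, 4, 5, 6, 7]

# Range: highest value minus lowest value.
data_range = max(data) - min(data)

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# IQR: spread of the middle 50% of the data.
iqr = q3 - q1

print(f"range={data_range}, Q1={q1}, Q3={q3}, IQR={iqr}")
```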

Variance

Average of squared deviations from the mean. Hard to interpret directly because its unit is the square of the data's unit.

Standard Deviation (SD)

Square root of the variance. Indicates the typical deviation from the mean, in the same units as the data. Used in inferential tests.
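
A minimal sketch of how variance and SD relate, on a hypothetical sample. The first pair treats the data as a whole population (divide by n). The second pair uses the sample versions from the `statistics` module, which divide by n - 1.

```python
import math
import statistics

# Hypothetical sample whose mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)

# Population variance: average of the squared deviations from the mean.
pop_variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance, back in the data's units.
pop_sd = math.sqrt(pop_variance)

# Sample versions divide by n - 1 instead of n.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(pop_variance, pop_sd)
print(sample_variance, sample_sd)
```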

Coefficient of Variation (CV)

(SD ÷ mean) × 100, giving a percentage. Compares relative spread between datasets. Useful for comparing different variables or time periods.
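
A small sketch of why the CV is useful for comparison. The helper name `coefficient_of_variation` and both datasets are hypothetical, and the population SD is assumed here. The second dataset is the first multiplied by 10, so the SDs differ but the relative spread does not.

```python
import statistics

def coefficient_of_variation(values):
    """Return the CV as a percentage: (SD / mean) * 100."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# Hypothetical measurements of the same pattern on two different scales.
small_scale = [10, 12, 14]
large_scale = [100, 120, 140]

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)

# The SDs differ by a factor of 10, but the CVs match.
print(round(cv_small, 1), round(cv_large, 1))
```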

Why Square Deviations in Variance?

To avoid positive and negative values cancelling each other out.
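
The cancellation can be checked directly. In this sketch (hypothetical data with mean 5) the raw deviations sum to zero, while the squared deviations do not.

```python
# Hypothetical sample whose mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)

deviations = [x - mean for x in data]
squared_deviations = [d ** 2 for d in deviations]

print(sum(deviations))          # positives and negatives cancel out
print(sum(squared_deviations))  # squaring makes every term non-negative
```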

Interpreting SD

Smaller SD = data is tightly clustered around the mean. Larger SD = data is more spread out.

Chi-Squared Test

Tests for an association between two categorical variables. Compares observed and expected frequencies. Uses degrees of freedom (df) and critical value tables.
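
A hand-rolled sketch of the calculation for a 2x2 table, with no continuity correction applied. The observed counts are hypothetical. The critical value 3.841 is the standard table entry for df = 1 at the 0.05 significance level.

```python
# Hypothetical 2x2 contingency table of observed counts
# (rows: group A / group B, columns: outcome yes / no).
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_VALUE = 3.841  # table value for df = 1, alpha = 0.05
print(round(chi_squared, 3), df, chi_squared > CRITICAL_VALUE)
```

A statistic above the critical value suggests the two variables are associated at the chosen significance level.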

Pearson’s Correlation Coefficient (r)

Measures strength and direction of linear relationship between two continuous variables. Ranges from -1 (perfect negative) to +1 (perfect positive). 0 = no linear relationship.
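
A minimal sketch of r computed from its definition. The function name `pearson_r` and the data are hypothetical. The third example is a perfect curve with no linear trend, which shows that r = 0 rules out only a linear relationship.

```python
import math

def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by both spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # perfect increasing line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # perfect decreasing line
r_curved = pearson_r(x, [4, 1, 0, 1, 4])     # symmetric curve, no linear trend

print(r_positive, r_negative, r_curved)
```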

Correlation vs. Causation

Correlation shows that two variables are related. Causation means that one variable causes the other, which correlation alone does not prove.

Skewness

Indicates asymmetry in a distribution. Positive skew: tail to the right (typically mean > median). Negative skew: tail to the left (typically mean < median).
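
A sketch of the sign of skewness on hypothetical data. It assumes one common definition, the moment-based coefficient (third central moment divided by the second central moment to the power 1.5). Other definitions exist and give different magnitudes.

```python
import statistics

def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]   # long tail to the right
left_skewed = [-x for x in right_skewed]    # mirror image, tail to the left

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, round(skewness(data), 2),
          statistics.mean(data), statistics.median(data))
```

In the right-skewed sample the mean sits above the median, and in the mirrored sample it sits below.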

Normal Distribution

Bell-shaped and symmetrical. Mean = Median = Mode. Follows the empirical rule (about 68%, 95%, and 99.7% of values within 1, 2, and 3 SDs of the mean).

Empirical Rule

For normally distributed data, about 68% of values fall within ±1 SD of the mean, 95% within ±2 SD, and 99.7% within ±3 SD.
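
The three percentages can be reproduced from the normal distribution itself. This sketch uses `statistics.NormalDist` from the standard library. The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k):.1%}")
```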

Importance of Normality

Many inferential tests assume normally distributed data.

Box Plot Usefulness

Visualises the median and IQR and helps identify outliers. Also helps assess symmetry and skew.
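
A sketch of the numbers a box plot is built from, on hypothetical data. The 1.5 × IQR fence used to flag outliers is a common convention and an assumption here, not something the card specifies. Quartiles use the default method of `statistics.quantiles`, so other tools may place Q1 and Q3 slightly differently.

```python
import statistics

# Hypothetical sample with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 30]

q1, median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}, outliers={outliers}")
```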

When to Use Median Instead of Mean

When the data is skewed or contains outliers.

Best Data for CV

Ratio-level data with a meaningful zero.
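
A sketch of why a meaningful zero matters. The same three hypothetical temperatures are expressed in Celsius (interval scale, arbitrary zero) and Kelvin (ratio scale, absolute zero). The SD is identical on both scales, but the CV changes because the mean shifts with the choice of zero. The helper name `coefficient_of_variation` is hypothetical and assumes the population SD.

```python
import statistics

def coefficient_of_variation(values):
    """Return the CV as a percentage: (SD / mean) * 100."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# The same three hypothetical temperatures on two scales.
celsius = [10.0, 20.0, 30.0]             # interval scale: zero is arbitrary
kelvin = [c + 273.15 for c in celsius]   # ratio scale: zero is absolute

cv_celsius = coefficient_of_variation(celsius)
cv_kelvin = coefficient_of_variation(kelvin)

print(round(cv_celsius, 1), round(cv_kelvin, 1))
```

Only the Kelvin figure describes relative spread in a meaningful way, since the Celsius value would change again under any other choice of zero point.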