2. Review of Descriptive Statistics

0.0(0)

Studied by 0 people

Call with Kai

Knowt Play

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Card Sorting

1/21

Earn XP

Description and Tags

Lecture 1

Study Analytics

Name	Mastery	Learn	Test	Matching	Spaced

No study sessions yet.

22 Terms

New cards

What is operationalization in psychological research?

Defining a variable/construct in terms of how it will be measured (e.g., stress → Perceived Stress Scale).

Used in psychometrics (measuring constructs)
No operationalization is perfect, always has limits

New cards

What are the four levels of measurement?

Nominal: categories, no order (e.g., gender, color)
Ordinal: ordered, but intervals not equal (e.g., rankings)
Interval: ordered, equal spacing, no true zero (e.g., Celsius)
Ratio: ordered, equal spacing, true zero (e.g., weight, time)

New cards

How does sample size affect the clarity of data distribution?

Large samples show clearer underlying distribution; small samples make it less obvious.

New cards

What are measures of central tendency and variability?

Central tendency: “sameness” (mean, median, mode)
Variability: “differentness” (range, variance, SD)

<ul><li><p><strong><mark data-color="#ff9d00" style="background-color: rgb(255, 157, 0); color: inherit;">Central tendency:</mark></strong> “sameness” (mean, median, mode)</p></li><li><p><strong><mark data-color="#d573fd" style="background-color: rgb(213, 115, 253); color: inherit;">Variability:</mark></strong> “differentness” (range, variance, SD)</p></li></ul><p></p>

New cards

What is the mode?

Most frequently occurring value in a dataset; appears as a peak in a histogram.

New cards

What is the median and how is it calculated?

Middle value of an ordered dataset.

If n odd → middle number
If n even → average of two middle numbers
Formula: (n+1)/2 for location

New cards

What is the mean and how is it calculated?

Arithmetic average = (sum of values) ÷ (number of values)

New cards

How does skew affect mean, median, and mode?

In skewed distributions, mean is pulled toward the skew, median is more stable, and mode reflects the peak.

New cards

What is the range and what is its weakness?

Range = max – min. Very sensitive to extreme values (outliers).

The range is a better indicator of variation for Array A, because in Array B the range is distorted by outliers relative to the actual clustering.

New cards

What does IQR measure?

Range of the middle 50% of the data (Q3 – Q1).

New cards

How do you calculate IQR step by step?

Order data
Find median
Find Q1 (median of lower half) & Q3 (median of upper half)
IQR = Q3 – Q1

<ol><li><p>Order data</p></li><li><p>Find median</p></li><li><p>Find Q1 (median of lower half) & Q3 (median of upper half)</p></li><li><p>IQR = Q3 – Q1</p></li></ol><p></p>

New cards

What are variance and standard deviation?

Variance = average of squared deviations from mean
SD = square root of variance
Use squared differences so they don’t cancel out (since sum of deviations = 0)

<ul><li><p>Variance = average of squared deviations from mean</p></li><li><p>SD = square root of variance</p></li><li><p>Use squared differences so they don’t cancel out (since sum of deviations = 0)</p></li></ul><p></p>

New cards

What is a Z-score and why use it?

Standardized score = (value – mean) / SD.

Allows comparison across different scales/units

New cards

Why is standardization powerful for normal data? (AUC = 1)

Lets us calculate probabilities for scores:

0 SD = 50%
1 SD = 15.9%
2 SD = 2.3%
3 SD = 0.1%

New cards

How are descriptive and inferential statistics connected?

Inferential stats build on descriptive stats to make conclusions about populations.

New cards

Why do we use n–1 (degrees of freedom) for sample variance?

Because sample mean is an estimate, using n–1 corrects bias when estimating population variance.

New cards

What are degrees of freedom (df)?

The number of pieces of information that are free to vary when making an estimate.
Example: With 50 observations, estimating the mean has 50 df (all values are free).
When estimating standard deviation, you first need the mean, so you lose 1 df → 49 df.

New cards

What are the equations for sample mean, variance, and SD?

Mean = Σx / n
Variance = Σ(x–mean)² / (n–1)
SD = √variance

<ul><li><p>Mean = Σx / n</p></li><li><p>Variance = Σ(x–mean)² / (n–1)</p></li><li><p>SD = √variance</p></li></ul><p></p>

New cards

What happens if you use wrong df or small sample size?

Variance will be underestimated. Solution: increase sample size.

New cards

What does the CLT state?

Distribution of sample means approximates normal when sample size is large, regardless of population distribution.

New cards

What are the properties of the sampling distribution (CLT)?

Same mean as population
Smaller SD as sample size increases (SE decreases)
Shape approaches normal with larger n

New cards

How do we assess if a sample represents the population?

Check normality (e.g., Q-Q plots). CLT makes normality assumption less strict with large samples.