2. Review of Descriptive Statistics

Lecture 1

22 Terms

1

What is operationalization in psychological research?

Defining a variable/construct in terms of how it will be measured (e.g., stress → Perceived Stress Scale).

  • Used in psychometrics (measuring constructs)

  • No operationalization is perfect; each one has limits

2

What are the four levels of measurement?

  • Nominal: categories, no order (e.g., gender, color)

  • Ordinal: ordered, but intervals not equal (e.g., rankings)

  • Interval: ordered, equal spacing, no true zero (e.g., Celsius)

  • Ratio: ordered, equal spacing, true zero (e.g., weight, time)

3

How does sample size affect the clarity of data distribution?

Large samples show the underlying distribution more clearly; in small samples its shape is harder to see.

4

What are measures of central tendency and variability?

  • Central tendency: “sameness” (mean, median, mode)

  • Variability: “differentness” (range, variance, SD)

5

What is the mode?

Most frequently occurring value in a dataset; appears as a peak in a histogram.

6

What is the median and how is it calculated?

Middle value of an ordered dataset.

  • If n odd → middle number

  • If n even → average of two middle numbers

  • Location: the median sits at position (n+1)/2 in the ordered data
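
The odd/even rule above can be sketched in Python. This is a minimal illustration that assumes a non-empty list of numbers; the function name `median` and the example values are made up for this card, not taken from the lecture.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is the 0-based index mid.
        return ordered[mid]
    # Even n: (n + 1) / 2 falls between two positions, so average them.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = median([7, 1, 5])       # ordered: 1, 5, 7
even_example = median([7, 1, 5, 3])   # ordered: 1, 3, 5, 7
```

With three values the location is (3+1)/2 = 2, the second ordered value; with four values it is 2.5, so the second and third ordered values are averaged.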

7

What is the mean and how is it calculated?

Arithmetic average = (sum of values) ÷ (number of values)

8

How does skew affect mean, median, and mode?

In skewed distributions, the mean is pulled toward the long tail, the median shifts less, and the mode stays at the peak.
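
A small illustration of this effect, using Python's standard `statistics` module. The `incomes` values are hypothetical and chosen only to produce a right-skewed dataset with one extreme value.

```python
import statistics

# Hypothetical right-skewed data: one extreme value in the upper tail.
incomes = [30, 32, 35, 35, 38, 40, 250]

mean_value = statistics.mean(incomes)      # dragged upward by the 250
median_value = statistics.median(incomes)  # middle ordered value
mode_value = statistics.mode(incomes)      # most frequent value (the peak)
```

Here the mean ends up above every value except the extreme one, while the median and mode both stay at 35.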

9

What is the range and what is its weakness?

Range = max – min. Very sensitive to extreme values (outliers).

The range is a better indicator of variation for Array A, because in Array B the range is distorted by outliers relative to the actual clustering.
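
The original arrays are not reproduced on this card, so the sketch below uses hypothetical stand-ins: `array_a` is evenly spread and `array_b` is tightly clustered with a single outlier. It only illustrates how one extreme value can dominate the range.

```python
def value_range(values):
    """Range = max - min."""
    return max(values) - min(values)


# Hypothetical stand-ins, not the arrays from the lecture.
array_a = [10, 12, 14, 16, 18, 20]   # evenly spread
array_b = [14, 15, 15, 15, 16, 40]   # clustered, with one outlier (40)

range_a = value_range(array_a)
range_b = value_range(array_b)
range_b_without_outlier = value_range(array_b[:-1])  # drop the 40
```

In this made-up case the range of `array_b` falls from 26 to 2 once the outlier is dropped, which is why the range describes `array_a` better than `array_b`.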

10

What does IQR measure?

Range of the middle 50% of the data (Q3 – Q1).

11

How do you calculate IQR step by step?

  1. Order data

  2. Find median

  3. Find Q1 (median of lower half) & Q3 (median of upper half)

  4. IQR = Q3 – Q1
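
The four steps above can be sketched as follows. Two assumptions go beyond the card: when n is odd, the median itself is left out of both halves (textbooks and software differ on this convention), and the input has at least two values. The function name `iqr` is illustrative.

```python
import statistics


def iqr(values):
    """IQR by the median-of-halves method (needs at least two values)."""
    ordered = sorted(values)          # 1. order data
    n = len(ordered)
    half = n // 2                     # 2. the median splits the data here
    lower = ordered[:half]            # lower half
    upper = ordered[half + n % 2:]    # upper half (median skipped if n is odd)
    q1 = statistics.median(lower)     # 3. Q1 and Q3 are the medians of the halves
    q3 = statistics.median(upper)
    return q3 - q1                    # 4. IQR = Q3 - Q1


even_iqr = iqr([1, 3, 5, 7, 9, 11, 13, 15])  # Q1 = 4, Q3 = 12
odd_iqr = iqr([1, 3, 5, 7, 9])               # Q1 = 2, Q3 = 8
```

Statistical packages often interpolate quartiles instead, so their IQR can differ slightly from this hand method.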

12

What are variance and standard deviation?

  • Variance = average of squared deviations from mean

  • SD = square root of variance

  • Use squared differences so they don’t cancel out (since sum of deviations = 0)
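
A short worked example of why the deviations are squared. The `values` list is hypothetical, and the variance here divides by n, matching the "average of squared deviations" wording on this card (the sample formula on a later card divides by n–1 instead).

```python
values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data

mean = sum(values) / len(values)
deviations = [x - mean for x in values]

# Raw deviations cancel out, so their sum carries no information.
sum_of_deviations = sum(deviations)

# Squaring makes every term non-negative before averaging.
squared_deviations = [d ** 2 for d in deviations]
variance = sum(squared_deviations) / len(values)   # divide-by-n form
sd = variance ** 0.5
```

For these values the mean is 5, the deviations sum to 0, the variance is 4, and the SD is 2.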

13

What is a Z-score and why use it?

Standardized score = (value – mean) / SD.

  • Allows comparison across different scales/units
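
A minimal sketch of comparing scores from two different scales. The exam numbers are hypothetical and the helper name `z_score` is illustrative.

```python
def z_score(value, mean, sd):
    """Standardized score: how many SDs the value lies from the mean."""
    return (value - mean) / sd


# Hypothetical exams marked on different scales.
z_exam_a = z_score(80, mean=70, sd=5)    # 10 points above the mean, SD of 5
z_exam_b = z_score(65, mean=50, sd=10)   # 15 points above the mean, SD of 10
```

The raw gap is larger on exam B (15 points versus 10), but the exam A score is further above its mean in SD units (z = 2.0 versus z = 1.5).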

14

Why is standardization powerful for normal data? (AUC = 1)

Lets us calculate the probability of a score falling above a given z-score (the upper-tail area under the curve):

  • z = 0: 50%

  • z = 1: 15.9%

  • z = 2: 2.3%

  • z = 3: 0.1%
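
These percentages match the upper-tail areas of the standard normal distribution. A minimal sketch using `math.erfc` from the standard library; the helper name `upper_tail` is illustrative.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Upper-tail area, as a percentage rounded to one decimal place.
tail_percentages = {z: round(100 * upper_tail(z), 1) for z in (0, 1, 2, 3)}
```

Because the total area under the curve is 1, each value is the proportion of scores expected to fall above that many SDs from the mean.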

15

How are descriptive and inferential statistics connected?

Inferential stats build on descriptive stats to make conclusions about populations.

16

Why do we use n–1 (degrees of freedom) for sample variance?

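Deviations are measured from the sample mean, which is itself estimated from the data, so dividing by n underestimates the spread. Dividing by n–1 corrects this bias when estimating population variance.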
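
One way to see the bias is to enumerate every possible sample from a tiny population. The sketch below assumes a hypothetical population of six values and samples of size 2 drawn with replacement; all names are illustrative.

```python
from itertools import product

population = [1, 2, 3, 4, 5, 6]   # hypothetical population
n = 2                             # sample size

mu = sum(population) / len(population)
true_variance = sum((x - mu) ** 2 for x in population) / len(population)


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)


# All 36 equally likely samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

avg_dividing_by_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
```

Averaged over all samples, the n–1 version recovers the population variance, while the divide-by-n version comes out too small by a factor of (n–1)/n.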

17

What are degrees of freedom (df)?

  • The number of pieces of information that are free to vary when making an estimate.

  • Example: With 50 observations, estimating the mean has 50 df (all values are free).

  • When estimating standard deviation, you first need the mean, so you lose 1 df → 49 df.

18

What are the equations for sample mean, variance, and SD?

  • Mean = Σx / n

  • Variance = Σ(x–mean)² / (n–1)

  • SD = √variance
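
The three equations can be checked by hand against Python's standard `statistics` module, whose `variance` and `stdev` functions use the n–1 (sample) form. The `data` values are hypothetical.

```python
import math
import statistics

data = [4, 8, 6, 5, 7]   # hypothetical sample
n = len(data)

mean = sum(data) / n                                     # Mean = Σx / n
variance = sum((x - mean) ** 2 for x in data) / (n - 1)  # Σ(x - mean)² / (n - 1)
sd = math.sqrt(variance)                                 # SD = √variance

# Library equivalents, for comparison.
library_variance = statistics.variance(data)
library_sd = statistics.stdev(data)
```

For this sample the mean is 6, the squared deviations sum to 10, and the variance is 10 / 4 = 2.5.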

19

What happens if you use wrong df or small sample size?

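Dividing by n instead of n–1 underestimates the population variance, and the bias is largest in small samples. Solution: use n–1 and, where possible, increase sample size.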

20

What does the CLT state?

The distribution of sample means approaches a normal distribution as sample size grows, regardless of the shape of the population distribution (provided its variance is finite).

21

What are the properties of the sampling distribution (CLT)?

  • Same mean as population

  • SD of the sampling distribution (the standard error, SE = σ/√n) shrinks as sample size increases

  • Shape approaches normal with larger n
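
These properties can be checked with a small simulation. The sketch below assumes a skewed population (exponential with mean 1 and SD 1), a fixed random seed, and 5000 replications per sample size; all of these choices are illustrative.

```python
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable


def sample_means(n, replications=5000):
    """Means of repeated samples of size n from an exponential population."""
    return [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(replications)
    ]


means_n5 = sample_means(5)
means_n50 = sample_means(50)

center_n50 = statistics.mean(means_n50)   # should sit near the population mean, 1
se_n5 = statistics.stdev(means_n5)        # should sit near 1 / sqrt(5)
se_n50 = statistics.stdev(means_n50)      # should sit near 1 / sqrt(50)
```

The sample means center on the population mean, and their spread shrinks roughly in line with σ/√n as n goes from 5 to 50, even though the population itself is far from normal.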

22

How do we assess if a sample represents the population?

Check normality (e.g., with Q-Q plots). The CLT makes the normality assumption less strict with large samples.