Psychological Statistics Quiz #1

0.0(0)
studied byStudied by 0 people
call kaiCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/23

encourage image

There's no tags or description

Looks like no tags are added yet.

Last updated 8:50 PM on 9/15/25
Name
Mastery
Learn
Test
Matching
Spaced
Call with Kai

No analytics yet

Send a link to your students to track their progress

24 Terms

1
New cards

What is a variable?

A variable is any type of thing that you can categorize, measure, or count.

• a thing whose value varies between observations in your dataset.

2
New cards

Four Types of Variables:

• Nominal; Categorical... A nominal (or ā€œcategoricalā€) variable is qualitative. It varies by name, instead of by quantity (i.e., numerical value)...
• Examples?
• eye color... academic year... state of residence... covid status...
• Prior to statistical analysis, it is often critical to define the number of ā€œlevelsā€ of each nominal variable.


• For example, ā€œSexā€ would (most often) have two levels: male; female....
ā€œhair colorā€ might have four levels: blond, red, brown, black


Ordinal scales
• The order of values matters, but differences between values on the scale might not represent equivalent differences in the construct. But, you don’t know how close they were.
• E.g., 1st place, 2nd place, 3rd place
• E.g., 1=very unsatisfied; 2=somewhat unsatisfied; 3=neutral; 4=somewhat
satisfied; 5 = very satisfied

So like, maybe 1st and 2nd finished closer than 3rd and 4th, but that doesn’t matter in ordinal scales.

Interval variables
• Example:
• degrees Fahrenheit and degrees Celsius are both interval scales, because:
• 0o ≠ zero heat energy
• IF there is a True Zero, it’s a Ratio Variable (see next slide)...
• BUT, equal intervals (2 to 4, compared to 6 to 8) relate to equal differences in heat. by contrast, Kelvin is a Ratio Variable...

Ratio... Ratio variables are physical, real-world things that can be measured.
• Examples: Height... Weight... number of fingers... time to run 100 meters... Hours of sleep
• zero = zero of the thing... (ā€œTrue Zeroā€)
• zero weight = you don’t exist
• zero fingers = no fingers.
• Ratios are meaningful.
• 40o Farenheit ≠ 2*20o Farenheit... BUT, 40 Kelvin = 2*20 Kelvin

Zero means that there is nothing, 0° Farenheit and Celcius have negative values, so they are not ratio. Same thing with credit score: you can’t get to 0.

<p><span style="color: rgb(255, 254, 254);"><strong>• Nominal; Categorical.</strong>.. A nominal (or ā€œcategoricalā€) variable is qualitative. It varies by name, instead of by quantity (i.e., numerical value)...<br>• Examples?<br>• eye color... academic year... state of residence... covid status...<br>• Prior to statistical analysis, it is often critical to define the number of ā€œlevelsā€ of each nominal variable.</span></p><p><span style="color: rgb(255, 254, 254);"><br>• For example, ā€œSexā€ would (most often) have two levels: male; female....<br>ā€œhair colorā€ might have four levels: blond, red, brown, black</span></p><p><span style="color: rgb(255, 254, 254);"><br><strong>Ordinal scales</strong><br>• The order of values matters, but differences between values on the scale might not represent equivalent differences in the construct. But, you don’t know how close they were. <br>• E.g., 1st place, 2nd place, 3rd place<br>• E.g., 1=very unsatisfied; 2=somewhat unsatisfied; 3=neutral; 4=somewhat </span><span style="color: rgb(255, 252, 252);">satisfied; 5 = very satisfied</span></p><p><span style="color: rgb(255, 252, 252);">So like, maybe 1st and 2nd finished closer than 3rd and 4th, but that doesn’t matter in ordinal scales.</span></p><p></p><p><span style="color: rgb(255, 252, 252);"><strong>Interval variables</strong><br>• Example:<br>• degrees Fahrenheit and degrees Celsius are both interval scales, because:<br>• 0o ≠ zero heat energy<br>• IF there is a True Zero, it’s a Ratio Variable (see next slide)...<br>• BUT, equal intervals (2 to 4, compared to 6 to 8) relate to equal differences in heat. by contrast, Kelvin is a Ratio Variable...</span></p><p></p><p></p><p><span style="color: rgb(255, 255, 255);"><strong>Ratio... Ratio </strong>variables are physical, real-world things that can be measured.<br>• Examples: Height... Weight... number of fingers... time to run 100 meters... Hours of sleep<br>• zero = zero of the thing... (ā€œTrue Zeroā€)<br>• zero weight = you don’t exist<br>• zero fingers = no fingers.<br>• Ratios are meaningful.<br>• 40o Farenheit ≠ 2*20o Farenheit... BUT, 40 Kelvin = 2*20 Kelvin</span></p><p>Zero means that there is nothing, 0° Farenheit and Celcius have negative values, so they are not ratio. Same thing with credit score: you can’t get to 0. </p><p></p><p></p>
3
New cards

Two Common Uses of Statistics

1. Precisely describe/quantify the details of a specific a chunk of reality.

• Descriptive Statistics do this.'

• e.g., describe the mean and standard deviation of how many nightly hours of sleep PSYC 2320 students get

2. Use measurements of one small chunk of reality to draw probabilistic

conclusions about a MUCH LARGER chunk of reality.

• Inferential Statistics do this.

• e.g., based on the mean & SD of hours of sleep among PSYC 2320 students (sample), what is the 95% confidence interval of mean nightly hours of sleep among all Northeastern Students (population).

4
New cards

A ā€œparameterā€ vs a ā€œstatisticā€

A ā€œpopulationā€ vs a ā€œsampleā€

What’s the difference?

A parameter pertains to an entire population (e.g., 23% of US citizens)

• examples of populations:

• the entire US...

• the entire population of Northeastern students...

A statistic pertains to a sample of a population (e.g., 23% of survey respondents)

• examples of samples:

• our 2320 class is a relatively flawed sample of the entire population of Northeastern students

• our 2320 class could also be a very flawed sample of the entire US...

• NOT a random sample; NOT a sample that accurately represents the population

5
New cards

Descriptive vs. Inferential statistical analyses...

Descriptive statistics just describe your sample. Are exactly what they sound like:

Descriptive Statistics describe the dataset from which they came.

Examples of Descriptive Statistics:

• What is the average (mean) age of PSYC 2320 students?

• How many students have tested positive for covid this semester?

• What percentage of Princeton students drive a luxury car that was

purchased by their parents?

Inferential Statistics use your sample data to make probabilistic inferences about the population

Are much more ambitious...

They use your sample statistics (i.e., your descriptive statistics) to make probabilistic inferences about the population.

ALL research findings in your psychology textbooks (or any other science textbook)= summaries of inferential statistics!!

Example:
• Based on the proportion of participants who reported that Bill
Cosby was the most unlikable celebrity in America (M = 58%),
compared to the proportion who reported that Gwyneth Paltrow
was the most unlikable (M = 7%), we conclude that Bill Cosby is
significantly more unlikable than Gwyneth among American
college students, šœ’2 (1, N = 175) = 6.7, p < .01

6
New cards

What is Sampling Error

Imagine I collected data from you, and I found that
people who sit on the left side of the room are an
average of 1 inch taller than people who sit on the right
side of the room.


Which is more likely:
1. There is something about the left side of the room
that attracts taller people?
2. The 1-inch difference is due to random chaos

Sometimes the world is just weird, but replication is always helpful to figure out if it is a sampling error.

7
New cards

Correlational Design:

As X rises, does Y tend to rise or fall?

Tell me:

• X = studying.... Y = GPA

• X = sleep.... Y = anxiety

• X = height.... Y = how many vegetables you eat per year

Positive Correlation: higher values of X are associated with higher values of Y
• Also, lower values of X are associated with lower values of Y

Negative Correlation: higher values of X are associated with lower values of Y
• Lower values of X are associated with higher values of Y.

Zero Correlation: Values of X aren’t associated with values of Y.

A Correlation Coefficient (r) describes the correlation between
two variables as a number within the range: -1 → 0 → +1
• Positive Correlation: As r approaches +1, it means that X and Y
tend to rise and fall together
• Negative Correlation: As r approaches -1, it means that X and Y
tend to go in opposite directions
• Zero Correlation: As r approaches 0, no correlation between X and Y

<p><span style="color: rgb(254, 252, 252);">As X rises, does Y tend to rise or fall?</span></p><p><span style="color: rgb(254, 252, 252);"> Tell me:</span></p><p><span style="color: rgb(254, 252, 252);">• X = studying.... Y = GPA</span></p><p><span style="color: rgb(254, 252, 252);">• X = sleep.... Y = anxiety</span></p><p><span style="color: rgb(254, 252, 252);">• X = height.... Y = how many vegetables you eat per year</span></p><p></p><p><span style="color: rgb(254, 252, 252);"><strong>Positive Correlation:</strong> higher values of X are associated with higher values of Y</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">• Also, lower values of X are associated with lower values of Y</span></p><p></p><p><span style="color: rgb(254, 252, 252);"><strong>Negative Correlation: </strong>higher values of X are associated with lower values of Y</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">• Lower values of X are associated with higher values of Y.</span></p><p><span style="color: rgb(254, 252, 252);"><strong>Zero Correlation</strong>: Values of X aren’t associated with values of Y.</span></p><p></p><p><span style="color: rgb(254, 252, 252);"><strong>A Correlation Coefficient (r)</strong> describes the correlation between</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">two variables as a number within the range: -1 → 0 → +1</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">• <strong>Positive Correlation:</strong> As r approaches +1, it means that X and Y</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">tend to rise and fall together</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">• <strong>Negative Correlation: </strong>As r approaches -1, it means that X and Y</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">tend to go in opposite directions</span><span style="color: rgb(254, 252, 252);"><br></span><span style="color: rgb(254, 252, 252);">• <strong>Zero Correlation</strong>: As r approaches 0, no correlation between X and Y</span></p>
8
New cards

Quasi-Experimental Design (aka ā€œnon-equivalent groupsā€ design)

Comparison of naturally occurring groups. for example:
• Which group has higher emotional IQ: Men or Women?
• Which group drinks more beer: Freshmen or Seniors?
• Which group is faster at sprinting: Chinese or Japanese?
• Note: these are not worthwhile questions; they’re just valid examples of this type of design

essentially, when you can’t assign people to different groups.


Like correlational design, quasi-experimental design cannot determine
whether changes in one variable causes changes in the other variable...
Why?

9
New cards

Third Variable Problem

Third Variable Problem: When a secret third variable influences both of
two variables that are significantly correlated
• Neither of the two correlated variables causes the other
• Changes in both variables are caused by the secret third variable

Think of heartburn study where they failed to consider that diet or sleep could be causing the drugs to be linked with mortality


Ice cream sales and shark attacks are positively correlated, the third variable in that situation is the heat.

<p><span style="color: rgb(245, 244, 244);"><strong>Third Variable Problem: </strong>When a secret third variable influences both of</span><span style="color: rgb(245, 244, 244);"><br></span><span style="color: rgb(245, 244, 244);">two variables that are significantly correlated</span><span style="color: rgb(245, 244, 244);"><br></span><span style="color: rgb(245, 244, 244);">• Neither of the two correlated variables causes the other</span><span style="color: rgb(245, 244, 244);"><br></span><span style="color: rgb(245, 244, 244);">• Changes in both variables are caused by the  secret third variable</span></p><p></p><p><span style="color: rgb(245, 244, 244);">Think of heartburn study where they failed to consider that diet or sleep could be causing the drugs to be linked with mortality </span></p><p><br><span style="color: rgb(245, 244, 244);">Ice cream sales and shark attacks are positively correlated, the third variable in that situation is the heat. </span></p>
10
New cards

THE PURPOSE OF SCIENCE: + IV and DV

1. Identify cause-and-effect relationships between phenomena.

2. Use [1] to make predictions about and/or influence the future.

Within a scientific study...

... the ā€œcauseā€ = the independent variable

... the ā€œeffectā€ = the dependent variable

11
New cards

Experimental Research (aka True Experimental Design)

In a true experimental design, participants are RANDOMLY ASSIGNED
to different conditions of the experiment
• Conditions should be identical, except for the ndependent Variable (IV)
• Each condition of an experiment represents a different ā€œlevelā€ of the IV.

Why is RANDOM ASSIGNMENT so critical in True Experimental Design?
If participants are randomly assigned to different levels of the independent
variable (IV) -- i.e., to different conditions...


... and if conditions are identical, aside from differences in the IV....
... then any differences between conditions in the dependent variable (DV)
must be due to differences in the independent variable.


This is why True Experimental Design is THE ONE type of study that can determine causality.

12
New cards

Proportion and Percentage

proportion = f/N = frequency of [some value] / total observations

What proportion of this dataset is X = 2?

f = There are 3 observations of X = 2

N = [1+2+3+3+1] = 10 total observations

proportion = f/N =

percentage = p(100)

p = f/N = 3 / 10 = 0.3

30% of our observations of X in this dataset were X = 2

For what percentage of the dataset was X = 2?

= 0.3(100) = 30%

<p>proportion = f/N = frequency of [some value] / total observations</p><p>What proportion of this dataset is X = 2?</p><p>f = There are 3 observations of X = 2</p><p>N = [1+2+3+3+1] = 10 total observations</p><p>proportion = f/N =</p><p></p><p>percentage = p(100)</p><p>p = f/N = 3 / 10 = 0.3</p><p>30% of our observations of X in this dataset were X = 2</p><p></p><p>For what percentage of the dataset was X = 2?</p><p>= 0.3(100) = 30%</p><p></p>
13
New cards

Percentile Rank

The percentile rank of a particular score is defined as the percentage of individuals in the distribution with scores at or below this score.

The cumulative frequency (cf) of a target value is the total number of values in the dataset that are equal or less than the target value

Percentile (c%) = cumulative frequency (cf) divided by the total number of values in the dataset (N)

c% = cf/N x 100

<p>The percentile rank of a particular score is defined as the percentage of individuals in the distribution with <strong>scores at or below this score.</strong></p><p></p><p>The <strong>cumulative frequency (cf) </strong>of a target value is the total number of values in the dataset that are equal or less than the target value</p><p>Percentile (c%) = cumulative frequency (cf) divided by the total number of values in the dataset (N)</p><p>c% = <em>cf/</em>N x 100</p><p></p>
14
New cards

The Dunning-Kruger Effect

The red line with circles shows the percentile-rank on various intellectual
tasks for four groups of participants

The teal line with triangles shows participants’ estimates of their own percentile ranks
• i.e., their estimate of how well they did compared to others

The Dunning-Kruger Effect is basically that the people who did the worst overestimate how they did and the people who did the best underestimated themselves

<p><span style="color: rgb(255, 254, 254);">The red line with circles shows the percentile-rank on various intellectual<br>tasks for four groups of participants<br></span></p><p><span style="color: rgb(255, 254, 254);">The teal line with triangles shows participants’ estimates of their own percentile ranks<br>• i.e., their estimate of how well they did compared to others</span></p><p><span style="color: rgb(255, 254, 254);">The Dunning-Kruger Effect is basically that the people who did the worst overestimate how they did and the people who did the best underestimated themselves</span></p>
15
New cards

Frequency Distribution

A Frequency Distribution shows how frequently each value occurs within a dataset

<p>A Frequency Distribution shows how frequently each value occurs within a dataset</p>
16
New cards

Histogram

A Histogram is a bar graph of a frequency distribution. For example

<p>A Histogram is a bar graph of a frequency distribution. For example</p>
17
New cards

Assessing Central Tendency

Central Tendency is a statistical measure to determine a single score that

defines the center of a distribution.

• Mean = the average = sum of the scores / number of scores

• population mean = šžµ = Ī£X / N

• sample mean = M = Ī£X / n

• Know and recognize all possible symbols for the mean: šžµ, M, xĢ…

Greek letter → population parameter

regular lettering → sample statistic

• Median = the ā€œmiddleā€ value in the dataset...

Mode: the most common value

When to use mean and median?

When the data obey a NORMAL DISTRIBUTION. aka a bell curve USE THE MEAN

A LOT of variables are normally distributed (roughly). Thus, the mean is the most common measure of central tendency

In a perfectly normal distribution, mean = median = mode

What about the median?

When is the median a better measure of central tendency?
• When there are major outliers (especially within small datasets)...
• ... the mean is sensitive.
• ... the median doesn’t care.


• When the data are skewed
• Positively skewed data: the mean will be ā€œtoo highā€
• Negatively skewed data: the mean will be ā€œtoo lowā€

<p><span style="color: rgb(255, 249, 249);"><strong>Central Tendency</strong> is a statistical measure to determine a single score that</span></p><p><span style="color: rgb(255, 249, 249);">defines the center of a distribution.</span></p><p><span style="color: rgb(255, 249, 249);">• Mean = the average = sum of the scores / number of scores</span></p><p><span style="color: rgb(255, 249, 249);">• population mean = šžµ = Ī£X / N</span></p><p><span style="color: rgb(255, 249, 249);">• sample mean = M = Ī£X / n</span></p><p><span style="color: rgb(255, 249, 249);">• Know and recognize all possible symbols for the mean: šžµ, M, xĢ…</span></p><p><span style="color: rgb(255, 249, 249);">Greek letter → population parameter</span></p><p><span style="color: rgb(255, 249, 249);">regular lettering → sample statistic</span></p><p></p><p></p><p></p><p><span style="color: rgb(255, 249, 249);">• Median = the ā€œmiddleā€ value in the dataset...</span></p><p></p><p><span style="color: rgb(255, 249, 249);">Mode: the most common value</span></p><p></p><p><span style="color: rgb(255, 249, 249);">When to use mean and median?</span></p><p><span style="color: rgb(255, 249, 249);">When the data obey a NORMAL DISTRIBUTION. aka a bell curve USE THE MEAN</span></p><p><span style="color: rgb(255, 249, 249);">A LOT of variables are normally distributed (roughly). Thus, the mean is the most common measure of central tendency</span></p><p><span style="color: rgb(255, 249, 249);">In a perfectly normal distribution, mean = median = mode</span></p><p></p><p><span style="color: rgb(255, 249, 249);">What about the median?</span></p><p><span style="color: rgb(255, 249, 249);">When is the median a better measure of central tendency?</span><span style="color: rgb(255, 249, 249);"><br></span><span style="color: rgb(255, 249, 249);">• <strong>When there are major outliers </strong>(especially within small datasets)...</span><span style="color: rgb(255, 249, 249);"><br></span><span style="color: rgb(255, 249, 249);">• ... the mean is sensitive.</span><span style="color: rgb(255, 249, 249);"><br></span><span style="color: rgb(255, 249, 249);">• ... the median doesn’t care.</span></p><p><span style="color: rgb(255, 249, 249);"><br></span><span style="color: rgb(255, 249, 249);"><strong>• When the data are skewed</strong></span><span style="color: rgb(255, 249, 249);"><br></span><span style="color: rgb(255, 249, 249);">• Positively skewed data: the mean will be ā€œtoo highā€</span><span style="color: rgb(255, 249, 249);"><br></span><span style="color: rgb(255, 249, 249);">• Negatively skewed data: the mean will be ā€œtoo lowā€</span></p>
18
New cards

Showcase of distributions with different numbers of peaks (modes)

knowt flashcard image
19
New cards

Skewing

Positive skew: he mean is skewed positively. It is a higher (more positive) value

than the center of the distribution

Negative skew is the opposite of that

<p>Positive skew: he mean is skewed positively. It is a higher (more positive) value</p><p> than the center of the distribution</p><p>Negative skew is the opposite of that</p>
20
New cards

What to do when summarizing data?

In a nutshell, when summarizing data, do this:

1. Calculate both the mean and the median

2. If they’re similar, then just talk about the mean when summarizing your data

3. If they’re dissimilar, talk about both the mean and the median, and talk about why they’re dissimilar (outliers? Skew?)

21
New cards

Variability

The divergence of data from its mean value

The difference: variability:

low variability -- scores are consistent.

high variability -- scores are all over the place.

22
New cards

Measures of variability: Variance & Standard Deviation

Variance = SS/(N-1)

Standard Deviation: Sqrt(Variance)

Where SS is squared Deviations

<p>Variance = SS/(N-1)</p><p></p><p>Standard Deviation: Sqrt(Variance)</p><p>Where SS is squared Deviations</p>
23
New cards

Z- score

A numerical measurement that describe’s a value’s relationship to the mean of a group of values and is measured in terms of standard deviations from the mean

To calculate the z-score of a raw value, when you’re given the raw value, the mean, and the standard deviation

To find a raw value, when you’re given a z-score, the mean, and the standard deviation:

These formulas only work for normally distributed data!

standard deviation = a universal unit of measurement.

z = 1 → the observation is 1 standard deviation above the mean.

z = -3 → the observation is 3 standard deviations below the mean.

This is useful because... ā€œThe standard amountā€ by which a given value in the distribution tends to differ from the mean

<p>A numerical measurement that describe’s a value’s relationship to the mean of a group of values and is measured in terms of standard deviations from the mean</p><p></p><p>To calculate the z-score of a raw value, when you’re given the raw value, the mean, and the standard deviation</p><p>To find a raw value, when you’re given a z-score, the mean, and the standard deviation:</p><p><strong>These formulas only work for normally distributed data!</strong></p><p></p><p>standard deviation = a universal unit of measurement.</p><p>z = 1 → the observation is 1 standard deviation above the mean.</p><p>z = -3 → the observation is 3 standard deviations below the mean.</p><p>This is useful because... ā€œThe standard amountā€ by which a given value in the distribution tends to differ from the mean</p><p></p><p></p>
24
New cards

The 68-95-99.7 rule and the z score percentages I have to memorize

knowt flashcard image