what is the belief bias effect
a bias in logical reasoning whereby when deciding whether a particular argument is logically valid we tend to be influenced by the believability of the conclusion, even when we shouldn’t
what is Simpson's paradox
a statistical phenomenon where a trend or relationship observed in several distinct groups of data reverses or disappears when the groups are combined. This occurs when a confounding variable influences the data
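A small sketch of the reversal described above, using hypothetical recovery counts made up for illustration (the treatment names, group names and numbers are all assumptions, not data from any study). Treatment A does better inside each severity group but worse once the groups are pooled, because A was mostly given to severe cases.

```python
# Hypothetical (recovered, total) counts for two treatments, split by severity.
counts = {
    "mild":   {"A": (90, 100),   "B": (800, 1000)},
    "severe": {"A": (300, 1000), "B": (20, 100)},
}

def rate(recovered, total):
    """Proportion of patients who recovered."""
    return recovered / total

# Recovery rate for each treatment inside each severity group.
within = {
    group: {t: rate(*counts[group][t]) for t in ("A", "B")}
    for group in counts
}

def pooled_rate(treatment):
    """Recovery rate after combining the severity groups."""
    recovered = sum(counts[g][treatment][0] for g in counts)
    total = sum(counts[g][treatment][1] for g in counts)
    return recovered / total

overall = {t: pooled_rate(t) for t in ("A", "B")}

for group, rates in within.items():
    print(group, rates)
print("overall", overall)
```

Severity is the confounding variable here: it affects both which treatment people received and how likely they were to recover.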
what are the three reasons every psychologist should know statistics
statistics is deeply intertwined with research design
if you really want to understand the psychology, you need to be able to understand what other people did with their data → and this involves understanding statistics.
statistical analysis is expensive
what is the difference between research design and statistics
statistics has a kind of universality.
research design is idiosyncratic and is specific to the area of research that you want to engage in
what is a theoretical construct
the thing that you are trying to take a measurement of → it cannot be directly observed
what is a measure
the measure refers to the method or the tool that you use to make your observations
what is a variable
a variable is what we end up with when we apply our measure to something in the world
variables are the actual data we end up with → the outcome of a psychological measurement
what is a continuous variable
a continuous variable is one in which, for any two values, it is always logically possible to have another value in between (for example reaction time)
what is a discrete variable
a variable that isn’t continuous. For a discrete variable it is sometimes the case that there is nothing in the middle. It is a numeric variable that can only take on specific, distinct values within a given range (for example, number of children in a family)
what is test-retest reliability
consistency over time
if we repeat the measurement at a later date do we get the same answer
inter-rater reliability
consistency across people
if someone else repeats the measurement will they produce the same answer
parallel forms reliability
consistency across theoretically equivalent measurements
if I use a different set of bathroom scales to measure my weight does it give the same answer
internal consistency reliability
if a measurement is constructed from lots of different parts that perform similar functions (ie. a personality questionnaire result is added up across several questions), do the individual parts tend to give similar answers?
internal validity
the extent to which you are able to draw the correct conclusions about the causal relationship between variables. Refers to the relationships between things inside the study
external validity
relates to the generalisability of your findings
to what extent do you expect to see the same pattern of results in real life as you saw in your study
construct validity
a question of whether you are measuring what you want to be measuring
A measurement has good construct validity if it is actually measuring the correct theoretical construct, and bad construct validity if it is not
face validity
refers to whether or not a measure looks like it is doing what it is supposed to, nothing more.
a lot of people will use face validity as a proxy for real validity (even though it isn’t)
ecological validity
in order to be ecologically valid, the entire set up of the study should closely approximate the real world scenario that is being investigated (similar to external validity)
what is an artefact
a result is said to be ‘artefactual’ if it only holds in the special situation that you happened to test in your study → poses a threat to external validity
what is the repeated testing effect
a type of history effect whereby the ‘event’ that influences the second measurement is the first measurement itself
differential attrition
a threat to the internal validity of a study, occurring when participant dropout rates differ significantly between the treatment and control groups
this is a kind of selection bias that is caused by the study itself.
regression to the mean
occurs in any situation where you select data based on an extreme value on some measure → if a sample point is extreme (unusually high or low), the next measurement of that same variable will likely be closer to the average
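A minimal simulation of regression to the mean, assuming a simple model that is not in the notes: each observed score is a stable true score plus independent measurement noise. The means, SDs and the 10% cutoff are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)

# Assumed model: observed score = stable true score + independent noise.
n_people = 10_000
true_scores = [random.gauss(100, 10) for _ in range(n_people)]
test1 = [t + random.gauss(0, 10) for t in true_scores]
test2 = [t + random.gauss(0, 10) for t in true_scores]

# Select the people who scored in the top 10% on the first test only.
cutoff = sorted(test1)[int(0.9 * n_people)]
selected = [i for i in range(n_people) if test1[i] >= cutoff]

mean_first = statistics.mean(test1[i] for i in selected)
mean_second = statistics.mean(test2[i] for i in selected)

print("selected group, first test: ", round(mean_first, 1))
print("selected group, second test:", round(mean_second, 1))
```

Nothing happened to the selected group between tests, yet their second average sits closer to 100, because part of what made their first scores extreme was noise.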
what is a population
populations are the groups of people about whom we want to draw conclusions → a broader human reality (ie. all adults with depression) → the population is some defined group of people about whom you’re trying to say something true
populations are almost never directly observable (ie. you cannot study every adult) → the population is a theoretical entity, one you cannot fully access
how do we solve the unobservable populations issue
we draw a sample → from what we observe in that sample, we try to draw conclusions about the broader population
how would we ideally draw a sample
we would ideally use random sampling → whereby every person has an equal chance of being chosen (this reduces bias and improves external validity) → the people you randomly chose are not systematically different from the people you didn’t
what is convenience sampling
in psychology, true random sampling is rare. We often work with convenience sampling → uses whoever was available or willing, such as undergraduate psychology students as participants because they’re easy to recruit and cost-effective (this can introduce bias as the sample may not be representative)
what is the sample mean shown by
x̄
what is the population mean shown by
mu → μ
the population mean is generally unknown
you wouldn’t expect the sample mean to equal the population mean exactly. It is just an estimate of the population mean
what is the sample mean
the sample mean is the best estimate of mu
the entire question around estimation is → given that you know the sample mean, what can you reasonably conclude about the population mean
the difference between sample and population is the foundation of all inferential statistics
what is sampling variability
sampling variability is the idea that different samples from the same population give different results just by chance.
you cannot eliminate sampling variability → it is a property of samples themselves
your sample is just one location in a landscape of all possible samples. The mean of your sample is a precise number, but as an estimate of the population mean it is surrounded by uncertainty → this is sampling variability.
what are the two properties of an estimate
accuracy → how close is the estimate to the true value?
precision → how consistent is the estimate?
accuracy and precision are independent of each other (ie. you can be accurate and not precise and vice versa)
precision is about stability → a more precise estimate is one where repeated samples would give you similar means.
what does sampling variability question in an estimate
sampling variability questions the precision of an estimate. The precision of your estimate is a function of your sample size and the variability in your data.
how do you have to think to find how precise your estimate is
You have to think about sampling variability → what would happen if you repeated the study
what is the confidence interval
the range or uncertainty interval around your estimate
(ie. i believe that due to sampling variability and the sample of 200, the true value would lie somewhere between 68-76)
a confidence interval is a way of communicating precision.
ie. Here is my best estimate of the population parameter and here is the range in which I’m reasonably confident it lies.
what does the width of the confidence interval tell you about the precision of your estimate
narrow = precise (sampling variability is small)
wide = imprecise (sampling variability is large)
the confidence interval is not about how spread out your individual data points are. it is about how much your sampling estimate could plausibly vary if you repeated the study
what does the width of the confidence interval depend on
the variability of the data → how much do individuals differ from each other?
the size of the sample (n)
desired level of confidence → how confident do you want to be
what is the best way to reduce uncertainty around sampling variability
test more people.
A larger sample reduces sampling variability (it doesn’t eliminate it)
what is the sampling distribution (simply put)
describes the universe of possible sample means you could have gotten if you repeated the experiment, for example, 1000 times.
the sampling distribution answers the question: if I ran my study again, what could happen?
the sampling distribution exists logically in a mathematical sense.
It helps us establish confidence intervals, and make inferences
what is the sampling distribution (in a statistical sense)
the sampling distribution is the distribution of a sample statistic (ie. the mean, SD, r) across all possible samples of a given sample size (n) from a given population
the sampling distribution is theoretical, as you can never actually run your study 1,000 times or across all possible samples of a population.
every sample you take is one point from this distribution (ie. your sample of 200 people with a mean of 72 is one draw from the sampling distribution of means for samples of size 200 from this population)
the sampling distribution depends on what the true population is (ie. if the true population mean is 70, the sampling distribution will centre around 70)
what are the properties of the sampling distribution determined by
the true population distribution
size of the sample (n)
the observed statistic (ie. mean, SD, correlation)
for sufficiently large samples, the sampling distribution of the mean is approximately normal (has a bell-shaped curve), and this is true even if the underlying population is not normally distributed. This is called the central limit theorem
what is the central limit theorem
the central limit theorem is a profound result.
the central limit theorem says that even though the underlying population is skewed, the distribution of those 1,000 means will be approximately normal
this means we can use the properties of the normal distribution to reason about sample means, even if the underlying data are not normally distributed.
what is the explanation behind the central limit theorem
when you average across multiple individuals, the quirks of any one individual get smoothed out
If you sample someone from a skewed income distribution, you might get someone very high earning and that one data point would be extreme. But if you sample 100 people and average them, you’re much less likely to get an extreme average because you need many extreme values all in the same sample for the average to be extreme
the more people you include in the average, the more this smoothing out of values happens. With 100 people per sample, the distribution of means is already quite normal. With 1,000 people per sample, it is extremely close to normal.
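The smoothing-out idea can be checked with a rough simulation. This sketch assumes an exponential distribution as the skewed population (a stand-in, not real income data) and uses a simple moment-based skewness, where 0 means symmetric.

```python
import random

random.seed(3)

def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 3 for v in values) / (n * var ** 1.5)

sample_size = 100   # people per sample
n_samples = 2_000   # hypothetical repetitions of the study

raw_scores = []     # every individual score drawn
sample_means = []   # one mean per repetition
for _ in range(n_samples):
    sample = [random.expovariate(1.0) for _ in range(sample_size)]
    raw_scores.extend(sample)
    sample_means.append(sum(sample) / sample_size)

raw_skew = skewness(raw_scores)
means_skew = skewness(sample_means)

print("skewness of individual scores:", round(raw_skew, 2))
print("skewness of sample means:     ", round(means_skew, 2))
```

The individual scores stay strongly skewed, while the means of 100 people are close to symmetric, which is the pattern the central limit theorem describes.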
why is the sample mean an unbiased estimator of the population mean
because the sampling distribution of the mean is centred on the true population mean
on average, across many hypothetical repetitions, it hits the target.
what is the standard error
the standard error (SE) is the standard deviation of the sampling distribution.
the standard error is a measure of the spread of sample means. It measures how much the sample mean bounces around from one hypothetical repetition of the study to the next.
what is the formula for standard error
SE = SD/√n
whereby SD is the standard deviation of your sample (this reflects how variable the underlying measure is, if there is a lot of variability SD is high), and
n is your sample size (a larger sample means a smaller SE, which means a more precise estimate. Precision gets more expensive as you go: you need to quadruple your sample size to halve the SE and double your precision).
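A quick worked check of the formula, using the SD of 16 and n of 400 that appear in a later card. The function name is just a label chosen for this sketch.

```python
import math

def standard_error(sd, n):
    """SE = SD / sqrt(n)."""
    return sd / math.sqrt(n)

se_400 = standard_error(16, 400)    # 16 / 20
se_1600 = standard_error(16, 1600)  # 16 / 40, after quadrupling n

print("n = 400:  SE =", se_400)
print("n = 1600: SE =", se_1600)
```

Quadrupling n from 400 to 1600 doubles √n from 20 to 40, so the SE is cut in half.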
how do you read SE
a larger SD means a larger SE → more variability in the population means more variability in your samples.
you cannot see the SE from your data. It is a property of what would happen if you repeated your study → it is theoretical → it quantifies the uncertainty in your estimates
often the SE is small compared to the SD. This is because the average of a group of people (ie. 500) is much more stable than individual scores.
what is the difference between SD and SE
SE → asks how precise is my sample mean as an estimate of the true population mean → it is the spread of sample means
SD → asks what is the typical spread of individual scores in my sample → it is the spread of your data
how do you construct a confidence interval
take your sample mean, then add and subtract about 2 standard errors. Given by:
95% CI = x̄ ± 2 × SE
this range is your 95% confidence interval and says that, based on my sample, I am 95% confident that the true population mean lies somewhere within this range
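A minimal sketch of the "mean plus or minus about 2 SE" rule, applied to a tiny made-up set of scores (the data and the function name are assumptions for illustration). With a sample this small the multiplier of 2 is only a rough rule of thumb.

```python
import math
import statistics

def approx_95_ci(scores):
    """Approximate 95% CI for the mean: sample mean +/- 2 standard errors."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)          # sample SD (n - 1 denominator)
    se = sd / math.sqrt(len(scores))       # SE = SD / sqrt(n)
    return mean - 2 * se, mean + 2 * se

scores = [4, 8, 6, 5, 3, 7, 9, 6]  # hypothetical data: mean 6, SD 2
lower, upper = approx_95_ci(scores)

print("approximate 95% CI:", round(lower, 2), "to", round(upper, 2))
```

The interval gets narrower if the scores vary less (smaller SD) or if there are more of them (larger n).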
why do we add or subtract 2 standard errors to find the confidence interval
it is two SE because in a normal distribution, about 95% of the values fall within two SD of the mean.
So if the sampling distribution is approximately normal with an SD equal to the SE, then about 95% of the sample means fall within two SE of the true population mean
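The "about 95% within two SE" claim can be checked by simulating many hypothetical repetitions of a study. This sketch assumes a normal population with mean 70 and SD 10 and samples of 100 people; those numbers are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(4)

mu, sigma, n = 70, 10, 100
se = sigma / math.sqrt(n)   # true standard error of the mean
n_studies = 5_000           # hypothetical repetitions of the study

within_two_se = 0
for _ in range(n_studies):
    sample_mean = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    if abs(sample_mean - mu) <= 2 * se:
        within_two_se += 1

proportion_within = within_two_se / n_studies
print("proportion of sample means within 2 SE of mu:", proportion_within)
```

The proportion should land near 0.95, which is why a range of two SE either side of a sample mean is treated as a 95% interval.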
what is the difference between the histogram of your sampling data and your sampling distribution
the histogram of your sampling data is not the sampling distribution. It is the distribution of individual scores in your one sample. They are two completely different things.
the sampling distribution is theoretical, your histogram is concrete.
the sampling distribution is the formal way of describing the uncertainty that sampling introduces. The sampling distribution describes the space of ways your data could have turned out
what is the problem with just reporting a p value
when a report gives only the p value, you may know that an effect probably exists, but you don’t know if it’s tiny or enormous, how precisely it has been estimated, or what would happen if the study was repeated.
what did Cumming mean by ‘new statistics’
By new statistics, Cumming meant put estimation first. Let the CI tell the story. Report effect sizes always and be sceptical of studies that report only a p-value.
He called for a shift from null hypothesis significance testing (NHST), which has been the primary mode of inference, to estimation, which includes reporting effect sizes and confidence intervals every time.
what is the meta-analytic mindset
think of every study as one data point in a larger ongoing literature. A single study rarely tells you anything, but a combination of studies through a meta-analytic study helps tell you more. This is why it is important to report estimation.
what does randomness mean
randomness produces variability and variability looks like patterns even when nothing is actually happening
when we say something is random, it means we cannot predict the outcome of any single trial
randomness is structured and produces variability, which can look like there is something happening even when nothing is. It is a pattern created by chance
what is probability defined as in psychology
probability in psychology and statistics is defined as the long run frequency.
Probability is not a property of a single event, it is a property of what happens when you repeat an event very many times. It is the proportion of times an event occurs across many, many repetitions.
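A short sketch of long-run frequency, assuming a simulated fair coin (the function name and flip counts are illustrative choices). The proportion of heads is erratic over a few flips and settles near 0.5 over many.

```python
import random

random.seed(2)

def proportion_heads(n_flips):
    """Flip a simulated fair coin n_flips times; return the share of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 100, 10_000, 100_000):
    print(n, "flips → proportion of heads:", proportion_heads(n))
```

No single flip is predictable, but the proportion across many repetitions is, which is the sense in which probability belongs to the process rather than to one trial.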
what is sampling variability
sampling variability (or sampling error) is what we call the differences you see in data when nothing is actually changing, just because you took a different sample.
It is the extent to which a sample statistic differs from sample to sample, due to the random process of selecting participants.
It is not a failure of the methodology, it is not a sign that something went wrong, it is the inevitable consequence of sampling from a population that contains variation.
why are striking results in a small study less trustworthy
Because sampling variability is larger in small samples. Larger samples have less sampling variability → so results will be closer to the true effect.
striking results bounce around more than modest results. We have to look at multiple replications of the study or different samples etc. to properly find the true effect and account for sampling variability.
one striking result is not the truth, it is one draw from a distribution of possible results that could have occurred.
what is the null world and what question does statistical testing aim to answer
the null world is an imaginary world where nothing is happening → there is no treatment effect, no group differences, no relationship between variables. In this world, any effects you observe are purely the result of sampling variability, they are noise.
when flipping a coin, in this null world, the coin is completely unbiased and fair. You would occasionally see patterns that look like the coin is biased (ie. 9 heads and 1 tails), not often but sometimes.
the question that statistical testing answers in regards to this is; given the outcome I observed how surprising would this be in the null world?
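For the coin example, the "how surprising in the null world" question can be answered exactly with the binomial formula. This sketch assumes 10 independent flips of a fair coin; the function name is a label chosen here.

```python
import math

def prob_at_least(k, n, p=0.5):
    """Probability of k or more heads in n independent flips with P(heads) = p."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )

# In the null world (fair coin), how often would 10 flips give 9 or more heads?
p_nine_or_more = prob_at_least(9, 10)
print("P(9 or more heads in 10 flips | fair coin) =", p_nine_or_more)
```

That works out to 11 of the 1,024 equally likely sequences, so roughly 1 run in 100: rare in the null world, but not impossible.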
what are some of the ways human intuition fail to see randomness as randomness
gambler’s fallacy
hot hand fallacy
gambler’s fallacy
the erroneous belief that past random outcomes affect the probability of future independent events
(ie. the belief that after having so many heads, tails is due to be the next flip/more likely)
hot hand fallacy
the erroneous belief that a run of recent successes makes future success more likely, when the underlying events are actually independent
(ie. if a basketball player makes five shots in a row we believe they are more likely to make the next shot. However, this is an illusion: they are experiencing a lucky streak)
what is the consequence of these two fallacies
has a direct consequence for how we read results → when we see a striking result, we think it is evidence of a real phenomenon. Sometimes it is, sometimes it is a hot hand, and sometimes it is the natural variability of sampling showing us an extreme value.
the job of statistics is to formalise this intuition, to give us a language for saying, in the null world, how often would I see an outcome this extreme just by chance.
what is one of the biggest randomness psychological problems
imagine a larger hospital that has 45 births/day and a smaller one with 15 births/day. Then, ask yourself, in a given year, which hospital will have more days on which more than 60% of the babies born are boys? The answer is the small hospital.
what is the explanation for this problem’s answer
this is because small samples are more variable than large samples. When you have a small sample, the proportion of boys you observe bounces around more. The variation is more extreme
In a large sample the proportions are more stable, you get closer to 50% most of the time, the variation is damped down
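The hospital answer can be checked with an exact binomial calculation. This sketch assumes each birth is independently a boy with probability 0.5 and that each hospital has exactly 15 or 45 births every day, which simplifies the original problem.

```python
import math

def prob_more_than_60_percent_boys(births_per_day, p_boy=0.5):
    """Probability that strictly more than 60% of a day's births are boys."""
    # Smallest whole number of boys that exceeds 60% (integer arithmetic).
    threshold = (3 * births_per_day) // 5 + 1
    return sum(
        math.comb(births_per_day, k)
        * p_boy**k
        * (1 - p_boy) ** (births_per_day - k)
        for k in range(threshold, births_per_day + 1)
    )

p_small = prob_more_than_60_percent_boys(15)   # needs 10 or more boys
p_large = prob_more_than_60_percent_boys(45)   # needs 28 or more boys

print("small hospital: expected days per year ≈", round(365 * p_small))
print("large hospital: expected days per year ≈", round(365 * p_large))
```

Under these assumptions the small hospital crosses the 60% line on far more days, purely because proportions from 15 births bounce around more than proportions from 45.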
what is this problem called, and how is it explained
the law of small numbers → the tendency to expect small samples to be as stable and representative as large samples, to treat a few observations as if they reliably reveal the underlying truth. People treat small samples as if they were large samples
what is the replication crisis
the finding that many published results in psychology and related sciences cannot be reproduced by independent researchers, reflecting false positives, inflated effects or context-specific findings rather than robust, general truths.
This is often because older studies used smaller samples and were underpowered, resulting in extreme findings that were often just noise due to sampling variability, whereas newer studies use much larger samples.
what are the three questions you must ask when seeing a striking result
how large is the sample → if small (<50) be skeptical
has this been replicated → if it is the only study and small, be skeptical.
how large is the effect → if it is very big but a small sample, be skeptical
a finding that you should be excited about should be one where the effect is small to moderate but observed in a large sample
what four errors would a scientist who believes in the law of small numbers make
gambles his research hypotheses on small samples, overestimating his power
has undue confidence in early trends (ie. the data of the first few subjects) and in the stability of observed patterns → overestimates significance
has unreasonably high expectations about the replicability of significant results
rarely attributes a deviation of results from expectations to sampling variability
what best describes the fundamental goal of most psychological research
To make claims about populations by drawing inferences from samples — because populations of interest are rarely directly observable
A researcher measures anxiety scores in a sample of 80 university students and finds a mean of 54. What is the relationship between this result and the population mean?
The sample mean of 54 (x̄) is an estimate of the unknown population mean (μ) — the true average anxiety in the broader population the study is intended to speak to
what is the definition of sampling variability
The fact that sample statistics — such as the mean — differ from one sample to the next, even when the underlying population and study method remain the same
using the dartboard analogy, what would an estimate that is precise but not accurate look like?
Darts that cluster tightly together in one area of the board, but that area is nowhere near the bullseye
a confidence interval is described as a way of communicating what property of a sample estimate?
The range of plausible values for the true population parameter, given the data — communicating both the estimate and the uncertainty surrounding it
A researcher argues: "I used a validated scale, carefully trained all my research assistants, and ran every participant in a controlled environment. Therefore my sample mean reflects the true population mean." What is wrong with this reasoning?
Precision of measurement is not the same as precision of estimation — no matter how carefully data are collected, the sample mean still carries uncertainty because samples vary from one another
what is the correct definition of a sampling distribution?
The distribution of a sample statistic — such as the mean — across all possible samples of a given size drawn from the same population
The Central Limit Theorem is described as what?
For sufficiently large samples, the sampling distribution of the mean will be approximately normal, regardless of the shape of the underlying population distribution
what does the standard error (SE) of the mean measure?
How much the sample mean would vary across repeated samples of the same size — the standard deviation of the sampling distribution
A sample of 400 participants has a standard deviation of 16. The standard error of the mean is therefore 0.8. What does the SD and SE tell you?
The SD of 16 describes how much individual scores vary around the sample mean; the SE of 0.8 describes how precisely the sample mean estimates the true population mean
to halve the standard error of the mean, a researcher needs to do what to the sample size?
Quadruple it — because SE = SD / √N, halving SE requires doubling √N, which means multiplying N by four
A researcher reports: "We found a mean difference of 4.2 points, 95% CI [2.1, 6.3]." What key information does the confidence interval add that the p-value alone does not provide?
The direction, magnitude, and precision of the effect — the CI shows how large the effect is and how tightly it has been estimated
a 95% confidence interval is constructed by taking the sample mean and adding and subtracting approximately two standard errors. What is the statistical justification for this?
In a normal distribution, approximately 95% of values fall within two standard deviations of the mean — so because the sampling distribution is approximately normal with a standard deviation equal to the SE, roughly 95% of possible sample means fall within two SEs of the true population mean
what is the central argument to Geoff Cumming’s 2014 paper?
Effect sizes and confidence intervals should be the centrepiece of every result — reported as standard rather than optional — because they communicate direction, magnitude, and precision in ways that a p-value alone cannot
what is the most common misconception about sampling distributions
Confusing the histogram of raw scores from a single sample with the sampling distribution — the histogram shows what one sample's scores looked like; the sampling distribution is the theoretical distribution of means across many hypothetical repetitions
probability is defined as a long-run frequency. What does this mean?
Probability is not a property of any single event but describes the proportion of times an outcome occurs across a very large number of repetitions — it is a property of a process, not of one trial
what describes null world?
An imagined world in which nothing is happening — there is no true effect, no group difference, and no relationship — so any observed pattern is purely the result of sampling variability
how does the hot hand fallacy differ from the gambler's fallacy?
Whereas the gambler's fallacy incorrectly expects a streak to reverse, the hot hand fallacy incorrectly treats a streak as evidence of an ongoing real change in probability — such as believing a basketball player on a scoring run is genuinely more likely to score next
sampling variability is described as "sampling error" but it is cautioned that the word "error" should not be read as "mistake." Why is this so
Sampling error is the technical term for the inevitable deviation of a sample statistic from the population parameter that arises because any sample is a finite, random selection — it is not caused by poor methodology
a mindfulness researcher runs the same study four times with different participants, getting treatment effects of 7, 6, 8, and 6 points respectively. The true effect is approximately 7 points. What does this variation across studies illustrate?
Sampling variability — the inevitable fluctuation of sample statistics around the true population value when different participants are drawn from the same population
According to Tversky & Kahneman (1971), what is the central claim of the "law of small numbers" as they use the term?
People — including trained researchers — expect small samples to be as representative of the population as large samples, systematically underestimating how much more variable small-sample results are
a large hospital delivers about 45 babies per day; a small hospital delivers about 15. Which will record more days per year on which more than 60% of births are boys? What is the correct answer, and why?
The small hospital — because proportions vary more in small samples, making extreme outcomes (like 10+ boys out of 15) much more common than extreme outcomes in larger samples (like 28+ boys out of 45)
small samples produce more extreme results than large samples. What is the mathematical explanation?
The standard error is SD divided by the square root of N — with small N, the denominator is small, so the standard error is large, meaning sample means bounce widely around the true population value
a researcher with 20 participants finds a 20-point anxiety reduction after a CBT intervention, while a replication with 200 participants finds only a 5-point reduction. Which study is more informative about the true effect, and why?
The larger replication — because with 200 participants the standard error is much smaller, so the sample mean is likely to be close to the true effect; the 20-person study could easily produce extreme values just by chance
there is a connection between small-sample research and the replication crisis. Explain it.
Many published findings came from studies that were underpowered — small samples make extreme results common, so striking findings that appeared real simply reflected sampling variability and failed to hold in larger, more reliable replications
when you encounter a striking result in a published study, what are the three questions to ask to help evaluate how much to trust it.
How large is the sample? Has the finding been independently replicated? How large is the reported effect size?
a study of N=8 finds a doubling of memory performance; a replication of N=100 finds a 10–15% improvement. Which result gives more confidence in the reality of the effect, and what does the original study's striking result most likely represent?
The replication with N=100 — the 10–15% improvement in a large sample reliably estimates the true effect; the original doubling in 8 people likely reflected a noise-amplified outlier result rather than the true effect size
what is the detective analogy to explain the logic of hypothesis testing
You assume the null hypothesis is true — that nothing is happening — and ask how probable your observed data would be under that assumption; if the data are very improbable under the null, you reject it
The null hypothesis (H₀) is always a claim about what?
The population — it asserts that the true effect size is zero, meaning no difference, no relationship, and no effect exists in the world the study is sampling from
what is the p-value
The probability of observing data as extreme as or more extreme than the obtained result, given that the null hypothesis is true
alpha (α) is a decision threshold. If α = .05 and the null hypothesis is actually true, what does this threshold mean?
You will incorrectly reject the null hypothesis approximately 5% of the time across many studies — this is a Type I error, or false positive
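A rough simulation of the 5% false positive rate. It assumes a null world where scores come from a normal population with mean 0 and a known SD of 1, and it uses a two-sided z-test with the conventional 1.96 cutoff; the sample size and number of studies are arbitrary.

```python
import math
import random
import statistics

random.seed(5)

n = 30                 # participants per study
n_studies = 10_000     # studies run in the null world
critical_z = 1.96      # two-sided cutoff for alpha = .05
se = 1 / math.sqrt(n)  # known population SD of 1

false_positives = 0
for _ in range(n_studies):
    # The null is true: every score is drawn from a population with mean 0.
    sample_mean = statistics.fmean(random.gauss(0, 1) for _ in range(n))
    if abs(sample_mean / se) > critical_z:
        false_positives += 1   # "significant" result despite no real effect

false_positive_rate = false_positives / n_studies
print("proportion of null-world studies rejecting H0:", false_positive_rate)
```

The rate should hover around 0.05, so alpha is best read as the long-run share of true nulls that get rejected, not as a statement about any one study.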
A researcher obtains p = 0.03 and announces: "There is only a 3% chance this result occurred by chance, so there is a 97% probability our hypothesis is correct." What error is being made?
The researcher is committing the inverse probability fallacy — the p-value gives the probability of the data given the null, not the probability that the null or the hypothesis is true given the data