Variability, Normal Distributions, Hypothesis Testing, Power, Confidence Intervals
Reasons for Variability
Differences in data, differences in procedures, your own mistakes
Descriptive Statistics
Used to summarize and describe sets of data. Ex. A GPA summarizes grades
Inferential Statistics
Used to make predictions or inferences about a population based on a sample of data. Ex. hypothesis testing and confidence intervals
Sample
A subset of a population
Random Samples
Samples are selected randomly from a population to minimize bias
Random Assignment
Randomly assigning the people you selected to groups to minimize confounding variables
Population
A group that we want to learn more about that shares a common characteristic
Category Scale
Observations are assigned different categories with no numerical value Ex. Careers
Ordinal Scale
Orders observations by rank but does not tell us the distance between points. Ex. Class Rank
Equal Interval Scale
A measurement scale where the distance between any two adjacent points is the same. Ex. Temperature in Celsius, a clock
Absolute Zero Scale
A scale that indicates the complete absence of a quantity, where zero represents a true zero point. Ex. Kelvin temperature scale.
Nominal scale
qualitative scale, a form of categorical scale
Interval Scale
Spacing between points is known and equal, but the zero point is arbitrary (zero does not mean "none")
Ratio Scale
A quantitative scale that has a true zero point. Ex. height, weight, and temperature in Kelvin.
Characteristics of a positively skewed distribution
In a positively skewed distribution, most data points cluster to the left with a long tail extending to the right. Typically the mean is greater than the median, which is greater than the mode.
Characteristics of a negatively skewed distribution
In a negatively skewed distribution, most data points cluster to the right with a long tail extending to the left. Typically the mean is less than the median, which is less than the mode.
Bimodal distribution
A probability distribution with two different modes or peaks.
Characteristics of a normal distribution
A probability distribution that is symmetric about the mean, unimodal, and asymptotic (the tails extend toward infinity and approach the horizontal axis without touching it). Mean = median = mode
range
largest value - smallest value
Standard deviation
A measure of the amount of variation or dispersion in a set of values, indicating how much individual data points differ from the mean. This is the square root of variance
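As a minimal sketch of the range and standard deviation definitions above, using made-up scores (the data and variable names are illustrative assumptions) and the population formula that divides by n:

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

n = len(scores)
mean = sum(scores) / n
data_range = max(scores) - min(scores)  # largest value - smallest value

# Variance: the average squared deviation from the mean
variance = sum((x - mean) ** 2 for x in scores) / n
sd = math.sqrt(variance)  # standard deviation is the square root of variance

print(data_range, variance, sd)  # 7 4.0 2.0
```

Note that the range uses only two of the eight scores, while the standard deviation uses all of them.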
Notation that indicates a population value
Greek letters (e.g., μ for the mean, σ for the standard deviation)
Central tendency
mean, median, mode
mean
Average
median
The middle score when the data are put in order
mode
Most frequently occurring data point
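A short illustration of the three measures of central tendency, assuming a small made-up data set with one extreme high value (a positive skew). The data and variable names are hypothetical:

```python
import statistics

# Hypothetical, positively skewed data: one large value pulls the mean up
incomes = [20, 20, 30, 40, 190]

mean = statistics.mean(incomes)      # sum of scores / number of scores
median = statistics.median(incomes)  # middle value of the sorted data
mode = statistics.mode(incomes)      # most frequently occurring value

print(mean, median, mode)  # mean = 60, median = 30, mode = 20
```

Here the mean is larger than the median, which is larger than the mode, and the median sits closer to where most of the scores actually are.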
68% of all data points in a normal distribution are…
within 1 standard deviation of the mean
95% of all data points in a normal distribution…
are within 2 standard deviations of the mean
99.7% of all data points in a normal distribution…
are within 3 standard deviations of the mean
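The 68 / 95 / 99.7 figures can be checked with a rough sketch like the one below, which assumes Python's standard-library `NormalDist` as a stand-in for a z table:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Proportion of a normal distribution within k standard deviations of the mean
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} SD: {p:.4f}")
```

The values are rounded in the rule of thumb: 2 standard deviations actually covers about 95.45%, and exactly 95% corresponds to 1.96 standard deviations.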
z-score
How your score compares to the others in a distribution, expressed as the number of standard deviations it falls above or below the mean (this is a standard score)
How to compute a z-score
(raw score - mean)/ standard deviation
mean of a set of z scores
0
standard deviation of a set of z-scores
1
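A small sketch of the z-score formula applied to a whole set of scores, assuming made-up data and treating the set as the full population (so the standard deviation divides by n):

```python
import statistics

raw = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores

mu = statistics.fmean(raw)      # mean of the raw scores
sigma = statistics.pstdev(raw)  # population standard deviation

# z = (raw score - mean) / standard deviation
z_scores = [(x - mu) / sigma for x in raw]

print(z_scores)
print(statistics.fmean(z_scores), statistics.pstdev(z_scores))
```

The converted scores have a mean of 0 and a standard deviation of 1, and each z-score keeps the same relative position its raw score had.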
shape of a set of z-scores
the same shape as the original distribution of raw scores (normal only if the raw scores are normal)
Percentile
The percentage of scores at or below a given score. Corresponds to a z-score in a normal distribution. Ex. 85th percentile means 85% of scores are at or below yours
The same person can have different z-scores based on the sample they are a part of
True
To calculate frequency between two z scores
Convert the raw scores to z-scores and look up each one's cumulative proportion in the z table. Subtract the smaller proportion from the larger one
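The steps above can be sketched as follows, assuming a hypothetical normally distributed test with mean 100 and standard deviation 15, and using `NormalDist().cdf` in place of a printed z table:

```python
from statistics import NormalDist

mean, sd = 100, 15   # hypothetical population values
low, high = 85, 115  # raw scores that bound the region of interest

# Step 1: convert raw scores to z-scores
z_low = (low - mean) / sd
z_high = (high - mean) / sd

# Step 2: look up the cumulative proportion below each z-score
std_normal = NormalDist()
p_low = std_normal.cdf(z_low)
p_high = std_normal.cdf(z_high)

# Step 3: subtract to get the proportion between the two scores
proportion_between = p_high - p_low
print(z_low, z_high, round(proportion_between, 4))
```

Because 85 and 115 are exactly 1 standard deviation either side of the mean, the result is the familiar figure of roughly 68%.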
Probability in terms of statistics
Using frequency to predict what will happen in the future
Why would you do random sampling with replacement
so each observation has an equal chance of being picked every time and the probability doesn’t change
What is a distribution of sample means
The distribution of the means of all possible samples of a given size that could be picked from a population. Sample means differ from the population mean because of sampling error
The distribution of sample means is not normal
false. It is approximately normal when the population is normal or the sample size is large (central limit theorem)
Standard deviation of sample means
standard error of the mean: standard deviation / square root of n. It tells us how well a sample mean estimates the population mean
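A minimal sketch of the standard error formula, assuming a hypothetical population standard deviation of 15. The function name `standard_error` is illustrative:

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means."""
    return sigma / math.sqrt(n)

sigma = 15  # hypothetical population standard deviation
print(standard_error(sigma, 25))   # 3.0
print(standard_error(sigma, 100))  # 1.5
```

Quadrupling the sample size halves the standard error, so larger samples give sample means that cluster more tightly around the population mean.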
what is the critical z score when the alpha level is .05 (two-tailed)
1.96
what is the critical z score when the alpha level is .01 (two-tailed)
2.576
what is the z score for a 90% confidence interval
1.645
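Critical z values like these normally come from a z table. As a rough check, they can be recomputed with the standard library's inverse normal CDF, assuming two-tailed tests (the helper name `critical_z` is an illustrative assumption):

```python
from statistics import NormalDist

def critical_z(alpha):
    """Two-tailed critical z: alpha is split evenly between the two tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# alpha = .10 corresponds to a 90% confidence interval
for alpha in (0.05, 0.01, 0.10):
    print(alpha, round(critical_z(alpha), 3))
```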
Reject the null
significant evidence to suggest the treatment affected the population mean
fail to reject the null
No significant evidence to suggest the treatment affected the population mean
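The reject / fail-to-reject decision can be sketched as a single-sample z test. This assumes a two-tailed test with a known population standard deviation; the numbers and the function name `z_test` are hypothetical:

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed single-sample z test; returns (z, reject_null)."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu) / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject the null only if z falls in either tail beyond the critical value
    return z, abs(z) > z_crit

z, reject = z_test(sample_mean=106, mu=100, sigma=15, n=25)
print(z, reject)  # 2.0 True
```

With the same numbers and alpha = .01, the critical value rises above 2.0 and the test fails to reject the null.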
Type 1 error
null is rejected when treatment really doesn’t have an effect (a false positive)
Type 2 error
Failing to reject the null when the treatment really does have an effect (a false negative)
A type 1 error is worse than a type 2
True
How to minimize type 1 error
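use a stricter (smaller) alpha level, e.g., .01 instead of .05. Increasing the sample size raises power but does not change the type 1 error rate, which is set by alpha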
power
probability of correctly rejecting a false null hypothesis (1 - probability of a type 2 error)
factors that affect power
level of significance, sample size, effect size
small, medium, large effect sizes
.2, .5, .8
What does a larger effect size mean
fewer people are required to detect the effect at the chosen level of significance
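To see how alpha, sample size, and effect size each move power, here is a sketch for a two-tailed single-sample z test. It assumes sigma is known and that effect size is expressed in standard deviation units; the function name `z_test_power` is illustrative:

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed single-sample z test.

    effect_size is (true mean - null mean) / sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true mean and the null mean, in standard errors
    shift = effect_size * math.sqrt(n)
    # Probability that the test statistic lands in either rejection region
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

print(round(z_test_power(0.5, 25), 3))              # medium effect, n = 25
print(round(z_test_power(0.5, 50), 3))              # larger sample
print(round(z_test_power(0.8, 25), 3))              # larger effect
print(round(z_test_power(0.5, 25, alpha=0.01), 3))  # stricter alpha
```

Power goes up with a larger sample or a larger effect and goes down when alpha is made stricter, which is why a larger effect needs fewer people.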
Confidence Interval
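A range of values within which you are confident, at a stated level (e.g., 95%), that the population mean falls
Factors that affect width of a confidence interval
Standard error (which depends on variability and sample size) and the confidence level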
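A minimal sketch of a confidence interval for a mean, assuming the population standard deviation is known so a z value can be used. The numbers and the function name `confidence_interval` are hypothetical:

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds: sample mean +/- critical z * standard error."""
    standard_error = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error
    return sample_mean - margin, sample_mean + margin

low95, high95 = confidence_interval(106, sigma=15, n=25, confidence=0.95)
low99, high99 = confidence_interval(106, sigma=15, n=25, confidence=0.99)
print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

Raising the confidence level widens the interval, and shrinking the standard error (for example with a larger n) narrows it.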
Can you do hypothesis testing if the distribution is skewed
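Not with a small sample, because the test assumes a normal distribution of sample means. With a large sample, the distribution of sample means is approximately normal even if the raw scores are skewed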
The range is not a particularly useful measure of variability
True because it ignores so much of the data
A positive z score always indicates that the raw score is located above the mean
True
Variability in the sampling distribution of the mean can be decreased by decreasing the sample size
False because sampling error can only be decreased by increasing the sample size
Rejecting a true null hypothesis
a type 1 error
The smaller the alpha level, the lower the risk of…
a type 1 error
The median provides a more representative measure of the central tendency than does the mean in a highly skewed distribution
True
It is impossible for a distribution to have more than one mode
false
The average difference between each score and the mean is a very useful measure of variability
False. Deviations from the mean always sum to zero, so their average is always zero
A sample standard deviation tends to be ____ the population standard deviation
less than
In a normal distribution, the probability of selecting a score more than 1 SD above the mean is about…
.16
In a skewed distribution, the probability of selecting a score above the mean is
impossible to determine
A one tailed test is better than a two tailed test
false
Failing to reject a false null
type 2 error
Rule of thumb in psych for minimum acceptable probability that a null hypothesis is false
95% confidence, i.e., alpha = .05 (two-tailed critical z score = ±1.96)
Probability is ____ in future tense
frequency
When do you use a t test
when the population's standard deviation is unknown and must be estimated from the sample
what is the degrees of freedom for a single sample t test
n-1
the smaller the size of the sample, the better s represents sigma, and the better the t distribution approximates a normal distribution
false
s is almost indistinguishable from sigma when N is over 60
false
there are many different t distributions
true
when sigma is not known, we must point estimate its value
true
what do you do if your t statistic falls within the rejection region
reject the null and conclude there was an effect
once you know s, you can compute the estimated standard error of the mean (sM) by
dividing s by the square root of n
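Putting the single-sample t test pieces together, here is a sketch with made-up scores and a made-up null-hypothesis mean. The critical value is hard-coded from a standard t table (two-tailed, alpha = .05, df = 7), since the Python standard library has no t distribution:

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
mu_0 = 3                            # hypothetical population mean under the null

n = len(scores)
sample_mean = statistics.fmean(scores)
s = statistics.stdev(scores)  # sample SD (divides by n - 1), our estimate of sigma
s_m = s / math.sqrt(n)        # estimated standard error of the mean
t = (sample_mean - mu_0) / s_m
df = n - 1

t_crit = 2.365  # from a t table: two-tailed, alpha = .05, df = 7
reject_null = abs(t) > t_crit
print(round(t, 3), df, reject_null)  # 2.646 7 True
```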
what is the main difference between a t distribution and a normal distribution
t distributions have broader (heavier) tails, especially when the degrees of freedom are small
if one calculated the effect size using a single sample t test and found an effect size of .5, what would that mean
the sample mean is half a standard deviation away from the population mean (a medium effect)
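As an illustration, effect size for a single-sample design can be computed as the mean difference divided by the standard deviation. The sketch below uses hypothetical numbers and the .2 / .5 / .8 benchmarks; the function names and the "negligible" label are assumptions for the example:

```python
def effect_size(sample_mean, mu, s):
    """Mean difference expressed in standard deviation units."""
    return (sample_mean - mu) / s

def describe(d):
    """Label an effect size using the .2 / .5 / .8 benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"

d = effect_size(sample_mean=105, mu=100, s=10)
print(d, describe(d))  # 0.5 medium
```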
what does a t distribution depend on
degrees of freedom
what should you avoid if you’re worried about carry over effects
dependent sampling
what is the combined variance of two groups in an independent t test called
pooled variance
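A sketch of pooled variance and the independent-samples t statistic built from it, assuming two small made-up groups. Each group's variance is weighted by its degrees of freedom:

```python
import math
import statistics

group_1 = [4, 5, 6, 7, 8]  # hypothetical scores
group_2 = [1, 2, 3, 4, 5]  # hypothetical scores

n1, n2 = len(group_1), len(group_2)
var_1 = statistics.variance(group_1)  # sample variance (divides by n - 1)
var_2 = statistics.variance(group_2)

# Pooled variance: df-weighted average of the two sample variances
df = (n1 - 1) + (n2 - 1)
pooled_variance = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df

# Estimated standard error of the difference between the two means
standard_error = math.sqrt(pooled_variance / n1 + pooled_variance / n2)
t = (statistics.fmean(group_1) - statistics.fmean(group_2)) / standard_error

print(pooled_variance, standard_error, t, df)  # 2.5 1.0 3.0 8
```

The standard error shrinks when the scores vary less or when the groups are larger.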
what is the degrees of freedom in a dependent t test
n-1 (n is number of pairs)
when computing both independent and dependent t tests, one must estimate the standard error
true
is a dependent t test always non directional
no
what is an advantage of dependent sampling vs independent sampling
it is more powerful
what does D stand for in a dependent t test
the difference between scores from the same subjects from time one and time two
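A sketch of the dependent (paired) t test built on difference scores, using made-up time-one and time-two scores for five subjects. The test is just a single-sample t test on D against a null mean difference of 0:

```python
import math
import statistics

time_1 = [10, 12, 9, 11, 13]   # hypothetical scores at time one
time_2 = [12, 14, 10, 14, 15]  # same subjects at time two

# D: difference between each subject's two scores
D = [after - before for before, after in zip(time_1, time_2)]

n = len(D)                     # number of pairs
mean_D = statistics.fmean(D)
s_D = statistics.stdev(D)      # sample SD of the difference scores
s_MD = s_D / math.sqrt(n)      # estimated standard error of the mean difference
t = mean_D / s_MD
df = n - 1

print(D, mean_D, round(t, 3), df)
```

Because each subject serves as their own comparison, stable individual differences drop out of D, which is where the extra power comes from.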
if you suspect individual differences in your sample will be large, use what test
dependent sample t
what are the main sources of carry over effects
practice, fatigue, transparency, subject mortality
what factors affect the estimated standard error in an independent t test
the variability of the scores and the size of the sample
ANOVA can only be used when comparing independent groups
false
a negative f ratio implies that differences were detected, but that they were in the opposite direction of what was predicted
false. An f ratio is a ratio of two variances, so it can never be negative
what does a significant f ratio tell us in ANOVA
at least one group mean differs significantly from at least one other, but it does not tell us which ones
what is the denominator of the f ratio
MSwithin
SStotal = ?
SSbetween + SSwithin
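The sum-of-squares partition and the f ratio can be sketched for a one-way, independent-groups ANOVA. The three groups below are made-up data and the variable names are illustrative:

```python
import statistics

groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical scores for three groups

all_scores = [x for g in groups for x in g]
grand_mean = statistics.fmean(all_scores)

# Between-groups: how far each group mean is from the grand mean
ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
# Within-groups: how far each score is from its own group mean
ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
# Total: how far each score is from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within  # denominator of the f ratio
f_ratio = ms_between / ms_within

print(ss_between, ss_within, ss_total, f_ratio)  # 54.0 6.0 60.0 27.0
```

Both mean squares are built from squared deviations, so the ratio cannot come out negative.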
the shape of the f distribution depends on the degrees of freedom
true
when do you use a protected t test
after you find a significant f ratio