Population
A large group of elements about which we wish to make an inference
Parameter
A number that summarizes a characteristic of the population
Sample
A subset of the population selected for analysis
Statistic
A number that summarizes some characteristic of the sample
Variable
A dimension on which quantifiable change exists
Ex: Biological sex, mood, race, and income
Nominal Scale
Levels are descriptive nonnumeric categories without an ordered relationship
Ex: biological sex and race
Ordinal Scale
Levels are numeric quantities with a specific order but no clear uniform spacing between them
Interval Scale
Levels are numeric quantities with a specific order and equal spacing
Ratio Scale
An interval scale with a true 0 point
Discrete Variable
Has a finite or countably infinite number of values
Ex: Number of symptoms and number of smiles per week
Continuous Variable
Infinite number of possible values between each value
Frequency Table
Displays how frequently each level of a variable appeared in your sample
Ex: How many males there were, how many 85's there were on a test
Cumulative Frequency
The sum of the frequencies for that value or lower
Percentile Rank
The sum of the percentages of that value and lower
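The three cards above can be sketched in a few lines. This is only an illustration: the `scores` list is made-up data, and percentile rank follows the card's definition (the percentage of scores at that value or lower).

```python
from collections import Counter

scores = [70, 85, 85, 90, 70, 85, 100, 90]  # hypothetical test scores

# Frequency table: how often each value appears in the sample.
freq = Counter(scores)

n = len(scores)
cumulative = 0
table = []  # rows of (value, frequency, cumulative frequency, percentile rank)
for value in sorted(freq):
    cumulative += freq[value]               # frequencies for this value or lower
    percentile_rank = 100 * cumulative / n  # percentage at this value or lower
    table.append((value, freq[value], cumulative, percentile_rank))

for row in table:
    print(row)
```

The last row always has a cumulative frequency equal to the sample size and a percentile rank of 100.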
Grouped Frequency Table
Like a frequency table, but for intervals of values: it shows how often scores fell within each interval
Bar Graph
A frequency distribution in which the levels of a variable are plotted along the horizontal axis as bars with spaces between them; usually used for nominal scales
Histogram
Values or intervals of a numeric variable (usually continuous) are plotted along the horizontal axis; basically a bar graph without spaces between the bars
Xᵢ
An individual score of variable "X"
µᵪ
Population mean of variable "X"
X̅
Sample mean of variable "X"
∑
Sum of values across a set of values
Npop
Population Size
N
Sample Size
Central Tendency
A representative, or typical value. A single number that represents the general location of a set of scores.
Mean
Indicator of Central Tendency, the average
Median
Indicator of Central Tendency, the middle value
Mode
Indicator of Central Tendency, the most commonly occurring value
Formula for population Mean
μᵪ = ∑Xᵢ / Npop
Formula for sample Mean
X̅ = ∑Xᵢ / N
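A minimal sketch of the three indicators of central tendency, assuming a small made-up sample (`scores` is hypothetical). The mean is computed directly from the sample formula; the median and mode come from Python's standard `statistics` module.

```python
import statistics

scores = [2, 3, 3, 5, 7]  # hypothetical sample

n = len(scores)                      # N, the sample size
sample_mean = sum(scores) / n        # sample mean: sum of the scores divided by N
median = statistics.median(scores)   # the middle value
mode = statistics.mode(scores)       # the most commonly occurring value

print(sample_mean, median, mode)
```

In this sample the mean (4.0) sits above the median (3) because of the single high score of 7.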
Positive Skewness
A small number of extreme highs pull the mean above the median
Negative Skewness
A small number of extreme lows pulls the mean below the median
Variability
How spread out the scores are around the mean
Range
Indicator of variability, difference between the highest and lowest values
Variance
The average squared deviation from the mean
σ²ᵪ
Population variance
S²ᵪ
Sample Variance
Standard Deviation
The square root of the variance; roughly the typical distance between a score and the mean
σᵪ
Population Standard Deviation
Sᵪ
Sample Standard Deviation
Deviation score
Xᵢ - μᵪ
Squared Deviation score
(Xᵢ - μᵪ )²
Sum of the Squared Deviations
∑(Xᵢ - μᵪ )²
Variance Formula
σ²ᵪ = ∑(Xᵢ - μᵪ)²/Npop
Standard Deviation Formula
σᵪ = √(∑(Xᵢ - μᵪ)²/Npop)
Sample Based Estimate of Population Variance
S²ᵪ = ∑(Xᵢ - X̅)²/(N - 1)
Sample Based Estimate of Population Standard Deviation
Sᵪ = √(∑(Xᵢ - X̅)²/(N - 1))
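The variance and standard deviation formulas can be worked through step by step. This is a sketch on made-up data (`scores` is hypothetical); it treats the same numbers first as a whole population (divide by the number of scores) and then as a sample (divide by N - 1).

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

n = len(scores)
mean = sum(scores) / n

# Sum of the squared deviations from the mean.
sum_sq = sum((x - mean) ** 2 for x in scores)

# Treating the scores as the whole population: divide by the number of scores.
pop_variance = sum_sq / n
pop_sd = math.sqrt(pop_variance)

# Treating the scores as a sample: divide by N - 1 to estimate the population values.
sample_variance = sum_sq / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(pop_variance, pop_sd, sample_variance, sample_sd)
```

Dividing by N - 1 always gives a slightly larger value than dividing by N, and the difference shrinks as the sample grows.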
Frequency Distribution
The pattern of frequencies across the various levels of a variable
Unimodal Distribution
One value, or one interval of values, is higher than the others
Bimodal Distribution
Two values, or intervals of values, are higher than the others and are at similar heights
Symmetrical Distribution
Same number of scores above and below the mean, and the right and left sides are mirror images
Skewed Distribution
A distribution that isn't symmetrical: most scores are on one side, and the other side contains a small number of extreme scores
Normal Distribution
Unimodal, symmetrical, and mesokurtic (bell-shaped)
Z Score (Standard Score)
A transformation of a numeric variable onto a distribution that has a mean of 0 and an SD of 1. The unit of measurement is the SD; positive scores are above the mean and negative scores are below it
Population Z-Score Formula
zXᵢ = (Xᵢ - μᵪ)/σᵪ
Sample Z-Score Formula
zXᵢ = (Xᵢ - X̅)/Sᵪ
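A short sketch of the z-score transformation, assuming the made-up `scores` below are the whole population (so the population mean and SD are used). After the transformation, the z-scores themselves have a mean of 0 and an SD of 1.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical population of scores

mean = statistics.fmean(scores)   # population mean
sd = statistics.pstdev(scores)    # population standard deviation

# z = (score - mean) / SD for every score.
z_scores = [(x - mean) / sd for x in scores]

print(z_scores)
print(statistics.fmean(z_scores), statistics.pstdev(z_scores))
```

For sample data, the same code would use the sample mean and `statistics.stdev` (the N - 1 version) instead.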
Frequentist Interpretation of Probability
The expected relative frequency of a target outcome
Bayesian Interpretation of Probability
An estimate of the likelihood of a target outcome
Central Limit Theorem
If each observation in a set of observations is determined by a large number of random factors, the distribution will likely approach a normal distribution
Percentage of Observations that falls within 1 SD from the Mean in a Normal Curve
68.2%
Percentage of Observations that falls between 1 and 2 SDs from the Mean in a Normal Curve
27.2%
Percentage of Observations that falls outside of 2 SDs from the Mean in a Normal Curve
4.6%
In a normal curve where do 95% of scores fall?
Within 1.96 SDs of the mean (between -1.96 and +1.96 SDs)
Standard Normal Curve
A normal curve with a mean of zero and a SD of 1
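The normal-curve percentages on the cards above can be checked against the standard normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are just illustrative. The results come out close to the card figures (roughly 68.3%, 27.2%, and 4.6%, with a 95% cutoff near 1.96).

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal curve

# Proportion of observations within 1 SD of the mean.
within_1 = z.cdf(1) - z.cdf(-1)

# Proportion between 1 and 2 SDs from the mean (both tails combined).
between_1_2 = (z.cdf(2) - z.cdf(-2)) - within_1

# Proportion more than 2 SDs from the mean.
outside_2 = 1 - (z.cdf(2) - z.cdf(-2))

# z value that leaves 2.5% in the upper tail, so 95% falls within +/- this value.
cutoff_95 = z.inv_cdf(0.975)

print(within_1, between_1_2, outside_2, cutoff_95)
```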
Sampling Distribution of the Mean, or S.D.M., μx̅
The distribution of means from repeated samples of size N drawn from a population; its mean always equals the population mean
μx̅ = μx
The mean of the S.D.M. equals the population mean
σx̅
standard deviation of the sampling distribution of x̅, A.K.A. the Standard Error (SE)
σx̅ < σx
The variability of the distribution of sample means is less than the variability of the population on X (for N > 1)
Standard Error Formula
σx̅ = σx/√N
Population Distribution, Sample Distribution, and S.D.M.
1. Population: μᵪ = ∑Xᵢ / Npop -> σᵪ = √(∑(Xᵢ - μᵪ)²/Npop)
2. Sample: X̅ = ∑Xᵢ / N -> Sᵪ = √(∑(Xᵢ - X̅)²/(N - 1))
3. S.D.M.: μx̅ = μx -> σx̅ = σx/√N
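The two S.D.M. properties (its mean equals the population mean, and its SD equals σx/√N) can be checked exactly on a tiny example. The sketch below assumes a made-up four-score population and lists every possible sample of size 2 drawn with replacement, rather than simulating random samples.

```python
import itertools
import math
import statistics

population = [2, 4, 6, 8]  # hypothetical tiny population
n = 2                      # sample size

# Means of every possible ordered sample of size n, drawn with replacement.
sample_means = [
    statistics.fmean(sample)
    for sample in itertools.product(population, repeat=n)
]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population SD

mean_of_means = statistics.fmean(sample_means)  # mean of the S.D.M.
se_observed = statistics.pstdev(sample_means)   # SD of the S.D.M. (standard error)
se_formula = sigma / math.sqrt(n)               # standard error formula

print(mu, mean_of_means)
print(se_observed, se_formula, sigma)
```

The observed standard error matches the formula and is smaller than the population SD, as the cards state.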
Null Hypothesis
Statement that one's research hypothesis is not true; if the null is rejected, your research hypothesis is supported
P-value
The probability, if the null is true, of obtaining a sample statistic as extreme as or more extreme than the one obtained in one's sample
Z-Test
Procedure for testing the hypothesis that the target population from which you've drawn a sample differs from another population with a known mean and variability, assuming both populations have the same variance
Cutoff for rejecting the null
±1.96 (two-tailed test, α = .05)
Zstat formula
(X̅-μx̅)/σx̅
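A minimal sketch of a z-test, assuming made-up numbers (a sample of 36 with a mean of 105, compared with a population whose mean is 100 and SD is 15). The function name `z_test` and its parameters are illustrative, and the p-value is two-tailed.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z statistic, two-tailed p-value) for a one-sample z-test."""
    standard_error = pop_sd / math.sqrt(n)       # standard error formula
    z = (sample_mean - pop_mean) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))       # area in both tails beyond |z|
    return z, p

z_stat, p_value = z_test(sample_mean=105, pop_mean=100, pop_sd=15, n=36)

# Reject the null when the statistic falls beyond the +/-1.96 cutoff.
reject_null = abs(z_stat) > 1.96

print(z_stat, p_value, reject_null)
```

Here the standard error is 15/√36 = 2.5, so the z statistic is 2.0, which is just beyond the cutoff.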
T-test
Statistical test comparing means between two groups, used when the population standard deviation is unknown. To calculate it, we need the estimated standard error; as with the z-test, you test the null
Estimated standard error formula
SX̅ = Sx/√N
Degrees of freedom
N-1, represents the number of independent values in a dataset that are free to vary when estimating statistical parameters
Tstat formula
(X̅-μx̅)/SX̅
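The t statistic can be sketched the same way as the z statistic, with the estimated standard error in place of the known one. The data, the null mean of 5, and the function name `one_sample_t` are assumptions for illustration; the critical value 2.776 is the two-tailed α = .05 table value for 4 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, null_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    est_se = statistics.stdev(sample) / math.sqrt(n)  # estimated standard error
    t = (statistics.fmean(sample) - null_mean) / est_se
    return t, n - 1

scores = [4, 6, 8, 10, 12]  # hypothetical sample
t_stat, df = one_sample_t(scores, null_mean=5)

t_crit = 2.776  # table value for df = 4, two-tailed, alpha = .05
reject_null = abs(t_stat) > t_crit

print(t_stat, df, reject_null)
```

With these numbers the t statistic is about 2.12, which does not reach the cutoff, so the null is not rejected.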
DDM, Distribution of Difference Between Means
Distribution of differences between the means of pairs of samples such that, for each pair of samples, one is from one population and the other is from another population
Estimated Standard Error of the DDM
The estimated standard deviation of the sampling distribution for the difference between two independent sample means
Distribution of Difference Between Means Formula
t = (X̅1 - X̅2)/SX̅1-X̅2
Confidence Interval
A range of values, derived from sample data, that is likely to contain the value of an unknown population parameter
Confidence Interval Formula
(X̅1-X̅2) +/- (SX̅1-X̅2 * Tcrit)
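A sketch of the independent-samples t statistic and its confidence interval. The cards do not say how the estimated standard error of the DDM is computed, so this example assumes the common pooled-variance version (equal variances in both populations). The two groups are made-up data, and 2.306 is the two-tailed α = .05 table value for 8 degrees of freedom.

```python
import math
import statistics

group1 = [4, 6, 8, 10, 12]  # hypothetical sample from population 1
group2 = [1, 3, 5, 7, 9]    # hypothetical sample from population 2

n1, n2 = len(group1), len(group2)
mean_diff = statistics.fmean(group1) - statistics.fmean(group2)

# Assumption: pooled variance, weighting each sample variance by its df.
pooled_var = (
    (n1 - 1) * statistics.variance(group1)
    + (n2 - 1) * statistics.variance(group2)
) / (n1 + n2 - 2)

# Estimated standard error of the difference between means.
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = mean_diff / se_diff
df = n1 + n2 - 2

t_crit = 2.306  # table value for df = 8, two-tailed, alpha = .05
ci_low = mean_diff - se_diff * t_crit
ci_high = mean_diff + se_diff * t_crit

print(t_stat, df, (ci_low, ci_high))
```

The interval here runs from about -1.61 to 7.61. Because it contains 0, the t statistic (1.5) also falls short of the critical value.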
T-test for matched samples
A t-test on difference scores, used when each participant (or matched pair) contributes one score to each of the 2 samples
D
Difference score
T-test for matched samples formula
td̅ = (D̅-µD̅)/SD̅ = D̅/SD̅
SD̅
Estimated standard error of the mean difference score
SD̅ formula
SD̅ = SD/√N, where SD is the standard deviation of the difference scores
Confidence Interval for difference score
D̅ + / - (SD̅ * tcrit)
Margin of error
(XXX * tcrit)
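The matched-samples cards above can be tied together in one sketch: difference scores, their estimated standard error, the t statistic, the margin of error, and the confidence interval. The `before` and `after` scores are made up, and 2.776 is the two-tailed α = .05 table value for 4 degrees of freedom.

```python
import math
import statistics

before = [10, 12, 9, 14, 11]  # hypothetical first scores
after = [12, 15, 10, 17, 12]  # hypothetical second scores, same participants

# D: one difference score per participant.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_diff = statistics.fmean(diffs)              # mean difference score
se_diff = statistics.stdev(diffs) / math.sqrt(n) # its estimated standard error

# Under the null, the population mean difference is 0.
t_stat = mean_diff / se_diff
df = n - 1

t_crit = 2.776  # table value for df = 4, two-tailed, alpha = .05
margin_of_error = se_diff * t_crit
ci = (mean_diff - margin_of_error, mean_diff + margin_of_error)

print(diffs, t_stat, margin_of_error, ci)
```

The interval (about 0.76 to 3.24) excludes 0, which agrees with the t statistic (about 4.47) exceeding the critical value.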
Quasi-Independent Variable
A variable not controlled by the experimenter. Instead, participants already belong to the groups being studied. It is also called a subject variable, selected variable, or grouping variable. Ex: Gender
Deciles
Deciles divide a distribution into 10 equal parts. Each part represents 10% of the data. The 9th decile marks the point below which 90% of the scores fall, so the top 10% lies above the 9th decile.
Quartiles
Quartiles divide a distribution into 4 equal parts. Each part represents 25% of the data. The 2nd quartile is the 50th percentile, which is also the median.
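Quartile and decile cut points can be sketched with `statistics.quantiles` from Python's standard library. The data are made up, and the function estimates cut points by interpolation, so its values may differ slightly from a textbook's hand method on small data sets.

```python
import statistics

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical scores

# n=4 gives the 3 cut points that split the data into 4 parts (quartiles).
quartiles = statistics.quantiles(scores, n=4)

# n=10 gives the 9 cut points that split the data into 10 parts (deciles).
deciles = statistics.quantiles(scores, n=10)

print(quartiles)
print(deciles)
print(statistics.median(scores))
```

The 2nd quartile matches the median, as the Quartiles card states.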
Frequency Polygon
A graph similar to a histogram. Instead of bars, you plot points representing the frequency at each score or class interval and then connect the points with straight lines. It is useful for showing the overall shape of a distribution.
Stem-and-leaf display
A stem-and-leaf display organizes data in a way that combines features of a frequency distribution and a histogram. The stems represent the leading digits or class intervals, and the leaves represent the individual values within each stem. It is useful because it shows the shape of the distribution while still keeping the original data values visible.
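A small sketch of how a stem-and-leaf display can be built, assuming made-up two-digit scores where the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

scores = [23, 12, 38, 21, 15, 30, 27, 23, 34]  # hypothetical two-digit scores

# Group the ones digits (leaves) under their tens digit (stem).
stems = defaultdict(list)
for score in sorted(scores):
    stems[score // 10].append(score % 10)

# One text line per stem, with its leaves in ascending order.
lines = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]

for line in lines:
    print(line)
```

Each row's length shows the frequency for that stem, while the leaves keep the original values readable.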
Weighted Mean
A weighted mean, or weighted average, is a mean in which some values count more than others. Each score is multiplied by its weight, usually its frequency or importance. Example: GPA. Letter grades are assigned numerical values, and each grade is weighted by the number of credit hours or courses receiving that grade.
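The GPA example can be sketched as follows. The grade points and credit hours are made up; each grade value is multiplied by its credit hours, and the total is divided by the total credit hours.

```python
# Hypothetical transcript: (grade points, credit hours) per letter grade.
grades = [
    (4.0, 6),  # A, 6 credit hours
    (3.0, 3),  # B, 3 credit hours
    (2.0, 1),  # C, 1 credit hour
]

total_weight = sum(credits for _, credits in grades)

# Weighted mean: sum of (value * weight) divided by the sum of the weights.
gpa = sum(points * credits for points, credits in grades) / total_weight

# Unweighted mean of the three grade values, for comparison.
unweighted = sum(points for points, _ in grades) / len(grades)

print(gpa, unweighted)
```

The weighted mean (3.5) is higher than the unweighted mean (3.0) because most of the credit hours earned an A.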
Separate-Variance t Test
A separate-variance t test compares the means of two independent groups when the assumption of equal variances is not met. It is especially useful when the groups have different variances and/or different sample sizes. This test produces a more accurate p-value and confidence interval and helps avoid an inflated Type I error rate.
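The card above gives no formula, so the sketch below assumes the standard Welch form of the separate-variance t test: each group's variance is divided by its own sample size, and the degrees of freedom are adjusted with the Welch-Satterthwaite approximation. The data and the function name `welch_t` are illustrative only.

```python
import math
import statistics

def welch_t(group1, group2):
    """Return (t statistic, approximate df) without pooling the variances."""
    n1, n2 = len(group1), len(group2)
    # Each group's variance divided by its own sample size.
    v1 = statistics.variance(group1) / n1
    v2 = statistics.variance(group2) / n2

    se_diff = math.sqrt(v1 + v2)
    t = (statistics.fmean(group1) - statistics.fmean(group2)) / se_diff

    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

group1 = [4, 6, 8, 10, 12]  # hypothetical group with a larger variance
group2 = [1, 2, 3]          # hypothetical smaller group with a smaller variance

t_stat, df = welch_t(group1, group2)
print(t_stat, df)
```

The adjusted degrees of freedom (about 5.16 here) fall below the pooled value of N1 + N2 - 2 = 6, which makes the test more conservative when variances differ.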
Matched-Pairs Design
A matched-pairs design is an experimental design in which participants are placed into pairs based on similar characteristics, such as age, gender, or pretest score. One member of each pair receives the treatment, and the other receives a different treatment or the control condition. This design helps reduce confounding variables, increase precision, and allow for smaller sample sizes.
Point estimate
A single numeric value that is your best estimate of the population parameter
Linear Correlation
Association between two variables in which the dots on a scatter diagram roughly follow a straight line