ADA Summary Test

68 Terms

1
New cards
Deviance/ Error
The distance of each score from the mean
2
New cards
Sum squared errors
The total amount of error in the mean (the errors/deviances are squared before adding them up)
3
New cards
Variance
The average squared distance of scores from the mean (the sum of squared errors divided by the number of scores, or by N - 1 for a sample). Tells us how widely dispersed scores are around the mean.
4
New cards
Standard Deviation
The square root of the variance
5
New cards
Z-score
The sign tells us whether the original score was above or below the mean; the value tells us how far the score was from the mean in SD units. (z-score = (score - mean of all scores) / standard deviation of all scores)
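The first five cards build on each other, so a small worked example may help. This is a minimal sketch with made-up scores; it uses the population form of the variance (dividing by the number of scores), which is an assumption about which form the course intends.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data

mean = sum(scores) / len(scores)

# Deviance/error: distance of each score from the mean
deviances = [x - mean for x in scores]

# Sum of squared errors: square the deviances, then add them up
sum_squared_errors = sum(d ** 2 for d in deviances)

# Variance: sum of squares divided by the number of scores
# (divide by len(scores) - 1 instead for a sample estimate)
variance = sum_squared_errors / len(scores)

# Standard deviation: square root of the variance
sd = math.sqrt(variance)

# z-score: (score - mean) / standard deviation
z_scores = [(x - mean) / sd for x in scores]

print(mean, sum_squared_errors, variance, sd)  # 5.0 32.0 4.0 2.0
print(z_scores[0], z_scores[-1])               # -1.5 2.0
```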
6
New cards
Probability theory
Uses language of sets
7
New cards
Sets
A collection of things/ elements
8
New cards
Universal Set
The set of all things that we could possibly consider in the context of what we are studying (e.g. S = {1, 2, 3, 4, 5, 6} for a die)
9
New cards
Function
A rule that takes an input from a specific set, called the domain, and produces an output from another set, called the co-domain
10
New cards
Sample Space
The set of all possible outcomes
11
New cards
Range
The set containing all the possible values of f(x) for a function f. Thus the range of a function is always a subset of its co-domain
12
New cards
Mutually exclusive events vs. independence
--> Mutually exclusive events can't happen at the same time
--> Mutually exclusive events with nonzero probabilities cannot be statistically independent, since knowing that one occurs gives information about the other (specifically, that it certainly does not occur)
--> If A and B are mutually exclusive events, they are statistically independent if and only if P(A)=0 or P(B)=0 (or both are zero)
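The difference between the two ideas can be checked on a fair die. This is a minimal sketch: the helper names prob and independent are made up for illustration, and independence is tested with the usual definition P(A and B) = P(A) * P(B).

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair six-sided die

def prob(event):
    """Probability of an event (a set of outcomes) on a fair die."""
    return Fraction(len(event & sample_space), len(sample_space))

def independent(a, b):
    """A and B are independent when P(A and B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
odd = {1, 3, 5}
low = {1, 2}
impossible = set()

# even and odd are mutually exclusive (empty intersection) ...
print(even & odd)                     # set()
# ... and therefore NOT independent, since both have nonzero probability
print(independent(even, odd))         # False
# even and low overlap and happen to be independent: 1/6 == 1/2 * 1/3
print(independent(even, low))         # True
# a mutually exclusive pair is independent only if one event has probability 0
print(independent(even, impossible))  # True
```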
13
New cards
The law of large numbers
The higher the number of trials, the closer the observed relative frequency gets to the true probability
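A quick simulation can illustrate this. The sketch below rolls a simulated fair die and tracks how often a six appears; the function name relative_frequency and the fixed seed are choices made for this example only.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Share of sixes in n_trials simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 6)
    return hits / n_trials

true_probability = 1 / 6

# With more trials the relative frequency settles near 1/6
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n), true_probability)
```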
14
New cards
Central limit theorem
The framework that lets us do statistical inference: as the sample size increases, the sampling distribution of the sample mean becomes more and more like a normal distribution
15
New cards
Sampling Distribution
The distribution of a statistic (such as the sample mean) across many samples. When you take the average of the sample averages, it will be close to your population mean; because the distribution is approximately normal, we can come up with p-values
16
New cards
Descriptive statistics
Summarize the characteristics of a data set (for example, the mean, median, and range)
17
New cards
Inferential statistics
Allows you to test a hypothesis or assess whether your data is generalizable to the broader population.

- takes data from a sample and makes inferences about the larger population from which the sample was drawn
- the goal of inferential statistics is to draw conclusions from a sample and generalize them to a population
- drawing conclusions from the sample happens when calculating p-values
18
New cards
Normal Distribution
About 99.7% of sample results are contained within 3 standard errors of the mean
About 95% within 2 standard errors
About 68% within 1 standard error
19
New cards
Standard error
= standard deviation of a sampling distribution
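The cards on the central limit theorem, the sampling distribution, and the standard error can be tied together with a simulation. This is a rough sketch under stated assumptions: the population is exponential with mean 1 and SD 1 (chosen because it is clearly not normal), and the sample size, number of samples, and seed are arbitrary.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
population_mean = 1.0     # exponential with rate 1 has mean 1 ...
population_sd = 1.0       # ... and standard deviation 1
n = 50                    # size of each sample
n_samples = 5_000         # number of samples drawn

# Sampling distribution of the mean: one mean per sample
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(n_samples)
]

# The average of the sample averages is close to the population mean
mean_of_means = statistics.fmean(sample_means)

# Standard error = standard deviation of the sampling distribution
standard_error = statistics.stdev(sample_means)
theoretical_se = population_sd / n ** 0.5

# Share of sample means within 2 standard errors of the population mean
share_within_2se = sum(
    abs(m - population_mean) <= 2 * theoretical_se for m in sample_means
) / n_samples

print(mean_of_means, standard_error, theoretical_se, share_within_2se)
```

The share within 2 standard errors should land near 95% even though the population itself is skewed.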
20
New cards
Null hypothesis
The default assumption of no effect or no difference. In hypothesis testing, we start by assuming the null hypothesis is true; it is rejected when the sample statistic falls in the rejection region(s).

When 95% and P
21
New cards
Confidence Level
The probability between the 2 rejection regions for a two-tailed test. If α = 0.10, then the confidence level is 1 - 0.10 = 0.90, or 90%
22
New cards
Confidence Interval
The range between a lower and an upper bound, set by the lower and upper critical values (an interval can be built for anything we measure)
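The link between α, the confidence level, and the interval bounds can be sketched with the standard normal distribution. The sample mean and standard error below are hypothetical numbers, and using z critical values (rather than t) is an assumption made to keep the example short.

```python
from statistics import NormalDist

alpha = 0.10
confidence_level = 1 - alpha  # 0.90, i.e. 90%

# Two-tailed test: alpha is split over both rejection regions
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.645

sample_mean = 50.0     # hypothetical sample statistic
standard_error = 2.0   # hypothetical standard error

# Confidence interval: statistic plus/minus critical value * standard error
lower = sample_mean - z_crit * standard_error
upper = sample_mean + z_crit * standard_error

print(round(z_crit, 3), round(lower, 2), round(upper, 2))
```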
23
New cards
Type 1
False positive (reject a true null hypothesis)
24
New cards
Type 2
False negative (fail to reject a false null hypothesis)
25
New cards
non parametric test (distribution free test)
Does not assume anything about the underlying distribution (for example, it does not assume that the data come from a normal distribution)
26
New cards
parametric test
Makes assumptions about a population's parameters (for example, the mean or standard deviation)
27
New cards
One tailed test
Used when you want to know whether something is specifically higher, or specifically lower (the hypothesis has one direction)
28
New cards
What test should be used with unequal variances?
Welch's ANOVA
29
New cards
ANOVA
Analysis of variance. It tells you whether there is a difference between at least 2 of the group means, not which groups differ from one another.
30
New cards
Total sum of squares (TSS or SST)
Tells you how much variation there is in the dependent variable; it is a measure of how a data set varies around a central number (like the mean)
31
New cards
Sum of squares
The sum of squared deviations from a central value. Just like variance, it measures spread; the main goal is to see whether the groups overlap
32
New cards
Between Sum of Squares (a.k.a. Explained/Model/Treatment) (SSB)
The explained sum of squares tells you how much of the variation in the dependent variable is explained by your model.
33
New cards
Residual (Error) Sum of Squares (within sum of squares) (SSE)
tells you how much of the dependent variable’s variation your model did not explain. It is the sum of the squared differences between the actual Y and the predicted Y (observed vs expected)
34
New cards
F-distribution (use)
We use an F-distribution when we are studying the ratio of the variances of two normally distributed populations
35
New cards
F-test
The further the group means are from the grand mean, the larger the between-group variance becomes. In our F-test, this corresponds to a higher variance in the numerator and therefore a larger F.
36
New cards
F-ratio
= MSB / MSE (mean square between groups divided by mean square error)
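The sums of squares and the F-ratio from the last few cards can be computed by hand for a tiny data set. This is a sketch with made-up numbers for three groups; the variable names follow the abbreviations on the cards (SST, SSB, SSE, MSB, MSE).

```python
groups = {                 # made-up data: three groups of three scores
    "a": [1, 2, 3],
    "b": [2, 3, 4],
    "c": [6, 7, 8],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Total sum of squares: every score against the grand mean
sst = sum((x - grand_mean) ** 2 for x in all_values)

# Between sum of squares: each group mean against the grand mean,
# weighted by the group size
ssb = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Error (within) sum of squares: every score against its own group mean
sse = sum(
    (x - sum(values) / len(values)) ** 2
    for values in groups.values()
    for x in values
)

df_between = len(groups) - 1              # k - 1
df_error = len(all_values) - len(groups)  # N - k

msb = ssb / df_between
mse = sse / df_error
f_ratio = msb / mse

print(sst, ssb, sse)        # 48.0 42.0 6.0
print(msb, mse, f_ratio)    # 21.0 1.0 21.0
```

Note that SSB + SSE equals SST, which is the decomposition the ANOVA relies on.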
37
New cards
What type of ANOVA? (1 grouping variable)
one-way ANOVA
38
New cards
What type of ANOVA? (Another grouping variable)
two-way ANOVA
39
New cards
What type of ANOVA? (factorial ANOVA)
three-way ANOVA (three grouping variables; any ANOVA with more than one grouping variable is a factorial ANOVA)
40
New cards
ANOVA Assumptions
1. Check Assumptions
a. Normality (Shapiro-Wilk)
b. Outliers (box plots)
2. Run one-way ANOVA with post-hoc
a. Tukey (equal variances) & Games-Howell (unequal variances)
b. Levene's test = do the distributions of the groups look almost the same, i.e. is there homogeneity of variance?
3. Run GLM to check partial eta squared
a. Estimates of effect size
4. Calculate omega-squared
5. Interpret the data
41
New cards
Eta square (Eta^2)
How much of the total variance in the observations the model explains (= SSbetween / SStotal)
42
New cards
Omega square
A less biased alternative measure of how well your model explains the results (especially when the sample size is small)
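The two effect sizes can be compared side by side. This is a minimal sketch: the eta-squared formula is the one on the card, while the omega-squared formula used here, (SSB - df_between * MSE) / (SST + MSE), is the common textbook form for a one-way ANOVA and is an assumption about what the course uses. The input numbers are hypothetical.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variance explained by the model."""
    return ss_between / ss_total

def omega_squared(ss_between, ss_total, df_between, ms_error):
    """Less biased effect size (common one-way ANOVA form)."""
    return (ss_between - df_between * ms_error) / (ss_total + ms_error)

# Hypothetical ANOVA results
ss_between, ss_total = 42.0, 48.0
df_between, ms_error = 2, 1.0

print(eta_squared(ss_between, ss_total))                           # 0.875
print(omega_squared(ss_between, ss_total, df_between, ms_error))   # about 0.816
```

Omega squared comes out slightly smaller than eta squared, which reflects its correction for bias.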
43
New cards
Factorial ANOVA
= we are examining how much of the variance in our data can be explained by our independent variables (more than 1)
= it looks at the main effects of the PVs and their interaction effect on the OV
= a factorial ANOVA with 2 PVs is a two-way ANOVA, etc.
44
New cards
When do we use a Factorial ANOVA?
a) OV (outcome variable) = quantitative
b) PV (predictor variable) = categorical
c) Independent groups (between-subject design)
d) Variance is homogeneous across groups (similar in shape)
e) Residuals (actual observation minus the average) are normally distributed
If you have an interaction effect, there is dependency: the effect of one PV depends on the level of another
45
New cards
Moderation
A way to check whether a third variable influences the strength or direction of the relationship between the independent and dependent variables.
46
New cards
Mediator
= mediates the relationship between the independent and dependent variables; explains why such a relationship exists
Is the influence of the mediator stronger than the direct influence of the independent variable? (Imagine grades lead to happiness and self-esteem is the mediator. With mediation we try to see whether the variable "self-esteem" completely explains the effect of the "grades" variable on happiness.)
47
New cards
Correlation
= measures the degree of a relationship between two variables (x and y)
= find the numerical value that shows the relationship between the two variables and how they move together
48
New cards
Regression
= regression analysis helps to determine the functional relationship between two variables (x and y), so that you can estimate the unknown variable and make projections about future events and goals

= can also estimate the values of a random variable (z) based on the values of your known (or fixed) variables (x and y)
= the regression line is considered to be the best-fitting line through the data points
49
New cards
Pearson 'r'
= measures the strength of the linear relationship between two quantitative variables
= is always a number between -1 and 1. r > 0 indicates a positive association; r < 0 indicates a negative association
50
New cards
R squared
Coefficient of determination = SSB (SSM) / SST. Tells you what proportion of the variability in the outcome is explained by the model; an indicator of how good your linear model is

= shows how well the data fit the regression model (the goodness of fit)
= the higher the better
51
New cards
Bootstrap Procedure (non-parametric)
1. Choose a number of bootstrap samples to perform
2. Choose a sample size
3. For each bootstrap sample:
a. Draw a sample with replacement with the chosen size
b. Calculate the statistic on the sample
4. Calculate the mean of the calculated sample statistics
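The procedure above can be sketched in a few lines. This is an illustrative version only: it uses the mean as the statistic, and the function name bootstrap_mean, its parameters, and the example data are all made up for this sketch.

```python
import random
import statistics

def bootstrap_mean(data, n_bootstrap=1000, sample_size=None, seed=0):
    """Bootstrap estimate of the mean, plus all resampled means."""
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    size = sample_size or len(data)  # default: same size as the data
    sample_stats = []
    for _ in range(n_bootstrap):
        # Draw a sample WITH replacement of the chosen size
        resample = rng.choices(data, k=size)
        # Calculate the statistic on the sample
        sample_stats.append(statistics.fmean(resample))
    # Calculate the mean of the calculated sample statistics
    return statistics.fmean(sample_stats), sample_stats

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean 5
estimate, sample_stats = bootstrap_mean(data)
print(estimate)
```

The spread of sample_stats can also be used as a rough standard error for the statistic, which is a frequent reason for bootstrapping in the first place.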
52
New cards
Simple linear regression
represented by: y = β0 + β1x + ε
β0 --> y-intercept
β1 --> slope
ε --> error term
E(y) = β0 + β1x --> the mean or expected value of y for a given value of x
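The intercept and slope can be estimated from data with ordinary least squares. The sketch below uses the standard closed-form estimates and made-up data; the function names fit_line and r_squared are illustrative, not from the course material.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx             # slope
    b0 = mean_y - b1 * mean_x  # y-intercept
    return b0, b1

def r_squared(xs, ys, b0, b1):
    """Proportion of the variability in y explained by the line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up data, roughly y = 2x

b0, b1 = fit_line(xs, ys)
print(b0, b1)                     # close to 0.05 and 1.99
print(r_squared(xs, ys, b0, b1))  # close to 1 for this near-linear data

# Expected value of y at x = 6, using E(y) = b0 + b1 * x
print(b0 + b1 * 6)
```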
53
New cards
Adjusted R-squared
A modified version of R-squared that has been adjusted for the number of predictors in the model. The adjusted R-squared increases when a new term improves the model more than would be expected by chance. It decreases when a predictor improves the model by less than expected. Typically, the adjusted R-squared is positive, although it can be negative. It is never higher than the R-squared.
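The adjustment can be written as a one-line formula. This sketch uses the common form 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors; the example values are hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical model: R-squared 0.5, 21 observations, 4 predictors
print(adjusted_r_squared(0.5, n=21, p=4))   # 0.375

# A weak model with several predictors can go negative
print(adjusted_r_squared(0.02, n=10, p=3))  # about -0.47
```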
54
New cards
Standardized coefficient beta
= standardizing brings everything to the same base (comparing apples with apples), so we can compare variables and see which has the bigger predictive effect (for example, 0.467 is a bigger effect than 0.146)
- Use it when we want to compare effect sizes across PVs
- Easier to compare
55
New cards
Unstandardized beta
= we want to figure out the exact predictive relation between the variables and our outcome
Example: for every one-unit increase in the math aptitude test score, we see a 0.116-point increase in the statistics exam result = the actual change in the outcome

- Use it if you want to interpret an individual PV's impact on the OV
- Easier to interpret
56
New cards
Orthogonalization
= refers to axes being at a right angle
= in moderation we need it to fix the distorting effect of multicollinearity (which increases the standard errors and decreases the t-statistic)
= in factor analysis we also make use of orthogonalization when we rotate the factors, because all the multidimensional axes have to be at a right angle to form the factor/component
57
New cards
Tolerance
= an indication of the percent of variance in the predictor that cannot be accounted for by the other predictors; very small values indicate that a predictor is redundant
58
New cards
Dummy variables
= in statistics and econometrics, particularly in regression analysis, a dummy variable is one that takes only the value 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome (main goal: converting categories into numbers)
= a dummy variable is dichotomous, but a dichotomous variable is not necessarily a dummy variable

In ANOVA, the dependent variable is continuous
- Independent variables can be dichotomous (dummy variables), but the dependent variable cannot
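Dummy coding can be sketched as follows. The helper dummy_code is hypothetical, and leaving out one reference category (so k categories give k - 1 dummies) is the usual regression convention rather than something stated on the card.

```python
def dummy_code(values, reference):
    """Turn a categorical variable into 0/1 dummy variables.

    The reference category gets no dummy of its own: it is the case
    where all dummies are 0.
    """
    levels = sorted(set(values) - {reference})
    return [
        {f"is_{level}": int(value == level) for level in levels}
        for value in values
    ]

colors = ["red", "blue", "green", "blue"]  # made-up categorical data
for row in dummy_code(colors, reference="blue"):
    print(row)
# {'is_green': 0, 'is_red': 1}
# {'is_green': 0, 'is_red': 0}
# {'is_green': 1, 'is_red': 0}
# {'is_green': 0, 'is_red': 0}
```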
59
New cards
Odds
= probability of success/ probability of failure
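As a quick numeric check, odds and probabilities convert into each other. The function names below are made up for this sketch.

```python
def odds(p):
    """Odds = probability of success / probability of failure."""
    return p / (1 - p)

def probability_from_odds(o):
    """Inverse conversion: probability = odds / (1 + odds)."""
    return o / (1 + o)

print(odds(0.5))                  # 1.0, i.e. "even odds"
print(odds(0.8))                  # about 4, i.e. 4 to 1
print(probability_from_odds(4))   # 0.8
```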
60
New cards
Multivariate statistical methods
= methods that analyze the joint behavior of more than one random variable
61
New cards
Goal of PCA (principal components analysis)
= reduce the number of dimensions that we have
We decide how many components to keep (for example 2 or 3 out of 15 variables) using a scree plot

We started in the survey with 15 questions, but we don't really know what variables they are measuring, so PCA helps us see whether there are any underlying variables that we cannot see just by looking at the data
62
New cards
Principal Components & Factor Analysis
= reduce the dimensionality of the problem to better understand the underlying factors affecting the variables
63
New cards
Factors
= Linear combination (variate) of the original variables. Factors also represent the underlying dimensions (constructs) that summarize or account for the original set of observed variables. Factors are a type of latent variable (hidden or underlying: it is there somewhere, but you cannot observe it directly).
As a linear combination, a factor is a variable that depends on the other variables
64
New cards
Factor loading
= correlation between an observed variable (for example the survey item SWLS1) and a factor
Each survey question is a variable
65
New cards
Communality
= another word for R-squared (how much of the OV variance your PVs explain)
66
New cards
Eigenvalue
= % of variance explained * the total number of variables (throw away components with an eigenvalue below 1)
67
New cards
Covariance
= measures how 2 variables (dimensions) vary together; think of it like an unstandardized correlation
68
New cards
Correlation (R)
= can be calculated as the covariance of the 2 dimensions / (SD of X * SD of Y)
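The last two cards can be checked with a short calculation. This is a minimal sketch with made-up data; it uses the sample versions (dividing by n - 1) for both the covariance and the standard deviations, which is an assumption about the intended convention.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson r = covariance / (SD of X * SD of Y)."""
    sd_x = math.sqrt(covariance(xs, xs))  # variance is cov of x with itself
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]   # perfectly linear, increasing with x

print(covariance(xs, ys))              # 5.0
print(correlation(xs, ys))             # approximately 1
print(correlation(xs, ys[::-1]))       # approximately -1
```

Covariance depends on the units of the variables, while the correlation is always between -1 and 1.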