Quantitative Methods Final

Last updated 6:53 PM on 5/4/26

69 Terms

1
New cards

Correlation

strength and direction of the relationship between 2 variables

2
New cards

Univariate

summarizes 1 variable at a time

3
New cards

Bivariate

compares 2 variables (correlation & linear regression)

4
New cards

Multivariate

compares 2+ variables (cluster analysis, PCA, & correspondence analysis)

5
New cards

Models used for correlation

scatterplots & general linear model

6
New cards

What types of data for correlation?

continuous or ranked, normally distributed data

7
New cards

Pearson’s R

correlation coefficient that measures the strength and direction of linear relationships for normally distributed interval data (scale from -1 through 0 to +1)

  • beware of outliers; the relationship must be linear
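A minimal sketch of what Pearson's r computes, in stdlib Python (`pearson_r` is an illustrative name, and the data points are made up):

```python
# Pearson's r from scratch: covariance of the two variables divided by
# the product of their standard deviations (scale -1 to +1).
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [xi - mx for xi in x]          # deviations from the mean of x
    dy = [yi - my for yi in y]          # deviations from the mean of y
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]            # perfect positive linear relationship
print(pearson_r(x, y))          # → 1.0
print(pearson_r(x, y[::-1]))    # → -1.0
```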

8
New cards

Covariance

indicates the direction of a linear relationship between two variables (its magnitude depends on the variables' units, so it is not standardized)

9
New cards

deviations

differences between observed values and the mean

10
New cards

standardization

converting variables’ units of measurement into comparable units by expressing values in standard deviations from the mean

11
New cards

standard deviation

standardized unit of measurement showing how far values fall from the mean (the square root of the variance)
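Standardization with the standard deviation can be sketched as z-scores (stdlib Python; the data values are made up for illustration):

```python
# Standardization via the standard deviation: a z-score says how many
# standard deviations a value sits from the mean.
from statistics import mean, pstdev

def z_scores(values):
    m, s = mean(values), pstdev(values)   # population standard deviation
    return [(v - m) / s for v in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]           # mean = 5, pstdev = 2
print(z_scores(data))   # → [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```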

12
New cards

partial correlation

quantifies the relationship between two variables while controlling for the effect of a third variable

13
New cards

Correlation Tests

  1. Pearson’s R

  2. Spearman’s rho

  3. Kendall’s tau

14
New cards

Spearman’s rho

correlation coefficient test used for non-parametric ranked data, but not well suited to tied ranks

15
New cards

Kendall’s tau

correlation coefficient test used for non-parametric ranked data; handles tied ranks better than Spearman’s rho
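Spearman's rho can be sketched as Pearson's r applied to ranks, with tied values receiving their average rank (stdlib Python; function names are illustrative, and Kendall's tau is not shown):

```python
# Spearman's rho sketch: rank each variable (average ranks for ties),
# then compute Pearson's r on the ranks.
from math import sqrt

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1           # average rank for the tied run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman_rho(x, y):
    return pearson(ranks(x), ranks(y))

# A monotonic but nonlinear relationship still gives rho = 1
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))   # → 1.0
```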

16
New cards

Linear Regression

models how the independent variable predicts change in the dependent variable, as well as quantifying the relationship between the variables

17
New cards

Steps of Linear Regression

  1. identify dependent and independent variables

  2. plot cases in scatterplot

  3. fit a line to the points

18
New cards

What types of data for correlation, linear regression, and PCA?

continuous (interval/ratio)

19
New cards

Slope intercept formula

y = mx + b

  • y = dependent variable

  • m = slope

  • x = independent variable

  • b = y-intercept

20
New cards

Least square regression line

line, determined through the method of least squares, that passes through the means of the x & y values and minimizes the summed squared distance between itself and the data points

21
New cards

Method of Least Squares

estimates the parameters of the model by finding the line that best fits the data (minimizes the sum of squared residuals)
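The method of least squares for a simple regression line can be sketched as (stdlib Python; `least_squares` is an illustrative name):

```python
# Least squares sketch: slope m = cov(x, y) / var(x), and because the
# line passes through the means, intercept b = mean(y) - m * mean(x).
def least_squares(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    m = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    b = my - m * mx
    return m, b

m, b = least_squares([1, 2, 3, 4], [3, 5, 7, 9])   # points on y = 2x + 1
print(m, b)   # → 2.0 1.0
```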

22
New cards

simple regression

outcome variable is predicted from 1 variable

23
New cards

multiple regression

outcome variable is predicted from multiple variables

24
New cards

residuals

like deviations, but for linear regression; assess model fit via the differences between the observed values and the regression line

25
New cards

Coefficient of Determination

output for linear regression; percentage of variance in the dependent variable that can be explained by variance in the independent variable (0 to 1)
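A sketch of the coefficient of determination from observed and predicted values (stdlib Python; `r_squared` is an illustrative name, and the numbers are made up):

```python
# Coefficient of determination sketch: R^2 = 1 - SS_residual / SS_total,
# the share of variance in y explained by the fitted line.
def r_squared(y_obs, y_pred):
    my = sum(y_obs) / len(y_obs)
    ss_tot = sum((y - my) ** 2 for y in y_obs)          # total variation
    ss_res = sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))
    return 1 - ss_res / ss_tot

obs = [3, 5, 7, 10]
pred = [3, 5, 7, 9]        # predictions from some fitted line
print(r_squared(obs, pred))
```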

26
New cards

ANOVA Regression

  1. sum of squared differences (model fit)

  2. significance for regression (p-value)

  3. F-value/F-ratio (between-group variance divided by within-group variance; should be > 1)

27
New cards

R

degree of correlation

28
New cards

R²

percentage of the total variation that can be explained by the model

29
New cards

Why linear regression?

  1. allows quantification of the relationship between 2 variables

  2. allows you to make predictions from independent variables alone, based on a known relationship

  3. explore exceptional cases

30
New cards

Cluster Analysis

groups sets of objects so that objects within a group (cluster) are more similar to each other than to objects in other groups (can use all data types)

31
New cards

Methods of Cluster Analysis

  1. Hierarchical Cluster Analysis

  2. Non-Hierarchical Cluster Analysis

32
New cards

Hierarchical Cluster Analysis

clusters are formed at every step of the process; as each new case is entered, it is merged into a larger cluster, and the output is a nested ordering rather than a linear one

33
New cards

Dendrogram

tree branch plot used in cluster analysis

34
New cards

Steps for Cluster Analysis

  1. choose variables you want

  2. decide whether to use raw or standardized variables

  3. choose coefficient that quantifies the similarity or dissimilarity between all cases

  4. select method for forming clusters

35
New cards

Cluster Analysis Coefficients

  1. Euclidean Distance

  2. City Block Metrics

  3. Jaccard Coefficient

  4. Simple Matching Coefficient

36
New cards

Euclidean Distance

coefficient method for cluster analysis that uses the Pythagorean theorem to measure the straight-line distance from one point to another; for continuous data

37
New cards

City Block Metric

coefficient method for cluster analysis that lays points on an x, y grid and moves along one axis at a time, counting how many units apart two points are; for large continuous datasets
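The two distance coefficients above can be sketched side by side (stdlib Python; the example points are made up):

```python
# Distance coefficients sketch: Euclidean (straight line, via the
# Pythagorean theorem) versus city block (sum of axis-by-axis moves).
from math import sqrt

def euclidean(p, q):
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def city_block(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

p, q = (0, 0), (3, 4)
print(euclidean(p, q))   # → 5.0 (the 3-4-5 right triangle)
print(city_block(p, q))  # → 7  (3 units across plus 4 units up)
```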

38
New cards

Jaccard Coefficient

coefficient method for cluster analysis that ignores negative matches (joint absences) rather than counting them as agreement; best for presence/absence data

39
New cards

Simple Matching Coefficient

coefficient method for cluster analysis that counts negative matches (joint absences) as agreement, weighing them the same as joint presences; for presence/absence data
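The contrast between the Jaccard and simple matching coefficients can be sketched on binary presence/absence vectors (stdlib Python; the vectors are made up):

```python
# Presence/absence similarity sketch: Jaccard ignores joint absences
# (0/0 positions); simple matching counts them as agreement.
def jaccard(a, b):
    both = sum(1 for x, y in zip(a, b) if x and y)     # joint presences
    either = sum(1 for x, y in zip(a, b) if x or y)    # any presence
    return both / either

def simple_matching(a, b):
    agree = sum(1 for x, y in zip(a, b) if x == y)     # includes 0/0
    return agree / len(a)

u = [1, 1, 0, 0, 0]
v = [1, 0, 0, 0, 0]
print(jaccard(u, v))          # → 0.5 (1 shared presence of 2 presences)
print(simple_matching(u, v))  # → 0.8 (4 of 5 positions agree)
```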

40
New cards

Methods for forming clusters

  1. Single Linkage

  2. Average Linkage

  3. Complete Linkage

  4. Ward’s Procedure

41
New cards

Single Linkage

method for forming clusters in cluster analysis that links the two clusters containing the closest pair of points (nearest neighbor)

42
New cards

Average Linkage

method for forming clusters in cluster analysis that uses the average distance between all pairs of cases in the two clusters to draw a link

43
New cards

Complete Linkage

method for forming clusters in cluster analysis that uses the furthest distance between points on the edges of each cluster (farthest neighbor)

44
New cards

Ward’s Procedure

method for forming clusters in cluster analysis that merges the clusters whose union gives the smallest increase in within-cluster variance; best for quantitative data

45
New cards

Types of Hierarchical Clustering

  1. Agglomerative

  2. Divisive

46
New cards

Agglomerative Hierarchical Clustering

bottom-up approach that repeatedly merges clusters into larger ones until a single cluster remains (similarity based on proximity, often Euclidean distance); the more commonly used form because it is easy to implement

47
New cards

Divisive Hierarchical Clustering

top-down approach that starts from one all-inclusive group, then recursively splits it into smaller clusters; better for large data sets

48
New cards

Rules for how Divisive clusters are formed

  1. Monothetic

  2. Polythetic

49
New cards

Monothetic

rule for forming divisive clusters where decisions are made based on one variable at a time

50
New cards

Polythetic

rule for forming divisive clusters where clusters are made based on multiple variables at a time

51
New cards

Non-Hierarchical Cluster Analysis

uses some measure to evaluate whether a case belongs in a cluster, merging or splitting clusters rather than arranging them in a hierarchical order; scales well to large data sets

52
New cards

Simple K-means

partitions cases into k non-overlapping groups that have no hierarchical relationships between them
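A minimal one-dimensional k-means sketch, assuming the standard assign-then-update loop (stdlib Python; `kmeans_1d` and the data are illustrative):

```python
# Simple k-means sketch: assign each case to its nearest centroid, then
# move each centroid to the mean of its cluster, and repeat.
def kmeans_1d(points, centroids, iters=20):
    for _ in range(iters):
        # assignment step: nearest centroid by absolute distance
        clusters = {i: [] for i in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # update step: each centroid moves to its cluster's mean
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in clusters.items()]
    return sorted(centroids)

# two clear groups around 2 and 11; k = 2 starting guesses at 0 and 5
print(kmeans_1d([1, 2, 3, 10, 11, 12], centroids=[0, 5]))  # → [2.0, 11.0]
```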

53
New cards

Principal Component Analysis

a way to identify clusters of variables by reducing the number of dimensions in a large dataset down to a few principal components that still retain most of the original variation (continuous data). It explains the maximum amount of total variance in a correlation matrix by transforming the original variables into a smaller set of linear components (uses correlation & variance).

54
New cards

Variance

spread of data

55
New cards

Principal Components

lines that describe the relationships between variables and are predicted from the measured variables

56
New cards

R matrix

table that arranges the correlations between each pair of variables

57
New cards

What type(s) of data for correspondence analysis?

categorical/count

58
New cards

Steps of PCA

  1. cases are plotted in multi-dimensional space

  2. find where the data is most spread out

  3. identify where the center of the spread of points is

59
New cards

Criteria of a Principal Component Line

  1. must pass through center of data

  2. must pass through spread of data along the axis that will capture the most variation

  3. all consecutive lines must be drawn at right angles to the prior line

60
New cards

Evaluating PCA/Eigenvectors for significance/meaning

  1. eigenvalues

  2. scree plot

  3. component loading

61
New cards

What kind of plot for correlation, linear regression, PCA, and correspondence analysis?

scatterplots

62
New cards

Eigenvalues

values derived from eigenvectors that measure the spread of the data along each principal axis, showing how the variance is distributed (how much variation along that dimension of the data is described; components with eigenvalues > 1 are kept)
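For two variables, the eigenvalues of the 2×2 covariance matrix can be sketched with the quadratic formula (stdlib Python; the data are made up and `eigenvalues_2x2` is an illustrative name):

```python
# Eigenvalue sketch for a 2-variable dataset: build the 2x2 covariance
# matrix, then solve its characteristic equation. Each eigenvalue is
# the variance captured along one principal axis.
from math import sqrt

def cov(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

def eigenvalues_2x2(x, y):
    sxx, syy, sxy = cov(x, x), cov(y, y), cov(x, y)
    tr, det = sxx + syy, sxx * syy - sxy ** 2   # trace and determinant
    d = sqrt(tr ** 2 - 4 * det)
    return (tr + d) / 2, (tr - d) / 2           # largest first

x = [2.5, 0.5, 2.2, 1.9, 3.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0]
lam1, lam2 = eigenvalues_2x2(x, y)
# the first component captures lam1 / (lam1 + lam2) of the total variance
print(lam1, lam2)
```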

63
New cards

Eigenvector

measures the height and width of the ellipse encompassing the data in the scatterplot

64
New cards

Scree Plot

plot of eigenvalues by component; the “elbow” where the eigenvalues level off marks how many components account for most of the variance

65
New cards

Component Loading

breaks down principal components to understand the correlations between the original variables and the unit-scaled components (larger loading = stronger correlation)

66
New cards

Correspondence Analysis

plots (scatterplots/biplots) different categories in multidimensional space, then reduces it to 2-dimensional space to show which categories are similar and why (expected vs. observed counts)

67
New cards

Residual

like error, but for correspondence analysis; describes how the relationship between 2 variables is influenced by individual differences (the difference between the model prediction and the observed value)

68
New cards

Inertia

degree to which values of rows and columns correspond to each other in correspondence analysis (chi-square/n)

69
New cards

Steps of Correspondence Analysis

  1. compute averages for each row and column

  2. compute the expected values for each cell

  3. compute residuals for each cell

  4. divide the residuals by the expected values

  5. plot indexed residuals in 2 dimensions
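The steps above can be sketched on a small count table (stdlib Python; `indexed_residuals` and the counts are illustrative, and the final plotting step is omitted):

```python
# Correspondence analysis sketch: expected = row total * column total /
# grand total; residual = observed - expected; indexed residual =
# residual / expected (step 4 above).
def indexed_residuals(table):
    rows = [sum(r) for r in table]          # row totals
    cols = [sum(c) for c in zip(*table)]    # column totals
    grand = sum(rows)
    out = []
    for i, r in enumerate(table):
        row = []
        for j, obs in enumerate(r):
            exp = rows[i] * cols[j] / grand
            row.append((obs - exp) / exp)   # indexed residual
        out.append(row)
    return out

counts = [[20, 10],
          [10, 20]]      # made-up 2x2 count table
print(indexed_residuals(counts))
```

With these counts every expected cell is 15, so cells on the diagonal sit one-third above expectation and the off-diagonal cells one-third below.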