LDA assumptions
multivariate normality and the same covariance matrix for all classes
QDA assumptions
multivariate normality; each class has its own variance/covariance matrix
Gaussian Naive Bayes assumptions
normality; features have different variances but zero covariance (features are independent within each class)
how to check assumptions graphically
QQ plot
side-by-side boxplots
covariance ellipse
perspective and contour plots
scatter plot
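As a minimal sketch (assuming scipy and matplotlib are available, with made-up toy data), two of the graphical checks above: a QQ plot for normality and side-by-side boxplots for comparing spread across groups.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=100)  # toy data, two groups
group_b = rng.normal(loc=1.0, scale=2.0, size=100)  # with different spread

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# QQ plot: points close to the reference line suggest normality
stats.probplot(group_a, dist="norm", plot=ax1)
ax1.set_title("QQ plot (group A)")

# Side-by-side boxplots: compare location and spread across groups
ax2.boxplot([group_a, group_b])
ax2.set_xticklabels(["A", "B"])
ax2.set_title("Side-by-side boxplots")

fig.savefig("assumption_checks.png")
```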
how to check assumptions with tests
Box's M test
MVN (multivariate normality) test
Kolmogorov-Smirnov test
Shapiro-Wilk test
correlation test
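A minimal sketch (assuming scipy is available, with made-up toy data) of two of the tests above. Box's M and the MVN test are typically run in R (e.g. the biotools and MVN packages); Shapiro-Wilk and Kolmogorov-Smirnov are in scipy.stats.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=200)  # toy sample

# Shapiro-Wilk: H0 = the data come from a normal distribution
w_stat, w_p = stats.shapiro(x)

# Kolmogorov-Smirnov against N(0, 1), applied to the standardized sample
z = (x - x.mean()) / x.std(ddof=1)
ks_stat, ks_p = stats.kstest(z, "norm")

print(f"Shapiro-Wilk p = {w_p:.3f}, KS p = {ks_p:.3f}")
```

A large p-value fails to reject normality; it does not prove it.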
LDA bias
can have high bias when the class covariance matrices are actually different
Naive Bayes bias
reduces variance but can have large bias
Naive Bayes posterior probability
a sum of functions of the individual input features, i.e. a GAM (generalized additive model): you essentially add together the effects of each variable
log odds posterior probability LDA
linear in X
log odds posterior probability QDA
quadratic
log odds posterior probability Naive Bayes
Generalized additive model
log odds posterior probability logistic
linear in X
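The log-odds cards above can be summarized in one standard formula; this is a sketch of the two-class Gaussian case, where \(\pi_k\), \(\mu_k\), and \(\Sigma\) denote the class priors, class means, and shared covariance matrix.

```latex
% LDA: with a shared covariance matrix the quadratic terms cancel,
% leaving log odds that are linear in x.
\log\frac{\Pr(Y=1\mid X=x)}{\Pr(Y=2\mid X=x)}
  = \log\frac{\pi_1}{\pi_2}
    - \tfrac{1}{2}(\mu_1+\mu_2)^{\top}\Sigma^{-1}(\mu_1-\mu_2)
    + x^{\top}\Sigma^{-1}(\mu_1-\mu_2)
% QDA: with class-specific covariances \Sigma_k, the term
% -\tfrac{1}{2}\, x^{\top}(\Sigma_1^{-1}-\Sigma_2^{-1})\, x
% does not cancel, so the log odds are quadratic in x.
```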
which two have linear decision boundaries
LDA and multinomial regression (logistic)
When all covariance matrices are the same (the quadratic term is 0), what is LDA a special case of?
QDA
special cases of Naive Bayes
LDA and multinomial
which for continuous predictors
LDA and QDA
categorical predictors
Multinomial Regression, Naive Bayes
high value for LD1 indicates
most group separation happens along that axis
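A minimal sketch (assuming scikit-learn is available, with made-up toy data): fit LDA and project onto the discriminant axes. The axes are ordered so that LD1 carries the largest share of between-group separation.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
# toy data: three classes shifted apart in feature space
X = np.vstack([rng.normal(loc=m, size=(50, 3)) for m in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 50)

lda = LinearDiscriminantAnalysis(n_components=2)
scores = lda.fit_transform(X, y)      # columns are LD1 and LD2
print(lda.explained_variance_ratio_)  # LD1's share comes first and is largest
```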
Do you need to specify a k value for Hierarchical clustering
no
why is it called hierarchical clustering?
clusters obtained by cutting the dendrogram at a given height are nested within the clusters obtained by cutting at any greater height.
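The nesting property can be checked directly; a minimal sketch (assuming scipy is available, with made-up toy data): cut the same dendrogram at a finer and a coarser level and verify that every fine cluster sits inside exactly one coarse cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 2))  # toy data

Z = linkage(X, method="average")               # build the dendrogram
low = fcluster(Z, t=4, criterion="maxclust")   # finer cut: up to 4 clusters
high = fcluster(Z, t=2, criterion="maxclust")  # coarser cut: up to 2 clusters

# nesting: each fine cluster maps into exactly one coarse cluster
for c in np.unique(low):
    assert len(np.unique(high[low == c])) == 1
print("fine clusters are nested inside coarse clusters")
```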
linkage
dissimilarity between clusters with multiple observations
four types of linkage
average, complete, single, and centroid
preferred linkage
average and complete
what is hierarchical clustering based on
the distance (dissimilarity) matrix, typically for numerical data
large number of variables
k-methods
structure: k-methods vs hierarchical clustering
k-methods are unstructured; hierarchical clustering is more interpretable and informative
in which method is it easier to determine the number of clusters
hierarchical clustering, via the dendrogram
what distinguishes the choice between the two approaches based on prior beliefs
hierarchical clustering may be used to determine the number of clusters
a specific number of clusters is known but group membership is unknown
k-methods
is clustering robust?
no; it is not robust to perturbations of the data
complete linkage
looks at the distances between points in the two clusters and picks the largest one.
single linkage
looks at the distance between points in two clusters and picks the smallest one.
average linkage
calculates the average distance between all pairs of points in the two clusters
centroid linkage
compares the central points of two clusters, ignoring how spread out the individual points are
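The four linkage definitions above can be contrasted on one toy pair of clusters; a minimal sketch (assuming scipy is available): complete takes the largest pairwise distance, single the smallest, average the mean, and centroid the distance between the cluster means.

```python
import numpy as np
from scipy.spatial.distance import cdist

a = np.array([[0.0, 0.0], [1.0, 0.0]])  # toy cluster A
b = np.array([[4.0, 0.0], [6.0, 0.0]])  # toy cluster B

d = cdist(a, b)     # all pairwise distances: 4, 6, 3, 5
complete = d.max()  # largest pairwise distance -> 6.0
single = d.min()    # smallest pairwise distance -> 3.0
average = d.mean()  # mean of all pairs: (4+6+3+5)/4 -> 4.5
centroid = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))  # |0.5 - 5| -> 4.5
print(complete, single, average, centroid)
```

Note that here average and centroid linkage happen to coincide; in general they differ.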