These flashcards cover key concepts related to types of data, terminology, methods of data manipulation in programming languages (R and Python), clustering methods, evaluation metrics, and essential linear algebra for data analysis.
Quantitative Data
Numeric data that can be divided into continuous or discrete types.
Continuous Data
Quantitative data that can take any value within a range, such as height or time.
Discrete Data
Quantitative data that can only take countable values, such as the number of classes.
Qualitative Data
Categorical data that can be classified into nominal or ordinal types.
Nominal Data
Qualitative data that consist of unordered labels, such as eye color.
Ordinal Data
Qualitative data that consist of ordered labels, such as a satisfaction scale.
Data Frame
A rectangular table of data consisting of columns (variables) and rows (observations).
dplyr
An R package used for data manipulation with key functions such as select(), filter(), and mutate().
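For comparison, the three verbs can be imitated in plain Python on a small table stored as a list of dictionaries. This is only a rough analogue, not dplyr itself, and the column names and values are invented for illustration.

```python
# A tiny "data frame": each dict is a row (observation),
# each key is a column (variable). Values are hypothetical.
rows = [
    {"name": "Ana", "height_cm": 165.0, "classes": 4},
    {"name": "Ben", "height_cm": 180.0, "classes": 2},
    {"name": "Cy", "height_cm": 172.0, "classes": 5},
]

# Like filter(): keep only the rows that satisfy a condition.
busy = [r for r in rows if r["classes"] >= 4]

# Like mutate(): add a new column computed from an existing one.
with_m = [{**r, "height_m": r["height_cm"] / 100} for r in busy]

# Like select(): keep only the named columns.
result = [{"name": r["name"], "height_m": r["height_m"]} for r in with_m]

print(result)
```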
Accuracy
A metric defined as the ratio of correct predictions to total predictions.
Confusion Matrix
A table used to describe the performance of a classification model by comparing actual and predicted labels.
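A minimal sketch of both metrics in plain Python, assuming a two-class problem. The labels are invented for illustration, and the function names are hypothetical rather than taken from any library.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Ratio of correct predictions to total predictions."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical labels, for illustration only.
actual    = ["spam", "spam", "ham", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "ham", "spam", "spam"]

cm = confusion_matrix(actual, predicted)
labels = ("spam", "ham")
for a in labels:
    # One row per actual label, one column per predicted label.
    print(a, [cm[(a, p)] for p in labels])

print(accuracy(actual, predicted))  # 4 correct out of 6
```

The diagonal cells of the table (actual equals predicted) are the correct predictions, so accuracy is the diagonal sum divided by the total count.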
Feature Scaling
The process of standardizing or normalizing features so that features with large numeric ranges do not dominate distance-based methods.
Z-Score Standardization
A method of rescaling each feature by subtracting its mean and dividing by its standard deviation, so that the result has mean 0 and standard deviation 1.
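A short sketch of the computation using only the standard library. It assumes the population standard deviation (`statistics.pstdev`); using the sample standard deviation instead is also common and gives slightly different values. The heights are made up.

```python
import statistics

def z_scores(values):
    """Return (x - mean) / standard deviation for each value."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / sd for v in values]

# Hypothetical heights in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z = z_scores(heights_cm)

print(z)
print(statistics.fmean(z), statistics.pstdev(z))  # approximately 0 and 1
```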
Hierarchical Clustering
A method of cluster analysis which seeks to build a hierarchy of clusters through merging or splitting.
Dendrogram
A tree diagram that represents the sequence of merges or splits in hierarchical clustering.
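The merging (agglomerative) variant can be sketched in a few lines. This example assumes one-dimensional points and single linkage (cluster distance is the smallest distance between any two members); other linkage rules exist. The returned merge list is the information a dendrogram draws, with the merge distance as the height of each join.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points.

    Returns the merge sequence as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[p] for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical data points.
points = [1.0, 2.0, 6.0, 7.0, 15.0]
merges = single_linkage(points)
for a, b, d in merges:
    print(a, "+", b, "at distance", d)
```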
k-Nearest Neighbors (KNN)
A non-parametric method for classification or regression that predicts from the k training points closest to a query under a distance metric, by majority vote (classification) or averaging (regression).
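A minimal classification sketch, assuming Euclidean distance and a simple majority vote among the k closest training points. The training data and the name `knn_predict` are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` from the k closest training points.

    `train` is a list of (point, label) pairs.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by Euclidean distance to the query, keep the k nearest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training set with two well-separated groups.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

print(knn_predict(train, (2.0, 2.0), k=3))
print(knn_predict(train, (6.0, 5.0), k=3))
```

Because the prediction depends directly on distances, features are usually scaled before KNN is applied.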
Manhattan Distance
A distance metric that calculates the sum of absolute differences between coordinates.
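As a quick worked example, the distance between (1, 2) and (4, 6) is |1 - 4| + |2 - 6| = 3 + 4 = 7. A small sketch of the same calculation, with a hypothetical function name:

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(a - b) for a, b in zip(p, q))

print(manhattan((1, 2), (4, 6)))  # 3 + 4 = 7
```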
Cosine Similarity
A measure of similarity that calculates the cosine of the angle between two vectors.
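The cosine is the dot product divided by the product of the two vector lengths. A small sketch, assuming neither vector is all zeros (the angle is undefined in that case); the function name is hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

print(cosine_similarity((1, 2), (2, 4)))   # same direction: close to 1
print(cosine_similarity((1, 0), (0, 1)))   # perpendicular: 0
print(cosine_similarity((1, 0), (-1, 0)))  # opposite direction: -1
```

Only direction matters: (1, 2) and (2, 4) have different lengths but point the same way.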
Matrix Multiplication
The operation of multiplying two matrices, which requires the inner dimensions to match: an m×n matrix times an n×p matrix gives an m×p matrix.
Identity Matrix
A square matrix with ones on the main diagonal and zeros elsewhere; multiplying a compatible matrix by it returns that matrix unchanged.
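The last two cards can be checked together with a small sketch that stores matrices as lists of rows. The helper names `matmul` and `identity` are hypothetical; a numerical library would normally be used instead.

```python
def matmul(A, B):
    """Multiply matrix A (m x n) by matrix B (n x p), giving an m x p matrix."""
    n_inner = len(A[0])
    if n_inner != len(B):
        raise ValueError("inner dimensions must match")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n_inner)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def identity(n):
    """n x n matrix with ones on the main diagonal and zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # 3 x 2

C = matmul(A, B)     # inner dimensions are both 3, so the result is 2 x 2
print(C)

# Multiplying by an identity matrix of the right size leaves A unchanged.
print(matmul(A, identity(3)) == A)
print(matmul(identity(2), A) == A)
```

Note that `matmul(A, A)` raises an error here, because a 2 x 3 matrix cannot be multiplied by another 2 x 3 matrix.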