These flashcards cover essential concepts related to applied artificial intelligence, focusing on clustering, classification, regression metrics, model evaluation, and common machine learning algorithms.
What is the definition of clustering in machine learning?
Clustering is an unsupervised machine learning technique that creates clusters of observations that share certain characteristics.
What distinguishes classification from clustering?
Classification is a supervised learning technique that uses labeled data to predict categories, while clustering is unsupervised and groups unlabeled data into clusters of similar observations.
Name a few metrics for measuring model accuracy in classification.
Accuracy, Precision, Recall, F1-Score, ROC AUC.
What is the purpose of the Adjusted R² statistic in regression analysis?
Adjusted R² penalizes the model for adding predictors that do not improve the fit, which makes it a more reliable measure of fit quality than R² when comparing models with different numbers of predictors.
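The penalty can be made concrete with the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The function name and the sample values below are illustrative assumptions, not part of the flashcards.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1 - (1 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Same raw R², different numbers of predictors (made-up values).
small_model = adjusted_r_squared(0.90, n_samples=11, n_predictors=2)
large_model = adjusted_r_squared(0.90, n_samples=11, n_predictors=5)
print(small_model, large_model)  # roughly 0.875 and 0.8
```

With the raw R² held fixed, the model with more predictors gets the lower adjusted score.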
Explain Mean Absolute Error (MAE).
MAE is the average of the absolute differences between actual values and predicted values, indicating the typical deviation in predictions.
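A minimal sketch of the calculation, using only the Python standard library. The function name and the sample numbers are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
# Absolute errors are 1, 0, 2 and 1, so the average is 1.0.
mae = mean_absolute_error(actual, predicted)
print(mae)
```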
Describe the significance of the R² statistic in regression models.
R² measures the proportion of variability in the dependent variable that can be explained by the independent variables.
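One common way to compute this is R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the actual values from their mean. The sketch below assumes that definition; the function name and data are illustrative.

```python
def r_squared(actual, predicted):
    """R² = 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are identical")
    return 1 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2 = r_squared(actual, predicted)
print(round(r2, 4))  # SS_res is about 0.10 and SS_tot is 10, so R² is about 0.99
```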
What is the objective of logistic regression?
Logistic regression estimates the probability that an observation belongs to a class from one or more predictor variables, and uses that probability to predict a discrete output variable (class/category).
What are common applications of decision trees in machine learning?
Decision trees can be used for healthcare diagnoses, credit scoring, and predicting customer behavior.
Describe the role of feature engineering in machine learning.
Feature engineering transforms raw data into informative features that enhance model performance.
What is the confusion matrix used for?
The confusion matrix evaluates the performance of a classification model by showing the counts of true positives, false positives, true negatives, and false negatives.
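For a binary problem, the four counts can be tallied directly from paired labels. This is a small sketch with a hypothetical helper name and invented labels, where 1 is treated as the positive class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1  # predicted positive, actually positive
        elif p == positive:
            counts["fp"] += 1  # predicted positive, actually negative
        elif a == positive:
            counts["fn"] += 1  # predicted negative, actually positive
        else:
            counts["tn"] += 1  # predicted negative, actually negative
    return counts


actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)  # {'tp': 3, 'fp': 1, 'tn': 3, 'fn': 1}
```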
What challenges do classification models face in real-world scenarios?
Classification models encounter issues such as data quality problems and data drift, where the distribution of incoming data shifts away from the data the model was trained on.
What are the advantages of using Random Forests in classification tasks?
By combining the predictions of many decision trees, Random Forests typically improve accuracy and reduce overfitting compared with a single tree, and they handle large, high-dimensional datasets well.
What is one key characteristic of K-Means clustering?
K-Means clustering partitions data into a predefined number of clusters (k) by minimizing the within-cluster variance, that is, the sum of squared distances from each point to its cluster centroid.
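The usual way to do this is Lloyd's algorithm: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The sketch below is a simplified one-dimensional version with caller-supplied starting centroids (an assumption made to keep the example deterministic); the function name and data are illustrative.

```python
def kmeans_1d(points, initial_centroids, max_iters=100):
    """Lloyd's algorithm on 1-D data; k is the number of initial centroids."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(max_iters):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda i: (x - centroids[i]) ** 2)
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, initial_centroids=[1.0, 12.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

Practical implementations usually choose the starting centroids randomly or with a seeding heuristic and run several restarts, since the result depends on initialization.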
How does Logistic Regression differ from Linear Regression?
Logistic Regression predicts discrete outcomes, while Linear Regression predicts continuous outcomes.
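The difference can be sketched in a few lines: both models compute the same kind of linear score, but logistic regression passes it through the sigmoid function to get a probability and then applies a threshold. The weight, bias, and threshold below are hypothetical values chosen for illustration, not fitted parameters.

```python
import math


def linear_score(x, weight, bias):
    """Linear regression output: an unbounded continuous value."""
    return weight * x + bias


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(x, weight, bias, threshold=0.5):
    """Logistic regression output: a class label plus its probability."""
    probability = sigmoid(linear_score(x, weight, bias))
    label = 1 if probability >= threshold else 0
    return label, probability


weight, bias = 2.0, -4.0  # hypothetical parameters
for x in (1.0, 2.0, 3.0):
    label, probability = predict_class(x, weight, bias)
    print(x, linear_score(x, weight, bias), round(probability, 3), label)
```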
What does the F1-Score represent in model evaluation?
The F1-Score is the harmonic mean of precision and recall, providing a single metric that balances both concerns.
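A short sketch of the calculation from confusion-matrix counts, where precision = TP / (TP + FP) and recall = TP / (TP + FN). The function name and counts are illustrative; returning 0.0 when a denominator is zero is a convention assumed here.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision is 8/10 = 0.8 and recall is 8/16 = 0.5.
f1 = f1_score(tp=8, fp=2, fn=8)
print(round(f1, 4))  # about 0.6154, below the arithmetic mean of 0.65
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1-Score by excelling at precision alone or recall alone.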
What is overfitting, and how can it affect model performance?
Overfitting occurs when a model fits the training data too closely, including its noise, so it performs well on the training data but generalizes poorly to unseen data.
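A toy illustration of the symptom: a model that simply memorizes its training points (here a one-nearest-neighbor lookup) has zero training error but a clearly larger error on new points. All numbers are invented for this sketch, and the helper names are hypothetical.

```python
def nearest_neighbor_predict(train, x):
    """Return the target of the training point whose input is closest to x."""
    _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def mean_abs_error(train, data):
    """Average absolute error of the memorizing model on (x, y) pairs."""
    errors = [abs(y - nearest_neighbor_predict(train, x)) for x, y in data]
    return sum(errors) / len(errors)


# Noisy training samples scattered around y = 2x, and unseen points on that line.
train = [(1.0, 2.5), (2.0, 3.6), (3.0, 6.8), (4.0, 7.4), (5.0, 10.9)]
unseen = [(1.4, 2.8), (2.6, 5.2), (3.4, 6.8), (4.6, 9.2)]

train_error = mean_abs_error(train, train)    # 0.0: every point is memorized
unseen_error = mean_abs_error(train, unseen)  # about 0.9: the noise was memorized too
print(train_error, unseen_error)
```

A large gap between training error and error on held-out data is the usual practical sign of overfitting.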