These flashcards cover key concepts, terms, and definitions related to Python, Data Analytics, and Machine Learning as outlined in the comprehensive study guide.
Data Types
int (integer), float (decimal number), str (string), bool (boolean).
List Characteristics
Ordered, mutable, allow duplicates, index starts at 0.
Dictionaries
Mutable collections of key-value pairs; keys are unique and are used to look up values.
Function Parameters vs Arguments
Parameters are the names listed in a function definition; arguments are the values passed when the function is called.
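A minimal sketch of the distinction. The function name `area` and the values are illustrative, not taken from the study guide.

```python
def area(width, height=1):  # width and height are parameters
    return width * height

# 3 and 4 are arguments: the values bound to the parameters at call time.
result_positional = area(3, 4)

# A keyword argument names its parameter; height falls back to its default.
result_default = area(width=5)
```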
Python Case Sensitivity
Python is case sensitive: `total` and `Total` are different names.
Negative Indexing
Counts from the end of a list; index -1 is the last element.
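A short illustration of list behavior and negative indexing, using a made-up `scores` list:

```python
scores = [70, 85, 85, 92]   # ordered, and duplicates are allowed

first = scores[0]           # indexing starts at 0
last = scores[-1]           # -1 is the last element
second_last = scores[-2]    # -2 is the one before it

scores[0] = 75              # lists are mutable: items can be reassigned in place
```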
df.head(n) method
Returns the first n rows of a DataFrame; n defaults to 5.
df.iloc[:n] indexer
Returns the first n rows by position (positions 0 to n-1); the stop position n is excluded.
DataFrame Mean Calculation
df['column'].mean() returns the average of a column; missing values (NaN) are skipped by default.
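A small sketch of these DataFrame calls, assuming pandas is installed. The `price` column and its values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]})

default_head = df.head()         # no argument: n defaults to 5
first_two = df.head(2)           # first 2 rows
same_two = df.iloc[:2]           # positions 0 and 1; position 2 is excluded
avg_price = df["price"].mean()   # average of the column
```

Here `df.head(2)` and `df.iloc[:2]` select the same rows.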
Pandas Import
Pandas is NOT part of the Python standard library; it must be installed and then imported (conventionally `import pandas as pd`).
Supervised Learning
Trains on labeled data; used for classification and regression.
Unsupervised Learning
Trains on unlabeled data; used for clustering and dimensionality reduction.
Binary Classification
Predicts between two classes.
Multi-class Classification
Involves more than two mutually exclusive classes.
Multi-label Classification
Allows multiple labels per observation.
Classification Predictions
Predicts categories, not continuous values.
Accuracy in Model Evaluation
Share of all predictions that are correct; misleading on imbalanced data, where always predicting the majority class still scores high.
Precision
Share of positive predictions that are actually positive: TP / (TP + FP).
Recall
Share of actual positives that the model identifies: TP / (TP + FN).
F1-Score
Harmonic mean of precision and recall; balances the two.
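The four classification metrics above can be computed by hand from the confusion counts. This is a plain-Python sketch on hypothetical labels (1 = positive, 0 = negative), not a library implementation.

```python
# Hypothetical imbalanced data: only 3 of 10 observations are positive.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```

With these labels accuracy is 0.7 even though the model finds only one of the three positives, which is why accuracy alone is a weak guide on imbalanced data.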
R² in Regression
Proportion of variance in the target explained by the model.
RMSE
Root mean squared error; penalizes large errors; same units as the target.
MAE
Mean absolute error; less sensitive to outliers than RMSE.
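A quick sketch of why RMSE reacts more strongly than MAE to one large miss. The values are made up for illustration.

```python
import math

y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 26.0]   # the last prediction is a large miss

errors = [p - t for t, p in zip(y_true, y_pred)]   # 1, -1, 1, 10

# MAE: average of the absolute errors.
mae = sum(abs(e) for e in errors) / len(errors)

# RMSE: square the errors, average them, then take the square root.
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
```

Squaring makes the single error of 10 dominate the RMSE, so RMSE comes out larger than MAE here.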
KNN
A supervised learning method that predicts from the k nearest training points, typically measured by Euclidean distance.
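A minimal plain-Python sketch of KNN classification with Euclidean distance. The helper `knn_predict`, the points, and the labels are all hypothetical, and ties are not handled specially.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = [
        (math.dist(point, query), label)   # Euclidean distance to each point
        for point, label in zip(train_points, train_labels)
    ]
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, query=(2.0, 2.0), k=3)
```

Because the vote depends on raw distances, features on very different scales are usually standardized before KNN is applied.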
K-Means Clustering
An unsupervised clustering method that assigns each point to the nearest of K centroids, using Euclidean distance.
Elbow Method & Silhouette Score
Help determine the optimal number of clusters (K).
Time Series Analysis Purpose
Models data ordered in time, most often to forecast future values.
Time Series Components
Trend, Seasonality, Residual (Irregular).
Additive Model in Time Series
Components are added (trend + seasonality + residual); seasonal magnitude stays constant as the level changes.
Multiplicative Model in Time Series
Components are multiplied (trend × seasonality × residual); seasonal magnitude is proportional to the trend.
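A toy comparison of the two models for a single seasonal peak, with the residual omitted. The trend values, offset, and factor are invented for illustration.

```python
trend = [100.0, 200.0, 300.0, 400.0]

# Additive: observed = trend + seasonal
seasonal_offset = 50.0
additive = [t + seasonal_offset for t in trend]

# Multiplicative: observed = trend * seasonal
seasonal_factor = 1.5
multiplicative = [t * seasonal_factor for t in trend]

# Size of the seasonal swing at each trend level.
additive_swing = [obs - t for obs, t in zip(additive, trend)]
multiplicative_swing = [obs - t for obs, t in zip(multiplicative, trend)]
```

The additive swing stays at 50 at every level, while the multiplicative swing grows from 50 to 200 as the trend rises.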
Forecasting Requirements
Regression-based forecasts use lagged values of the series and time-based predictors (e.g., a time index or month) as inputs.
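One way to build such predictors, sketched in plain Python. The `sales` series and the field names `time_index`, `lag_1`, and `target` are hypothetical.

```python
sales = [100, 110, 125, 130, 150]   # hypothetical values, oldest first

rows = []
for t in range(1, len(sales)):       # the first period has no lag, so start at 1
    rows.append({
        "time_index": t,             # time-based predictor
        "lag_1": sales[t - 1],       # value from the previous period
        "target": sales[t],          # value to be predicted
    })
```

Each row pairs a period's value with information that would already be known before that period, which is what a forecasting model is allowed to use.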
R² and Forecasting
High R² does NOT guarantee good forecasts; a close fit to past data may not hold for future periods.
CRISP-DM Framework
Six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment.
Four V's of Big Data
Volume, Velocity, Variety, Veracity.
Visualization Purposes
Used for exploratory analysis and storytelling.
Exam Strategy Tips
Match business questions to modeling approaches; use precise terminology; watch for default behaviors in Python and Pandas.