Comprehensive Study Guide: Python, Data Analytics & Machine Learning

Description and Tags

These flashcards cover key concepts, terms, and definitions related to Python, Data Analytics, and Machine Learning as outlined in the comprehensive study guide.

36 Terms

1

Data Types

Core built-in types: int, float, str (string), and bool (boolean).
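
A quick sketch of how these four types look in Python, where string and boolean are spelled str and bool. The variable names and values are illustrative only.

```python
# Illustrative values only; the names are not from the study guide.
count = 3          # int
price = 2.5        # float
name = "Ada"       # str (string)
passed = True      # bool (boolean)

type_names = [type(v).__name__ for v in (count, price, name, passed)]
print(type_names)  # ['int', 'float', 'str', 'bool']
```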

2

List Characteristics

Ordered, mutable, and allow duplicates; indexing starts at 0.
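
The four properties can be checked on a small list. This is a minimal sketch; the list contents are hypothetical.

```python
# A hypothetical list used only to illustrate the four properties.
scores = [70, 85, 85, 90]      # duplicates (85) are allowed

first = scores[0]              # indexing starts at 0, so this is 70
scores[0] = 75                 # mutable: items can be reassigned in place
scores.append(60)              # mutable: items can be added

print(first)                   # 70
print(scores)                  # [75, 85, 85, 90, 60] (order is preserved)
```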

3

Dictionaries

Mutable collections of key–value pairs; keys must be unique and hashable, and values are looked up by key.
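
A short sketch of common dictionary operations, using made-up names and ages.

```python
# Hypothetical example data.
ages = {"ana": 31, "ben": 27}

ages["cy"] = 45                 # add a new key-value pair
ages["ben"] = 28                # assigning to an existing key overwrites its value
ben_age = ages["ben"]           # look up a value by key
missing = ages.get("dee", 0)    # .get returns a default instead of raising KeyError

print(ben_age, missing, len(ages))  # 28 0 3
```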

4

Function Parameters vs Arguments

Parameters are the names listed in a function definition; arguments are the values passed in when the function is called.
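
The distinction is easiest to see side by side. The function rectangle_area below is a hypothetical example, not part of the study guide.

```python
def rectangle_area(width, height=1):   # width and height are parameters
    """Return the area of a rectangle."""
    return width * height

area_positional = rectangle_area(3, 4)             # 3 and 4 are arguments
area_keyword = rectangle_area(width=3, height=4)   # same arguments, passed by name
area_default = rectangle_area(5)                   # height falls back to its default, 1

print(area_positional, area_keyword, area_default)  # 12 12 5
```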

5

Python Case Sensitivity

Python is case sensitive: value and Value are different names, and True is not the same as true.

6

Negative Indexing

Counts from the end of a list: index -1 is the last element, -2 the second to last.
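
A minimal sketch of negative indexing on a hypothetical list of letters.

```python
letters = ["a", "b", "c", "d"]

last = letters[-1]          # 'd'
second_last = letters[-2]   # 'c'
last_two = letters[-2:]     # ['c', 'd']

# For a list of length n, index -k refers to the same item as index n - k.
same_item = letters[-1] == letters[len(letters) - 1]
print(last, second_last, last_two, same_item)  # d c ['c', 'd'] True
```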

7

df.head(n) method

Returns the first n rows of a DataFrame (n defaults to 5).

8

df.iloc[:n] indexer

Returns the rows at integer positions 0 through n-1 (position-based, regardless of index labels).

9

DataFrame Mean Calculation

df['column'].mean() calculates the arithmetic mean of a column, skipping NaN values by default.
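
The head, iloc, and mean cards can be tried together on a small DataFrame. The sketch below assumes pandas is installed; the column name score and the data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: four rows, one missing score, non-numeric index labels.
df = pd.DataFrame(
    {"score": [10.0, 20.0, None, 40.0]},
    index=["a", "b", "c", "d"],
)

first_two_head = df.head(2)      # first 2 rows
first_two_iloc = df.iloc[:2]     # rows at positions 0 and 1 (labels are ignored)
mean_score = df["score"].mean()  # NaN is skipped by default: (10 + 20 + 40) / 3

print(first_two_head.equals(first_two_iloc))  # True
print(round(mean_score, 2))                   # 23.33
```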

10

Pandas Import

Pandas is NOT part of the Python standard library: it must be installed separately and then imported (conventionally, import pandas as pd).

11

Supervised Learning

Learns from labeled data (inputs paired with known outputs); used for classification and regression.

12

Unsupervised Learning

Finds structure in unlabeled data (no known outputs); used for clustering and dimensionality reduction.

13

Binary Classification

Predicts one of exactly two classes (for example, spam vs. not spam).

14

Multi-class Classification

Predicts one of more than two mutually exclusive classes.

15

Multi-label Classification

Allows multiple labels per observation.
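
The difference between multi-class and multi-label targets shows up in how the labels are stored. A minimal sketch with hypothetical labels; the 0/1 indicator encoding shown is one common choice, not the only one.

```python
# Hypothetical labels for three observations.

# Multi-class: each observation gets exactly one of several classes.
multiclass_targets = ["cat", "dog", "bird"]

# Multi-label: each observation can carry zero, one, or several labels.
multilabel_targets = [{"comedy", "romance"}, {"action"}, set()]

# A common encoding for multi-label targets is one 0/1 indicator per label.
all_labels = sorted(set().union(*multilabel_targets))
indicator_rows = [
    [int(label in labels) for label in all_labels]
    for labels in multilabel_targets
]

print(all_labels)      # ['action', 'comedy', 'romance']
print(indicator_rows)  # [[0, 1, 1], [1, 0, 0], [0, 0, 0]]
```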

16

Classification Predictions

Predicts categories, not continuous values.

17

Accuracy in Model Evaluation

Fraction of all predictions that are correct; misleading on imbalanced data, where always predicting the majority class can still score high.
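
A small sketch of why accuracy can mislead on imbalanced data. The 95/5 class split and the always-negative "model" are assumptions chosen for illustration.

```python
# Hypothetical imbalanced data: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A "model" that always predicts the majority class.
y_pred = [0] * 100

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

positives_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(accuracy)         # 0.95
print(positives_found)  # 0 (not a single positive case was identified)
```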

18

Precision

Of all predicted positives, the fraction that are actually positive: TP / (TP + FP).

19

Recall

Of all actual positives, the fraction the model identifies: TP / (TP + FN).

20

F1-Score

Harmonic mean of precision and recall: 2 × precision × recall / (precision + recall).
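
Precision, recall, and F1 can be computed by hand from counts of true positives, false positives, and false negatives. The labels below are hypothetical, and the sketch assumes there is at least one predicted and one actual positive (otherwise the divisions would fail).

```python
# Hypothetical true labels and predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

precision = tp / (tp + fp)   # 2 / 3
recall = tp / (tp + fn)      # 2 / 4
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.5 0.571
```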

21

R² in Regression

Proportion of the variance in the target that the model explains.

22

RMSE

Root mean squared error: the square root of the average squared error. Penalizes large errors; same units as the target.

23

MAE

Mean absolute error: the average of the absolute errors; less sensitive to outliers than RMSE.
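
The three regression metrics can be computed by hand on a tiny example. The values are hypothetical; one prediction is deliberately far off to show RMSE reacting more strongly than MAE.

```python
import math

# Hypothetical actual and predicted values.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 13.0]

errors = [t - p for t, p in zip(y_true, y_pred)]   # [1, 0, -1, -4]
n = len(errors)

mae = sum(abs(e) for e in errors) / n              # (1 + 0 + 1 + 4) / 4
rmse = math.sqrt(sum(e ** 2 for e in errors) / n)  # sqrt((1 + 0 + 1 + 16) / 4)

mean_true = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)                # residual sum of squares
ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
r2 = 1 - ss_res / ss_tot

print(mae, round(rmse, 3), round(r2, 3))  # 1.5 2.121 0.1
```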

24

KNN

K-nearest neighbors: a supervised method that predicts from the k closest training points, typically measured by Euclidean distance.
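
A minimal from-scratch sketch of KNN classification by majority vote, assuming Euclidean distance and k = 3. The function name knn_predict and the training data are hypothetical; in practice a library implementation would normally be used.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k nearest training points."""
    distances = [
        (math.dist(point, query), label)   # Euclidean distance to the query
        for point, label in zip(train_points, train_labels)
    ]
    nearest = sorted(distances)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training data with two classes.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_points, train_labels, (2, 2)))  # low
print(knn_predict(train_points, train_labels, (6, 5)))  # high
```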

25

K-Means Clustering

An unsupervised clustering method that assigns each point to the nearest of K centroids, measured by Euclidean distance.
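
A minimal from-scratch sketch of the K-Means loop (assign points, then update centroids). The function name k_means, the fixed starting centroids, and the data are assumptions for illustration; real implementations usually pick starting centroids randomly and stop on convergence.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain K-Means: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid          # keep an empty cluster's centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two obvious groups; K = 2.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])

print(clusters[0])  # [(1, 1), (1, 2), (2, 1)]
print(clusters[1])  # [(8, 8), (9, 8), (8, 9)]
```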

26

Elbow Method & Silhouette Score

Heuristics that help choose the number of clusters (K).

27

Time Series Analysis Purpose

Analyzes data ordered in time, typically to forecast future values.

28

Time Series Components

Trend, Seasonality, Residual (Irregular).

29

Additive Model in Time Series

Components add together (trend + seasonality + residual); the seasonal magnitude stays constant over time.

30

Multiplicative Model in Time Series

Components multiply (trend × seasonality × residual); the seasonal magnitude is proportional to the trend level.
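
The contrast between additive and multiplicative seasonality can be seen by building both kinds of series from the same trend. All numbers below are hypothetical, and the residual component is left out so the seasonal pattern stays visible.

```python
# Hypothetical components: a rising trend and a repeating 4-period season.
trend = [100, 110, 120, 130, 140, 150, 160, 170]
additive_season = [10, -10, 5, -5] * 2                # absolute offsets
multiplicative_season = [1.10, 0.90, 1.05, 0.95] * 2  # factors around 1

# Residual omitted (0 for additive, 1 for multiplicative).
additive = [t + s for t, s in zip(trend, additive_season)]
multiplicative = [round(t * s, 1) for t, s in zip(trend, multiplicative_season)]

# Seasonal swing = value minus trend, for the first period of each cycle.
additive_swings = [additive[0] - trend[0], additive[4] - trend[4]]
multiplicative_swings = [
    round(multiplicative[0] - trend[0], 1),
    round(multiplicative[4] - trend[4], 1),
]

print(additive_swings)        # [10, 10] (constant)
print(multiplicative_swings)  # [10.0, 14.0] (grows with the trend)
```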

31

Forecasting Requirements

Requires predictors known at forecast time: lagged values of the series and time-based predictors (such as a time index or month indicators).
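
A minimal sketch of turning a series into rows with a lagged value and a time-based predictor. The sales figures and the column names time_index, lag_1, and target are hypothetical.

```python
# Hypothetical monthly series; values are made up for illustration.
sales = [12, 15, 14, 18, 21, 19]

rows = []
for t in range(1, len(sales)):
    rows.append({
        "time_index": t,        # time-based predictor (position in the series)
        "lag_1": sales[t - 1],  # lagged value: the previous period's observation
        "target": sales[t],     # value to be forecast
    })

print(rows[0])    # {'time_index': 1, 'lag_1': 12, 'target': 15}
print(len(rows))  # 5 (the first observation has no lag, so it is dropped)
```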

32

R² and Forecasting

High R² does NOT guarantee good forecasts: a model can fit historical data closely yet predict future, unseen periods poorly.

33

CRISP-DM Framework

Six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.

34

Four V's of Big Data

Volume, Velocity, Variety, Veracity.

35

Visualization Purposes

Used for exploratory analysis and storytelling.

36

Exam Strategy Tips

Match business questions to modeling approaches; use precise terminology; watch for default behaviors in Python and Pandas.