Logistic Regression
Predicts binary outcomes like yes/no or 0/1. (Like deciding if an email is spam.)
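A minimal sketch of how a logistic regression model could turn a weighted sum of features into a probability and then a yes/no label. The weights, bias, and the two spam features are made up for illustration; they are not a trained model.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict(features, weights, bias, threshold=0.5):
    """Return 1 (e.g. spam) if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical weights for two features: [number of links, count of "free"]
weights = [0.8, 1.2]
bias = -2.0

spam_like = predict([3, 2], weights, bias)  # many links and "free" mentions
ham_like = predict([0, 0], weights, bias)   # neither signal present
```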
Decision Tree
Splits data into branches based on rules. (Like asking "Is it raining?" then deciding to bring an umbrella.)
Random Forest
Collection of decision trees voting together. (Like asking multiple friends for advice.)
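The voting idea can be sketched with three hand-written rules standing in for trees. This only illustrates majority voting; a real random forest learns its trees from bootstrapped samples and random feature subsets. The feature names and thresholds are assumptions.

```python
from collections import Counter

# Three hypothetical hand-written "trees", each a single rule.
def tree_a(sample):
    return "umbrella" if sample["raining"] else "no umbrella"

def tree_b(sample):
    return "umbrella" if sample["humidity"] > 80 else "no umbrella"

def tree_c(sample):
    return "umbrella" if sample["cloud_cover"] > 0.7 else "no umbrella"

def forest_predict(sample, trees):
    """Collect one vote per tree and return the most common answer."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

sample = {"raining": False, "humidity": 85, "cloud_cover": 0.9}
decision = forest_predict(sample, [tree_a, tree_b, tree_c])
```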
Overfitting
Model memorizes the training data instead of learning general patterns. (Like memorizing practice questions, not concepts.)
Cross-Validation
Tests a model with different train/test splits. (Like practicing a speech with multiple audiences.)
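One common form of this is k-fold cross-validation. The sketch below only produces the index splits (no model is trained), and the helper name `k_fold_splits` is hypothetical.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

splits = list(k_fold_splits(n_samples=10, k=3))
```

Each sample lands in the test set exactly once, so every data point is used for both training and evaluation across the folds.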
Statistics
Study of collecting, analyzing, and interpreting data. (Like summarizing a classroom's test scores.)
Probability
Likelihood of an event occurring, expressed as a number between 0 and 1. (Like predicting heads on a coin flip.)
Hypothesis Testing
Assesses whether an observed result could plausibly be due to chance rather than a real effect. (Like testing if a new study method works.)
Correlation
Measures how strongly two variables move together. (Like linking hours of sleep to energy levels.)
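A minimal sketch of the Pearson correlation coefficient, one common way to measure this. The sleep and energy numbers are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both spreads, in [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data: hours of sleep vs. self-rated energy
sleep_hours = [5, 6, 7, 8, 9]
energy = [3, 4, 6, 7, 9]
r = pearson_r(sleep_hours, energy)
```

A value near 1 means the two rise together, near -1 means one falls as the other rises, and near 0 means little linear relationship.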
Linear Regression
Predicts a numeric value using a straight-line relationship. (Like estimating house price by size.)
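A sketch of fitting that straight line by ordinary least squares with a single feature. The house sizes and prices are made-up numbers.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: house size in square feet vs. sale price
sizes_sqft = [1000, 1500, 2000, 2500]
prices = [200_000, 290_000, 410_000, 500_000]

slope, intercept = fit_line(sizes_sqft, prices)
predicted = slope * 1800 + intercept  # estimate for an 1800 sq ft house
```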
Feature Engineering
Creating new variables to improve models. (Like combining height and weight into BMI.)
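The BMI example could look like this in code. The field names `height_m` and `weight_kg` are assumptions chosen for the sketch.

```python
# Made-up records with two raw features
people = [
    {"height_m": 1.70, "weight_kg": 65.0},
    {"height_m": 1.80, "weight_kg": 90.0},
]

def add_bmi(record):
    """Return a copy of the record with a derived 'bmi' feature."""
    enriched = dict(record)
    enriched["bmi"] = record["weight_kg"] / record["height_m"] ** 2
    return enriched

features = [add_bmi(p) for p in people]
```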
Regularization
Penalizes large weights to prevent overfitting. (Like discouraging overcomplicated answers.)
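A sketch of one common variant, L2 (ridge) regularization: the loss is the usual error plus a penalty that grows with the squared weights. The penalty strength `lam` and the weight values are illustrative assumptions.

```python
def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def ridge_loss(preds, targets, weights, lam):
    """Error term plus an L2 penalty on the model weights."""
    penalty = lam * sum(w * w for w in weights)
    return mse(preds, targets) + penalty

# Two hypothetical models that fit the data equally well (zero error)
preds = [2.0, 4.0]
targets = [2.0, 4.0]

small = ridge_loss(preds, targets, weights=[0.5, -0.5], lam=0.1)
large = ridge_loss(preds, targets, weights=[5.0, -5.0], lam=0.1)
```

With equal fit, the model with larger weights gets the higher loss, so training is nudged toward the simpler one.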
PCA
Reduces data dimensions while keeping patterns. (Like turning a long movie into a short trailer.)
Python
Versatile language for data science and ML. (Like a Swiss Army knife for coding.)
R
Language for statistics and visualization. (Like Excel but programmable.)
SQL
Queries and manages structured databases. (Like asking a librarian for specific books.)
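A small sketch of the librarian analogy, using Python's built-in `sqlite3` module with an in-memory database. The `books` table and its columns are hypothetical.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        ("Dune", "Frank Herbert", 1965),
        ("Emma", "Jane Austen", 1815),
        ("Neuromancer", "William Gibson", 1984),
    ],
)

# Ask for only the books published after 1950, oldest first
rows = conn.execute(
    "SELECT title FROM books WHERE year > ? ORDER BY year", (1950,)
).fetchall()
conn.close()
```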
Bash/Shell
Automates commands and tasks. (Like giving your computer a to-do list.)
Java
Common backend and big-data language. (Like a sturdy foundation for big apps.)
NumPy
Fast numerical computation in Python. (Like a high-speed calculator.)
pandas
Data organization and analysis in Python. (Like coding with spreadsheets.)
matplotlib
Makes static charts. (Like drawing graphs with code.)
seaborn
Stylish, statistical visualizations. (Like decorating charts.)
scikit-learn
Library for ML algorithms. (Like a toolbox for prediction.)
TensorFlow
Deep learning framework by Google. (Like building neural nets with LEGO.)
PyTorch
Flexible deep learning library by Meta. (Like an experimental lab for AI.)
Keras
Simplified deep learning interface. (Like a friendly front-end for TensorFlow.)
Statsmodels
Statistical modeling in Python. (Like R's regression functions.)
SciPy
Scientific computing and optimization. (Like a scientific calculator.)
XGBoost
Gradient boosting library for fast ML. (Like turbocharging predictions.)
LightGBM
Lightweight, fast boosting model. (Like XGBoost's speedy cousin.)
CatBoost
Boosting library that handles categorical data. (Like XGBoost that understands text labels.)
Data Cleaning
Fixing or removing incorrect or missing data. (Like tidying a messy spreadsheet.)
Missing Value Imputation
Filling blanks with averages or predictions. (Like guessing skipped test answers.)
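A minimal sketch of mean imputation, the simplest of these strategies. Missing entries are represented as `None` here, which is an assumption about the data format.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# Made-up test scores with two missing entries
scores = [80, None, 90, None, 70]
filled = impute_mean(scores)
```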
Outlier Detection
Finding extreme or unusual values. (Like spotting a runner at 100 mph.)
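One common rule of thumb flags values more than 1.5 times the interquartile range (IQR) outside the middle half of the data. The sketch below assumes that rule; the running speeds are invented.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up running speeds in mph, with one impossible reading
speeds = [6, 7, 7, 8, 8, 9, 9, 10, 100]
outliers = iqr_outliers(speeds)
```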
One-Hot Encoding
Turning categories into 0/1 columns. (Like checkboxes for ice cream flavors.)
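A sketch of the checkbox idea in plain Python. Sorting the categories to fix the column order is a choice made for this example, not a requirement.

```python
def one_hot(values):
    """Return (categories, rows), one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

flavors = ["vanilla", "chocolate", "vanilla", "mint"]
categories, encoded = one_hot(flavors)
```

Each row contains exactly one 1, marking which category that value belongs to.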
Label Encoding
Assigning numbers to categories. (Like small=1, medium=2, large=3.)
Normalization
Scaling data to the range 0 to 1. (Like matching different music volumes.)
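A minimal sketch of min-max scaling, the usual way to do this. Returning all zeros for constant input is an assumption made here to avoid dividing by zero.

```python
def min_max_scale(values):
    """Map the smallest value to 0, the largest to 1, the rest in between."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: no spread to scale, so return zeros (a convention)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

volumes = [20, 50, 80]
scaled = min_max_scale(volumes)
```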
Standardization
Rescaling data to mean 0 and standard deviation 1. (Like grading on a curve.)
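A sketch of the z-score transform behind this card. It uses the population standard deviation; using the sample standard deviation instead is an equally common choice.

```python
import statistics

def standardize(values):
    """Subtract the mean, then divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Made-up exam scores
scores = [60, 70, 80, 90, 100]
z_scores = standardize(scores)
```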
ETL
Extract, Transform, Load process. (Like shopping, washing, and storing groceries.)
API Data Retrieval
Pulling data via web APIs. (Like ordering takeout and getting a meal back.)
Web Scraping
Collecting data from websites. (Like copying all prices from a store automatically.)
Data Integration
Combining sources into one view. (Like merging grades and attendance into one record.)
Data Transformation
Changing data format or shape. (Like rearranging ingredients before cooking.)
SQL Databases
Store structured tables of data. (Like labeled filing cabinets.)
NoSQL Databases
Handle flexible, semi-structured or unstructured data. (Like a box of sticky notes with loose rules.)
MySQL/PostgreSQL
Popular SQL database systems. (Like reliable file cabinets.)
MongoDB
NoSQL document database. (Like digital folders with flexible fields.)
Hadoop
Distributed framework for storing and processing large datasets. (Like splitting a huge book among friends.)
Spark
Engine for fast distributed processing. (Like Hadoop's faster cousin.)
Databricks
Collaborative platform around Spark. (Like Google Docs for big-data code.)
Airflow
Automates and schedules data workflows. (Like a calendar that runs scripts for you.)
Kafka
Real-time data streaming system. (Like a conveyor belt for continuous updates.)
Snowflake
Cloud data warehouse. (Like an online filing cabinet with unlimited space.)
BigQuery
Google's serverless warehouse. (Like querying billions of rows instantly.)
AWS S3
Amazon's cloud storage. (Like an infinite online drive.)
ETL Pipelines
Automated data movement processes. (Like an assembly line for cleaning and storing data.)
Tableau
Interactive dashboards for storytelling. (Like turning spreadsheets into visual stories.)
Power BI
Microsoft tool for business reports. (Like Tableau inside the Office suite.)
matplotlib
Static Python charting library. (Like drawing with precise tools.)
seaborn
Statistical visualization library. (Like adding style to graphs.)
Plotly
Interactive plotting library. (Like hoverable, zoomable charts.)
ggplot2
R library for layered plots. (Like building art with data.)
Looker
Cloud BI and data exploration tool. (Like real-time dashboards for teams.)
D3.js
JS library for web visuals. (Like coding interactive art in a browser.)
Excel/Sheets Charts
Basic built-in graphing. (Like sketching drafts before full dashboards.)
AWS
Cloud platform for computing, ML, and storage. (Like renting virtual computers.)
AWS S3
Cloud storage for any data type. (Like an online hard drive.)
AWS EC2
Virtual servers for running apps. (Like borrowing a supercomputer.)
AWS Lambda
Serverless code execution. (Like lights turning on automatically.)
AWS SageMaker
ML development platform. (Like a full lab for model training.)
GCP
Google's cloud service. (Like running apps on Google's servers.)
BigQuery
Serverless SQL data warehouse. (Like analyzing data at lightning speed.)
Vertex AI
Managed ML platform by Google. (Like an AI workshop that sets itself up.)
Azure
Microsoft's cloud environment. (Like Windows in the cloud.)
Azure ML Studio
Drag-and-drop ML builder. (Like Lego blocks for models.)
Azure Synapse
Data integration and analytics. (Like joining multiple lakes into one.)
Docker
Packages apps and dependencies. (Like a lunchbox that works anywhere.)
Kubernetes
Manages many Docker containers. (Like an orchestra conductor.)
Flask
Lightweight Python web framework. (Like a simple café website in code.)
FastAPI
High-speed API framework. (Like Flask but faster.)
Streamlit
Turns Python scripts into web apps. (Like instant dashboards.)
MLflow
Tracks ML experiments and models. (Like a lab notebook for ML runs.)
MLOps
Managing ML model lifecycle. (Like DevOps for data models.)
CI/CD
Automates testing and deployment. (Like a conveyor belt for code.)
MLflow
Tracks experiments and versions. (Like logging every experiment.)
Kubeflow
Runs ML pipelines on Kubernetes. (Like an automated assembly line.)
DVC
Version control for data/models. (Like saving checkpoints in a game.)
Git/GitHub
Version control and collaboration. (Like Google Docs for code.)
Jenkins
Automates build/test pipelines. (Like a robot that checks every change.)
Airflow
Schedules and monitors data tasks. (Like autopilot for workflows.)
Docker
Packages ML environments. (Like shipping your model in a safe box.)
Kubernetes
Scales and manages containers. (Like traffic control for servers.)
Excel
Spreadsheet for data analysis. (Like a digital notebook of formulas.)
Google Sheets
Collaborative spreadsheets. (Like shared Excel in the cloud.)
Tableau
Visual dashboards for insights. (Like colorful stories from data.)
Power BI
Microsoft BI dashboards. (Like Tableau with Excel integration.)
Google Data Studio (now Looker Studio)
Free Google dashboard builder. (Like linking Analytics and Sheets visually.)
SAS
Enterprise analytics platform. (Like R + Python for corporations.)
SPSS
Statistical tool for research. (Like point-and-click regression.)
Alteryx
No-code data prep and blending tool. (Like puzzle pieces that snap together.)