AI (artificial intelligence)
AI is the broad field of making computers do things that normally require human intelligence—like problem-solving, learning, or understanding language. Machine learning and deep learning are part of AI.
Classic Machine Learning
Classic machine learning refers to the older, statistics-based algorithms that dominated ML before deep learning took off. Examples: linear regression, logistic regression, decision trees, random forests, and Naïve Bayes.
Data Science
Data Science is the study of how we collect, organize, and analyze data to find patterns, answer questions, and make better decisions. It’s like being a detective for data — you take lots of messy information, clean it up, use math and computer tools to explore it, and then explain what it means in a way people can understand.
DataFrame
A DataFrame is a table of data (like a spreadsheet) used in Python, especially with the pandas library. It has rows and columns where you can store and organize data.
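As a small illustration of the rows-and-columns idea, here is a minimal sketch that builds a DataFrame with pandas. The house data and column names are made up for this example.

```python
import pandas as pd

# Hypothetical house data, invented for illustration.
houses = pd.DataFrame({
    "sqft": [1000, 1500, 2000],
    "bedrooms": [2, 3, 4],
    "price": [200_000, 300_000, 400_000],
})

n_rows, n_cols = houses.shape        # (number of rows, number of columns)
column_names = list(houses.columns)  # the column labels
first_row = houses.iloc[0]           # select one row by position
sqft_column = houses["sqft"]         # select one column by name
```

Each dictionary key becomes a column and each position in the lists becomes a row.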
Pandas
Pandas is a Python library used to handle and analyze data. It helps you work with data tables (DataFrames), clean data, and do calculations easily.
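A minimal sketch of the "clean data and do calculations" part, assuming a made-up table with one missing value. Dropping incomplete rows is only one possible cleaning choice, picked here to keep the example short.

```python
import pandas as pd

# Hypothetical raw data with one missing square-footage value.
raw = pd.DataFrame({
    "sqft": [1000, None, 2000],
    "price": [200_000, 250_000, 400_000],
})

clean = raw.dropna()                # drop rows that contain missing values
mean_price = clean["price"].mean()  # average price of the remaining rows
```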
NumPy
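NumPy is a Python library for fast math on arrays (grids of numbers, such as a list of values or a table of rows and columns). Its operations work on a whole array at once, which is much faster than looping over a Python list, so it is widely used in machine learning and scientific computing.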
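A short sketch of array math with NumPy. The numbers are made up; the point is that one operation applies to every element without writing a loop.

```python
import numpy as np

# Hypothetical house sizes in square feet.
sizes = np.array([1000.0, 1500.0, 2000.0])

scaled = sizes / 1000     # divides every element at once
total = sizes.sum()       # adds up all elements
mean_size = sizes.mean()  # average of all elements
```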
Scikit-learn
scikit-learn is a popular Python library for building machine learning models. It makes it easy to train models, test them, and use them for predictions.
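A minimal sketch of the train-then-predict workflow, using scikit-learn's `LinearRegression`. The data is invented so that price is exactly 200 times the square footage, which makes the result easy to check by hand.

```python
from sklearn.linear_model import LinearRegression

# Hypothetical data: one feature (square footage) per house.
X = [[1000], [1500], [2000], [2500]]
y = [200_000, 300_000, 400_000, 500_000]

model = LinearRegression()
model.fit(X, y)                          # train the model
predicted = model.predict([[3000]])[0]   # predict the price of a new house
```

Most scikit-learn models follow this same pattern: create the model, call `fit`, then call `predict`.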
Feature (X)
A feature is an input variable used to make predictions. It's something you know. In math, it's usually labeled X. Example: If you're predicting house prices, features could be square footage or number of bedrooms.
Target (y)
The target is what you're trying to predict. It's the answer you want the machine to learn. In math, it's usually called y. Example: If your features are house size and age, your target might be the house price.
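To make the split between features and target concrete, here is a sketch that separates X and y from a pandas DataFrame. The table and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical house data, invented for illustration.
houses = pd.DataFrame({
    "sqft": [1000, 1500, 2000],
    "bedrooms": [2, 3, 4],
    "price": [200_000, 300_000, 400_000],
})

X = houses[["sqft", "bedrooms"]]  # features: what you know
y = houses["price"]               # target: what you want to predict
```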
Linear Regression
Linear regression is a machine learning method that fits a straight-line relationship between inputs (X) and the output (y). Think of it like drawing the best-fit line through a scatter plot of points.
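For a single input, the best-fit line can be computed directly with the standard least-squares formulas. The sketch below uses made-up points that lie exactly on y = 2x + 1, so the fitted line should recover those numbers.

```python
# Hypothetical points that lie exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Least-squares slope: how x and y vary together, divided by how x varies.
weight = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
# The best-fit line always passes through the point (mean_x, mean_y).
bias = mean_y - weight * mean_x
```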
Coefficient (Weight and Bias)
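Weight: A number that tells how much the prediction changes when a feature goes up by one unit. It is the slope of the line. Bias: A number that shifts the line up or down. It is the prediction when every feature is 0 (the intercept). Together, they form the equation of the line: y = weight * x + bias.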
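The line equation can be tried out directly. In this sketch the weight and bias values are hypothetical, chosen only to show what each one does to the prediction.

```python
def predict(x, weight, bias):
    """Return the line's prediction: y = weight * x + bias."""
    return weight * x + bias

weight = 2.0  # hypothetical slope
bias = 1.0    # hypothetical intercept

at_zero = predict(0.0, weight, bias)   # equals the bias
at_three = predict(3.0, weight, bias)
at_four = predict(4.0, weight, bias)   # one unit more x adds one weight
```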
Training Data
Training data is the part of your dataset that you use to teach your machine learning model. The model learns patterns from this data.
Validation Data
Validation data is a separate set of data used during training to check how well the model is learning. It helps you tune and improve the model before final testing.
Testing Data
Testing data is used to check how well your trained model works once training is finished. It's data the model has never seen before, so it gives an honest estimate of how the model will perform on new, real-world data.
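One way to produce the training, validation, and testing sets is to shuffle the data and slice it. The 70/15/15 split below is an assumption for illustration, not a fixed rule, and the integers stand in for real rows of data.

```python
import random

examples = list(range(20))  # stand-ins for 20 rows of data

rng = random.Random(0)      # fixed seed so the shuffle is repeatable
shuffled = examples[:]
rng.shuffle(shuffled)

n_train = len(shuffled) * 70 // 100  # 70% for training
n_val = len(shuffled) * 15 // 100    # 15% for validation

train = shuffled[:n_train]
validation = shuffled[n_train:n_train + n_val]
test = shuffled[n_train + n_val:]    # whatever remains is for testing
```

Shuffling first keeps any ordering in the original data from ending up in only one of the three sets.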
Mean Squared Error (MSE)
MSE is a number that shows how wrong your model’s predictions are. It is the average of the squared differences between the predicted values and the actual values. Squaring makes every error positive and makes big errors count more. Lower MSE means the predictions are closer to the actual values.
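The calculation can be written in a few lines. The actual and predicted values here are made up so the arithmetic is easy to follow.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between predictions and actual values."""
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)

actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]

# Differences are -1, 0, 2, 0, so the squared errors are 1, 0, 4, 0.
mse = mean_squared_error(actual, predicted)
```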
Loss (Minimizing Loss)
Loss is how far off the model’s predictions are from the real answers. The goal of training is to minimize the loss (make the error as small as possible).
Machine Learning
Machine learning is when a computer learns from data to make predictions or decisions—without being directly programmed for every task.
Learning Rate
The learning rate controls how big a step the model takes each time it updates itself during training. Too high = the updates overshoot the best solution and may never settle. Too low = training takes a very long time.
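A toy experiment can show all three cases. This sketch assumes a very simple loss, loss(w) = w**2, whose best value is w = 0, and updates w by stepping against the slope. The three learning rates are picked only to make the effect visible.

```python
def run(learning_rate, steps=20, start=1.0):
    """Repeatedly update w on loss(w) = w**2, whose slope is 2 * w."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w

w_good = run(0.1)        # moves steadily toward 0
w_too_high = run(1.1)    # overshoots 0 on every step and moves farther away
w_too_low = run(0.001)   # barely moves from the starting value
```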
Gradient Descent
Gradient descent is a method the computer uses to reduce error (loss) by adjusting weights and bias step-by-step. It's how the model learns.
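Here is a minimal sketch of gradient descent fitting a line, assuming MSE as the loss. The data, starting values, learning rate, and step count are all chosen for illustration. The points lie exactly on y = 2x + 1, so the weight should end up near 2 and the bias near 1.

```python
# Hypothetical points on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

weight, bias = 0.0, 0.0   # start with a bad guess
learning_rate = 0.05
n = len(xs)

for _ in range(2000):
    # How far off each prediction currently is.
    errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
    # Slope of the MSE loss with respect to weight and bias.
    grad_weight = 2 / n * sum(e * x for e, x in zip(errors, xs))
    grad_bias = 2 / n * sum(errors)
    # Step in the direction that lowers the loss.
    weight -= learning_rate * grad_weight
    bias -= learning_rate * grad_bias
```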
Convergence
Convergence happens when training stops changing much because the model has found the best (or a good enough) solution. Think of it as “the learning has settled.”
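One common way to detect that "the learning has settled" is to stop when the loss barely changes between steps. This sketch assumes a toy loss, loss(w) = (w - 3)**2, and a hypothetical tolerance of 1e-9.

```python
def loss(w):
    return (w - 3.0) ** 2   # smallest when w = 3

def gradient(w):
    return 2.0 * (w - 3.0)  # slope of the loss at w

w = 0.0
learning_rate = 0.1
tolerance = 1e-9            # "not changing much" threshold (an assumption)
max_steps = 1000

previous_loss = loss(w)
steps_taken = 0
converged = False
for step in range(1, max_steps + 1):
    w -= learning_rate * gradient(w)
    current_loss = loss(w)
    steps_taken = step
    if abs(previous_loss - current_loss) < tolerance:
        converged = True    # the loss has stopped changing much
        break
    previous_loss = current_loss
```

The step limit is a safety net in case the loss never settles.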
Supervised Data
Supervised data is data that comes with labels (correct answers). In supervised learning, the model learns by comparing its predictions to those labels.
Unsupervised Data
Unsupervised data is data that has no labels. In unsupervised learning, the model looks for patterns or groups on its own, like sorting customers into groups by behavior without being told what the groups are.
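As a sketch of finding groups without labels, the code below runs a tiny version of k-means clustering with two groups on one-dimensional data. K-means is not named in these cards; it is used here only as one familiar example of the idea, and the spending values are made up.

```python
# Hypothetical unlabeled data: monthly spending for six customers.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]

# Start the two group centers at the smallest and largest value.
centers = [min(values), max(values)]

for _ in range(10):
    groups = [[], []]
    for v in values:
        # Put each value in the group whose center is closest.
        nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
        groups[nearest].append(v)
    # Move each center to the average of its group.
    centers = [sum(g) / len(g) for g in groups]

low_spenders, high_spenders = groups
```

No one told the code which customers belong together; the groups come from the values alone. The names given to the two groups at the end are added by a person afterward.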