These flashcards cover the core vocabulary and concepts introduced across Chapters 1–10 of the lecture notes, including AI/ML/DL distinctions, data concepts, Python/Pandas/NumPy essentials, and ML modeling ideas. Use them to reinforce key terms and their definitions ahead of the exam.
Artificial Intelligence (AI)
A broad field of computer science focused on creating systems that exhibit intelligent behavior; includes rule-based systems, machine learning, and deep learning.
Machine Learning (ML)
A subset of AI that learns patterns from data to make predictions or decisions, using algorithms like linear regression, SVMs, and decision trees.
Deep Learning (DL)
A subset of ML based on artificial neural networks with many layers that learn hierarchical representations from data.
Generative AI
A form of AI focused on generating new data (e.g., text, images); often used colloquially to refer to models like GPT-based systems.
Turing Test
A test proposed by Alan Turing to determine whether a machine can exhibit conversational behavior indistinguishable from that of a human.
Dartmouth Conference (1956)
The conference that helped establish AI as a formal field of study.
Perceptron (1957)
One of the earliest neural network models; a fundamental building block for later neural networks.
Neural Network
A computing system inspired by the brain, consisting of interconnected units (neurons) that learn from data.
Ground Truth
The actual, correct label or outcome used to supervise or evaluate a model’s predictions.
Feature
A measurable property or input clue used by a model to make predictions.
Label
The target value or category that a model tries to predict (the ground truth).
Weights
Learned parameters that scale each feature's contribution to the predicted output; a larger absolute weight means greater influence.
Bias (Intercept)
An offset term in a linear model that shifts the fitted line or decision boundary, so the model is not forced to pass through the origin.
Loss Function
A function that quantifies prediction error; used to guide the optimization process to improve model accuracy.
Gradient Descent
An optimization method that iteratively updates weights in the direction that reduces the loss, taking small steps toward smaller error (often described with a hot/cold analogy in class).
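The Weights, Bias, Loss Function, and Gradient Descent cards can be tied together in a minimal sketch. It assumes a one-feature linear model, mean squared error as the loss, and made-up toy data and learning rate; none of these specifics come from the lecture notes.

```python
import numpy as np

# Toy data generated from y = 2x + 1 (illustrative values, not from the notes).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

def mse_loss(w, b):
    """Mean squared error of the linear model w*x + b on the toy data."""
    return float(np.mean((w * x + b - y) ** 2))

w, b = 0.0, 0.0        # weight and bias both start at zero
learning_rate = 0.05   # step size (an assumed value)

for _ in range(2000):
    error = w * x + b - y
    grad_w = 2.0 * np.mean(error * x)  # derivative of the loss w.r.t. w
    grad_b = 2.0 * np.mean(error)      # derivative of the loss w.r.t. b
    w -= learning_rate * grad_w        # step against the gradient
    b -= learning_rate * grad_b

final_loss = mse_loss(w, b)
print(w, b, final_loss)
```

With these settings the weight and bias should approach the values used to generate the data (2 and 1), and the loss should approach zero.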
Garbage In, Garbage Out (GIGO)
The principle that poor quality input data or labels lead to poor model performance.
Overfitting / Overtraining
When a model learns noise and specifics of the training data too well, resulting in poor generalization to new data.
Python
The programming language used in the course for implementing ML pipelines.
Pandas
A Python library for data manipulation and analysis, centered on DataFrame and Series structures.
DataFrame
A Pandas data structure for storing tabular data with labeled axes (rows and columns).
DataFrame.head()
A Pandas method that returns the first n rows of a DataFrame (default n = 5).
DataFrame.info()
A Pandas method that prints a concise summary of a DataFrame: column names, data types, non-null counts, and memory usage.
DataFrame.describe()
A Pandas method that returns summary statistics (count, mean, std, min, max, quartiles) for numeric columns by default.
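The three inspection methods above can be tried on a small DataFrame. This is a sketch with invented column names and values, chosen only for illustration.

```python
import pandas as pd

# A small illustrative DataFrame (hypothetical columns and values).
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0, 200.0],
    "label": ["a", "b", "a", "b", "a", "b"],
})

first_rows = df.head()    # first 5 rows by default
first_two = df.head(2)    # pass n to change the number of rows
df.info()                 # prints dtypes and non-null counts; returns None
summary = df.describe()   # statistics for numeric columns only by default

print(first_rows)
print(summary)
```

Note that `head()` and `describe()` return new objects, while `info()` prints its report directly.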
Display (Pandas)
An IPython/Jupyter function (display), not part of Pandas itself, that renders DataFrames as readable, formatted tables in a notebook.
read_csv
A Pandas function to read data from a CSV (Comma-Separated Values) file into a DataFrame.
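A minimal sketch of `read_csv` follows. To stay self-contained it reads from an in-memory string via `io.StringIO` rather than a file on disk; the column names and values are made up.

```python
import io
import pandas as pd

# CSV content held in memory so the example needs no file on disk.
csv_text = "name,score\nAda,90\nGrace,85\n"

# read_csv accepts a path or a file-like object; the first row becomes the header.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

In practice you would usually pass a file path instead of the `StringIO` object.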
read_excel
A Pandas function to read data from an Excel file (e.g., .xlsx) into a DataFrame.
NumPy
A Python library for numerical computing, providing multi-dimensional arrays and operations on them.
Array
A core data structure in NumPy representing a grid of values (1D, 2D, etc.).
Matrix Multiplication
The operation of multiplying two matrices whose dimensions are compatible (the number of columns in the first equals the number of rows in the second); in NumPy, done with np.dot, np.matmul, or the @ operator.
Zeros / Ones / Identity (NumPy)
NumPy functions (np.zeros, np.ones, np.eye) that create arrays of zeros, arrays of ones, or an identity matrix, often used to initialize computations.
Shape
A tuple giving the size of a NumPy array along each dimension, e.g., (rows, columns) for a 2D array; accessed via the shape attribute.
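The NumPy cards above (arrays, constructors, matrix multiplication, and shape) can be seen together in a short sketch. The matrix values are arbitrary.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2D array with shape (2, 3)
B = np.ones((3, 2))         # all ones, shape (3, 2)
I = np.eye(2)               # 2x2 identity matrix
Z = np.zeros((2, 2))        # all zeros, shape (2, 2)

# (2, 3) @ (3, 2) is compatible: inner dimensions match, result is (2, 2).
product = A @ B

print(A.shape, B.shape, product.shape)
print(product)
```

Multiplying `product` by the identity `I` leaves it unchanged, and `np.dot(A, B)` gives the same result as `A @ B` for 2D arrays.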
Indexing / Slicing
Accessing elements or ranges of an array or DataFrame by position (e.g., arr[0], arr[1:4], df.iloc) or by label (df.loc); negative indices count from the end.
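A brief sketch of position-based and label-based access follows, using invented values and index labels.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40, 50])
last = arr[-1]      # negative index counts from the end
middle = arr[1:4]   # slice: positions 1, 2, 3 (the stop index is excluded)

# Hypothetical DataFrame with string row labels.
df = pd.DataFrame({"score": [90, 85, 70]}, index=["ada", "grace", "alan"])
by_label = df.loc["grace", "score"]   # label-based access
by_position = df.iloc[-1, 0]          # position-based access: last row, first column

print(last, middle, by_label, by_position)
```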
Random Seed
A value used to initialize a pseudo-random number generator to ensure reproducible results.
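As a small illustration, two NumPy generators created with the same seed should produce identical draws. The seed value 42 is arbitrary.

```python
import numpy as np

# Two generators initialized with the same (arbitrary) seed.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draws_a = rng_a.random(3)   # three floats in [0, 1)
draws_b = rng_b.random(3)

# Same seed, same sequence: this is what makes results reproducible.
print(np.array_equal(draws_a, draws_b))
```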
Jupyter Notebook / JupyterLab
Interactive environments for writing and executing Python, especially useful for data analysis and visualization.
Conda / Virtual Environments
Tools to create isolated Python environments with specific package versions to avoid conflicts.
Matplotlib Visualizations
A plotting library (often used with Pandas) for creating graphs like scatter plots, line plots, histograms, and more.
GroupBy / Apply (Pandas)
DataFrame operations for aggregating data by groups and applying custom functions to groups.
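A minimal sketch of grouping and applying a custom function, with made-up team names and point values:

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["red", "red", "blue", "blue"],
    "points": [1, 3, 10, 20],
})

# Built-in aggregation: mean of points within each team.
mean_by_team = df.groupby("team")["points"].mean()

# Custom function applied per group: spread between max and min.
range_by_team = df.groupby("team")["points"].apply(lambda s: s.max() - s.min())

print(mean_by_team)
print(range_by_team)
```

Both results are Series indexed by the group key (`team`).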
Supervised Learning
A type of machine learning where the model learns from a dataset with labeled examples (features and corresponding ground truth labels) to make predictions on new, unlabeled data.
Unsupervised Learning
A type of machine learning that works with unlabeled data to find hidden patterns, structures, or relationships within the data, such as clustering or dimensionality reduction.
Activation Function
A function that determines the output of a neuron in a neural network; it introduces non-linearity, allowing the network to learn complex patterns.
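Two commonly used activation functions, ReLU and sigmoid, are sketched below as examples. The card does not name specific functions, so these choices are an assumption.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Logistic sigmoid: squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])   # example pre-activation values
relu_out = relu(z)
sigmoid_out = sigmoid(z)

print(relu_out)
print(sigmoid_out)
```

Neither function is a straight line through its inputs, which is the non-linearity the definition refers to.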
Backpropagation
An algorithm used to train neural networks: it computes the gradient of the loss function with respect to each weight by propagating errors backward through the layers, and those gradients are then used to adjust the weights iteratively.
Training Set
The portion of the dataset used to train the machine learning model, where the model learns patterns and relationships.
Validation Set
A portion of the dataset held out from training and used to tune hyperparameters and monitor performance during training, helping to detect overfitting.
Test Set
A completely independent portion of the dataset used to evaluate the final performance of a trained model on unseen data, assessing its generalization ability.
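The three splits can be sketched by shuffling row indices and slicing them. The 70/15/15 proportions, the sample count, and the seed are assumptions for illustration, not values from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the split is reproducible
n_samples = 100                  # hypothetical dataset size

# Shuffle the row indices, then cut them into three non-overlapping parts.
indices = rng.permutation(n_samples)
train_idx = indices[:70]     # 70% for training
val_idx = indices[70:85]     # 15% for validation (hyperparameter tuning)
test_idx = indices[85:]      # 15% held back for the final evaluation

print(len(train_idx), len(val_idx), len(test_idx))
```

Each index appears in exactly one split, so the test rows are never seen during training or tuning.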