715 all vocab

Last updated 10:47 PM on 6/30/26

78 Terms

ACID

Properties that ensure reliable database transactions: Atomicity, Consistency, Isolation, Durability.

Atomicity

A transaction happens completely or not at all (all or nothing).
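
As a minimal sketch of the all-or-nothing idea, the snippet below uses Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the names `alice` and `bob` are hypothetical and exist only for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # the debit is rolled back
```

The failed transfer leaves both balances exactly as they were, because the first `UPDATE` is undone when the exception triggers the rollback.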

Consistency

A transaction moves the database from one valid state to another.

Isolation

Transactions do not interfere with one another.

Durability

Committed data remains even after a crash.

CAP Theorem

A distributed system can guarantee at most two of the following three properties at the same time:

Consistency
Availability
Partition Tolerance

OLTP

Processes many small, real-time transactions.

OLAP

Analyzes large amounts of historical data for reporting and decision-making.

SQL Database

A relational database that stores data in tables with fixed schemas.

NoSQL Database

A non-relational database designed for flexible, scalable storage of structured or unstructured data.

Throughput

The amount of work completed per unit of time (for example, requests per second).

Latency

Waiting time: the time required to get a response to a request.

Horizontal Scaling

Adds more servers to increase capacity. Capacity can grow faster than with a single machine, and this approach is the foundation of distributed systems and NoSQL databases.

Vertical Scaling

Adds more CPU, RAM, or storage to one server.

Replication

Creates copies of data on multiple servers for reliability and availability.

Sharding

Splits data across multiple servers to improve scalability.
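
One common way to decide which server holds a record is to hash its key. The sketch below assumes simple hash-modulo routing over three in-memory dictionaries standing in for servers; `shard_for`, `put`, and `get` are hypothetical helper names, and real systems often use other schemes such as range-based or consistent hashing.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Three dictionaries stand in for three separate servers.
shards = [dict() for _ in range(3)]


def put(key, value):
    """Store the value on the shard that owns this key."""
    shards[shard_for(key, len(shards))][key] = value


def get(key):
    """Look the key up on the one shard that can contain it."""
    return shards[shard_for(key, len(shards))].get(key)


for user_id in ["user-1", "user-2", "user-3", "user-4"]:
    put(user_id, {"id": user_id})
```

Each read touches only one shard, which is what lets capacity grow as shards are added.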

Hadoop

A framework for storing and processing large datasets across many computers.

HDFS

Hadoop Distributed File System; stores large datasets across multiple machines.

Map

Processes input data into key-value pairs.

Reduce

Combines intermediate results into a final output.
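
The Map and Reduce steps can be illustrated with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop code; the function names and the explicit `shuffle` step that groups values by key are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a final result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["big data big", "Data lake"])
```

In a real cluster the map and reduce calls would run in parallel on different machines; here they run sequentially to keep the data flow visible.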

Spark

A distributed computing framework that processes data primarily in memory for high speed.

ETL

Extract, Transform, Load.

Data is transformed before loading.

ELT

Extract, Load, Transform.

Data is transformed after loading.

Data Warehouse

Stores structured, cleaned data for analytics.

Data Lake

Stores raw structured and unstructured data.

Graph

A data structure made of nodes connected by relationships.

Node

An entity in a graph.

Relationship (Edge)

A connection between two nodes.

Property

Information stored on a node or relationship.

Neo4j

A native graph database.

Cypher

Neo4j's query language.

Index-Free Adjacency

Nodes directly reference neighboring nodes, enabling fast graph traversal.

BFS (Breadth-First Search)

A graph traversal algorithm that explores level by level using a queue and finds shortest paths in unweighted graphs.
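
A minimal sketch of BFS finding a shortest path, assuming the graph is an adjacency-list dictionary (the node names are made up for illustration):

```python
from collections import deque


def bfs_shortest_path(graph, start, goal):
    """Return a shortest path (fewest edges) from start to goal, or None."""
    if start == goal:
        return [start]
    parent = {start: None}   # also serves as the visited set
    queue = deque([start])   # FIFO queue drives the level-by-level order
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                if neighbor == goal:
                    # Walk the parent links back to the start.
                    path = []
                    current = goal
                    while current is not None:
                        path.append(current)
                        current = parent[current]
                    return path[::-1]
                queue.append(neighbor)
    return None


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
path = bfs_shortest_path(graph, "A", "E")
```

Because nodes are discovered in order of their distance from the start, the first time the goal is reached is along a path with the fewest edges.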

DFS (Depth-First Search)

A graph traversal algorithm that explores as deep as possible using a stack (or recursion).
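
A minimal sketch of DFS with an explicit stack, again assuming an adjacency-list dictionary with made-up node names:

```python
def dfs_order(graph, start):
    """Return nodes in the order an iterative depth-first search visits them."""
    visited = []
    seen = set()
    stack = [start]  # LIFO stack: the most recently found node is explored next
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push neighbors in reverse so the first-listed neighbor is popped first.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in seen:
                stack.append(neighbor)
    return visited


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
order = dfs_order(graph, "A")
```

The search follows A, B, D, E all the way down before it backtracks to visit C.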

Dijkstra's Algorithm

Finds the shortest path in a weighted graph with non-negative edge weights using a priority queue.
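
A minimal sketch of Dijkstra's algorithm using Python's `heapq` module as the priority queue. The graph format (a dictionary mapping each node to a list of `(neighbor, weight)` pairs) and the example weights are assumptions for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node); smallest distance pops first
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter route to this node was already found
        for neighbor, weight in graph.get(node, []):
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return dist


graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
distances = dijkstra(graph, "A")
```

The direct edge from A to B costs 4, but the route through C costs only 3, so the algorithm settles on the cheaper one.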

Queue

FIFO (First-In, First-Out) data structure.

Stack

LIFO (Last-In, First-Out) data structure.

Priority Queue

A queue that always removes the item with the highest priority first (in Dijkstra's algorithm, the node with the lowest distance).
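
The removal order of a queue, a stack, and a priority queue can be compared side by side. This sketch uses `collections.deque`, a plain list, and `heapq` from the Python standard library; the items and priority numbers are made up, and a lower number is assumed to mean higher priority.

```python
from collections import deque
import heapq

items = ["a", "b", "c"]

# Queue: first in, first out.
queue = deque(items)
fifo_order = [queue.popleft() for _ in range(len(items))]

# Stack: last in, first out.
stack = list(items)
lifo_order = [stack.pop() for _ in range(len(items))]

# Priority queue: the smallest priority number comes out first,
# regardless of insertion order.
heap = []
for priority, item in [(3, "c"), (1, "a"), (2, "b")]:
    heapq.heappush(heap, (priority, item))
priority_order = [heapq.heappop(heap)[1] for _ in range(3)]
```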

Data Stream

A continuous, real-time, potentially infinite flow of data.

Event-Time

The time an event actually occurred.

Processing-Time

The time the system processes an event.

Watermark

A mechanism that tracks stream progress and helps handle late events.
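
One simple way to picture a watermark is as "the largest event time seen so far, minus an allowed lateness". The sketch below assumes that heuristic; `split_late_events` and `allowed_lateness` are hypothetical names, and real stream processors offer other watermark strategies.

```python
def split_late_events(event_times, allowed_lateness):
    """Classify events, given in arrival order, as on time or late.

    The watermark trails the largest event time seen so far by
    `allowed_lateness`; an event older than the watermark counts as late.
    """
    max_seen = float("-inf")
    on_time, late = [], []
    for event_time in event_times:
        max_seen = max(max_seen, event_time)
        watermark = max_seen - allowed_lateness
        if event_time < watermark:
            late.append(event_time)
        else:
            on_time.append(event_time)
    return on_time, late


# Event times listed in the order the events arrive (out of order on purpose).
on_time, late = split_late_events([1, 2, 5, 3, 9, 4], allowed_lateness=2)
```

The event with time 3 arrives out of order but is still within the allowed lateness, while the event with time 4 arrives after the watermark has already advanced to 7.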

Tumbling Window

A non-overlapping fixed-size window.

Sliding Window

A fixed-size window that overlaps with its neighbors because it advances by an interval smaller than the window size.

Session Window

A window based on user activity that closes after inactivity.
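
The three window types can be sketched as functions that assign timestamps to windows. The assumptions here are integer timestamps, half-open windows `[start, end)`, window starts aligned to multiples of the size or slide, and a session that stays open while the gap between consecutive events is at most `gap`. All function names are hypothetical.

```python
def tumbling_window(ts, size):
    """Return the single non-overlapping window that contains ts."""
    start = (ts // size) * size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Return every overlapping window of length `size` that contains ts."""
    windows = []
    start = (ts // slide) * slide
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps, gap):
    """Group timestamps into sessions separated by more than `gap` of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


tumbling = tumbling_window(7, size=5)
sliding = sliding_windows(7, size=10, slide=5)
sessions = session_windows([1, 2, 3, 10, 11, 30], gap=5)
```

An event belongs to exactly one tumbling window, but to several sliding windows when the slide is smaller than the size.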

Record-by-Record Processing

Processes each event immediately as it arrives.

Micro-Batching

Processes small batches of events at regular intervals.

Stateful Processing

Maintains information across multiple events.

CEP (Complex Event Processing)

Detects meaningful patterns across multiple events.

Lambda Architecture

A three-layer architecture (batch layer, speed layer, serving layer) combining batch and stream processing.

Kappa Architecture

A single stream-processing architecture that replays event logs.

IoT Pipeline

Processes IoT data through collection, preprocessing, processing, and visualization.

IaaS

Infrastructure as a Service; provides virtual hardware while users manage the operating system and applications.

PaaS

Platform as a Service; users deploy applications while the provider manages the platform.

SaaS

Software as a Service; complete software provided over the internet.

FaaS (Serverless)

Function as a Service; executes individual functions on demand without managing servers.

Stateless

Does not retain information between executions.

Ephemeral

Temporary; exists only while running.

Auto-Scaling

Automatically adjusts computing resources based on demand.

Cold Start

Initialization delay when an inactive serverless function runs again.

Storage Disaggregation

Separates compute resources from storage so each can scale independently.

LLM

A Large Language Model trained on massive text datasets to generate human-like text.

Transformer

The neural network architecture used by modern LLMs.

GPT

Generative Pre-trained Transformer; predicts the next token.

Token

A piece of text processed by an LLM.

Embedding

A numerical vector representation of data.

Attention

A mechanism that focuses on relevant surrounding tokens to understand context.

Vector Database

A database that stores and searches embeddings for similarity search.

Semantic Search

Search based on meaning instead of exact keywords.

Cosine Similarity

Measures similarity using the angle between vectors.

Euclidean Distance

Measures straight-line distance between vectors.
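
Cosine similarity and Euclidean distance can be compared on small vectors. This is a plain-Python sketch using only the `math` module; the example vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means the same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between the points a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


same_direction = cosine_similarity([1, 2], [2, 4])
perpendicular = cosine_similarity([1, 0], [0, 1])
distance = euclidean_distance([0, 0], [3, 4])
```

The vectors `[1, 2]` and `[2, 4]` point the same way, so their cosine similarity is 1 (up to floating-point rounding) even though they are not the same point. Cosine similarity ignores vector length, while Euclidean distance does not.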

KNN

K-Nearest Neighbors; exact nearest-neighbor search that compares the query against every vector.
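
A brute-force sketch of exact nearest-neighbor search, assuming Euclidean distance (via `math.dist`, available in Python 3.8+) and a small dictionary of named vectors invented for illustration:

```python
import math


def knn(query, vectors, k):
    """Return the names of the k stored vectors closest to the query.

    Every stored vector is compared with the query, so the result is exact
    but the cost grows linearly with the number of vectors.
    """
    ranked = sorted(vectors.items(), key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


vectors = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (5.0, 5.0)}
nearest = knn((0.9, 0.9), vectors, k=2)
```

ANN methods avoid this full scan by searching an index instead, trading a little accuracy for speed.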

ANN

Approximate Nearest Neighbor search; faster than exact search but slightly less accurate.

HNSW

Hierarchical Navigable Small World; a graph-based Approximate Nearest Neighbor (ANN) search algorithm.

Chroma

An open-source vector database that uses ANN search.

Pinecone

A cloud-native vector database.

RAG (Retrieval-Augmented Generation)

A technique that retrieves relevant documents from a vector database before an LLM generates an answer, improving accuracy and reducing hallucinations.
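
The retrieve-then-generate flow can be sketched without any external services. Everything here is a toy stand-in: `embed` uses bag-of-words counts instead of a real embedding model, a Python list replaces the vector database, and `build_prompt` only assembles the text that would be sent to an LLM (no model is called).

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real system would use a trained model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Put the retrieved context in front of the question for the LLM."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    "HDFS stores large datasets across multiple machines",
    "Cypher is the query language of Neo4j",
    "A tumbling window is a non-overlapping fixed-size window",
]
prompt = build_prompt("What is the query language of Neo4j", documents)
```

Grounding the prompt in retrieved text is what gives the model something concrete to answer from.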

Hallucination

An incorrect or unsupported answer generated by an LLM.