Relational database
A database that stores data in structured tables with rows and columns, using keys to define relationships between tables
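As a rough illustration of tables, keys, and relationships, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table and column names (customers, orders, customer_id) are made up for this example.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Each table has a primary key; orders.customer_id is a foreign key
# that defines the relationship back to customers.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE orders (
           order_id    INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           total       REAL NOT NULL
       )"""
)

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 25.0), (11, 1, 40.0)]
)

# The key relationship lets a JOIN combine rows from both tables.
rows = conn.execute(
    """SELECT c.name, o.order_id, o.total
       FROM customers c
       JOIN orders o ON o.customer_id = c.customer_id
       ORDER BY o.order_id"""
).fetchall()
print(rows)
```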
Non-relational database
A NoSQL database that stores data in formats other than tables — such as documents, key-value pairs, or graphs
.csv
Comma-separated values; a plain text file format for storing tabular data where each value is separated by a comma
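A minimal sketch of reading CSV text with Python's standard csv module. The sample data is invented. Note that a value containing a comma has to be quoted, and that every value is read back as a string.

```python
import csv
import io

# Sample CSV text (hypothetical data). The second row quotes a value
# that itself contains a comma.
raw = 'name,city,age\nAda,London,36\nGrace,"Arlington, VA",45\n'

# DictReader uses the first line as column headers.
records = list(csv.DictReader(io.StringIO(raw)))

for record in records:
    print(record["name"], "|", record["city"], "|", record["age"])
```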
.xlsx
Microsoft Excel spreadsheet file format used for storing structured data with formulas and formatting
.json
JavaScript Object Notation; a lightweight semi-structured file format that stores data in key-value pairs
.txt
A plain text file with no special formatting; stores raw character data
.jpg
JPEG image file; a compressed raster image format commonly used for photographs
.dat
A generic data file used to store raw binary or text data; format depends on the application
Structured data
Data organized in a predefined format of rows and columns — typically stored in relational databases or spreadsheets
Semi-structured data
Data that does not follow a strict table format but has some organizational properties — such as JSON or XML
Unstructured data
Data with no predefined format or structure — such as images, videos, emails, and social media posts
Fact table
The central table in a star schema that stores measurable, quantitative data (metrics/events) about business transactions
Dimensional table (dimension table)
A table in a schema that stores descriptive attributes (dimensions) used to provide context for facts
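To make the fact/dimension split concrete, here is a small sketch using sqlite3. The tables fact_sales and dim_product and their columns are assumptions chosen for illustration: the fact table holds the numeric measures, and the dimension table supplies the descriptive context used for grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension: descriptive attributes
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    -- Fact: measurable values plus a key pointing at the dimension
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity    INTEGER,
        amount      REAL
    );
    INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
    INSERT INTO fact_sales VALUES (1, 1, 2, 2000.0), (2, 1, 1, 1000.0), (3, 2, 4, 800.0);
    """
)

# Aggregate the facts, using a dimension attribute for context.
totals = conn.execute(
    """SELECT d.category, SUM(f.quantity), SUM(f.amount)
       FROM fact_sales f
       JOIN dim_product d ON d.product_key = f.product_key
       GROUP BY d.category
       ORDER BY d.category"""
).fetchall()
print(totals)
```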
Slowly changing dimension
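A dimension whose attribute values change infrequently and irregularly over time, such as a customer's address or job title, so the table needs a rule for whether to overwrite the old value or keep its history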
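One common way to handle such a change is to keep history by closing the old row and adding a new one (often called a Type 2 approach). The sketch below assumes that approach; the table dim_customer, its columns, and the helper change_city are hypothetical names, not something the card itself specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dim_customer (
           customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
           customer_id  INTEGER,                            -- business key
           city         TEXT,
           valid_from   TEXT,
           valid_to     TEXT,                               -- NULL while current
           is_current   INTEGER
       )"""
)
conn.execute(
    "INSERT INTO dim_customer (customer_id, city, valid_from, valid_to, is_current) "
    "VALUES (42, 'Austin', '2020-01-01', NULL, 1)"
)


def change_city(conn, customer_id, new_city, change_date):
    """Hypothetical helper: expire the current row, then insert the new version."""
    conn.execute(
        "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (change_date, customer_id),
    )
    conn.execute(
        "INSERT INTO dim_customer (customer_id, city, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_city, change_date),
    )


change_city(conn, 42, "Denver", "2023-06-15")

# Both versions of the customer remain queryable.
history = conn.execute(
    "SELECT city, valid_from, valid_to, is_current FROM dim_customer "
    "WHERE customer_id = 42 ORDER BY customer_key"
).fetchall()
print(history)
```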
Bridge table
A table used to resolve many-to-many relationships between fact tables and dimension tables
Schema
The logical structure or blueprint that defines how a database is organized, including tables, columns, and relationships
JSON (JavaScript Object Notation)
A semi-structured, human-readable data format using key-value pairs that supports nested structures
Nested structure
A data structure within JSON where objects or arrays are contained inside other objects or arrays
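A short sketch of a nested JSON document parsed with Python's standard json module. The document contents are invented. An object contains another object, which contains an array of objects, and each level is reached by chaining keys and indexes.

```python
import json

# Hypothetical document: object -> object -> array of objects
raw = """
{
  "customer": {
    "name": "Ada",
    "addresses": [
      {"type": "home", "city": "London"},
      {"type": "work", "city": "Cambridge"}
    ]
  }
}
"""

doc = json.loads(raw)

# Walk down the nesting: dict key, dict key, list index, dict key.
work_city = doc["customer"]["addresses"][1]["city"]
print(work_city)
```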
char
A fixed-length string data type that always occupies the defined number of characters, padding shorter values with spaces (e.g., a 2-char state code like "CA")
varchar
A variable-length string data type — uses only the space needed up to a defined maximum length
nvarchar
A variable-length Unicode string type — supports characters from multiple languages and character sets
Null value
Represents the complete absence of a value — not zero, not empty string — truly unknown or missing data
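The difference between NULL and zero shows up clearly in SQL. This sketch uses sqlite3 with a made-up readings table: comparing NULL with = yields NULL rather than true or false, aggregates skip NULLs, and IS NULL is the way to find missing values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
# Python's None is stored as SQL NULL.
conn.executemany("INSERT INTO readings (value) VALUES (?)", [(0,), (None,), (5,)])

# NULL = 0 is neither true nor false; the result is itself NULL (None in Python).
null_equals_zero = conn.execute("SELECT NULL = 0").fetchone()[0]

# COUNT(*) counts rows; COUNT(value) and AVG(value) ignore NULLs.
count_all, count_values, avg_value = conn.execute(
    "SELECT COUNT(*), COUNT(value), AVG(value) FROM readings"
).fetchone()

# Missing values are found with IS NULL, not with = NULL.
missing = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value IS NULL"
).fetchone()[0]

print(null_equals_zero, count_all, count_values, avg_value, missing)
```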
Spatial data type
A data type used to store geometric or geographic data such as coordinates, shapes, and map locations
Boolean data type
A data type that holds only two values, TRUE or FALSE (a nullable Boolean column in a database can additionally be NULL)
Integer
A whole number data type with no decimal places (e.g., 1, 42, -7)
Decimal
An exact numeric data type with a fixed number of decimal places — commonly used for financial data
Float
An approximate numeric data type that stores numbers with decimal points using floating-point representation
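The exact-versus-approximate distinction between Decimal and Float can be seen in Python, whose float is a binary floating-point type and whose decimal.Decimal is an exact decimal type. This is only an analogy to database column types, but the behavior is the same in kind.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is only approximately 0.3.
float_sum = 0.1 + 0.2
float_is_exact = float_sum == 0.3

# Decimal values built from strings are stored exactly, which is
# why exact types are preferred for money.
decimal_sum = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_sum == Decimal("0.30")

print(float_sum, float_is_exact)
print(decimal_sum, decimal_is_exact)
```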
Timestamp
A datetime value that records a specific moment in time — often includes date, time, and timezone
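A brief sketch of a timezone-aware timestamp using Python's standard datetime module. The date and the UTC-5 offset are arbitrary example values. The same moment can be displayed in different offsets without changing which instant it refers to.

```python
from datetime import datetime, timedelta, timezone

# A specific moment, recorded in UTC (example value).
ts = datetime(2024, 3, 15, 14, 30, 0, tzinfo=timezone.utc)
iso = ts.isoformat()

# The same moment shown in a fixed UTC-5 offset.
local = ts.astimezone(timezone(timedelta(hours=-5)))
local_iso = local.isoformat()

print(iso)
print(local_iso)
print(ts == local)  # aware datetimes compare by the instant they represent
```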
BLOB (Binary Large Object)
A data type for storing large binary data — such as images, audio, video, or documents — in a database
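As a small illustration, SQLite accepts Python bytes in a BLOB column and returns them unchanged. The files table and the six-byte payload below are placeholders; a real BLOB would typically hold the contents of an image or document file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB)")

# Placeholder binary payload standing in for real file contents.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
conn.execute("INSERT INTO files VALUES (?, ?)", ("sample.bin", payload))

# The binary data comes back byte-for-byte as a bytes object.
stored = conn.execute(
    "SELECT content FROM files WHERE name = ?", ("sample.bin",)
).fetchone()[0]

print(type(stored).__name__, len(stored))
```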
CLOB (Character Large Object)
A data type for storing large amounts of text or character data in a database
GUID / UUID
A 128-bit identifier that is globally or universally unique — used to identify records across different systems
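A minimal sketch with Python's standard uuid module, generating a random (version 4) UUID. Random generation is just one of several UUID versions and is assumed here for simplicity.

```python
import uuid

# Generate a random (version 4) UUID.
record_id = uuid.uuid4()

# Canonical text form: 32 hex digits in five groups separated by hyphens.
text_form = str(record_id)

# The underlying value is 16 bytes, i.e. 128 bits.
bit_length = len(record_id.bytes) * 8

# The text form round-trips back to the same identifier.
parsed = uuid.UUID(text_form)

print(text_form, bit_length, parsed == record_id)
```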
API (Application Programming Interface)
A set of rules that allows different software systems to communicate, request, and share data with each other
Data lake
A centralized repository that stores large volumes of raw, unprocessed data in its native format — structured or unstructured
Data lakehouse
A hybrid architecture combining the raw storage scalability of a data lake with the structure and query performance of a warehouse
Data mart
A subset of a data warehouse focused on a specific business unit or department — such as sales or finance
Data silo
Isolated data stored separately from the rest of an organization — inaccessible to other teams or systems
Data warehouse
A centralized repository of structured, historical data optimized for querying, reporting, and business intelligence
AWS (Amazon Web Services)
A leading public cloud provider offering computing, storage, databases, and analytics services
Azure
Microsoft's cloud computing platform providing infrastructure, AI, analytics, and database services
Private cloud
Cloud infrastructure operated solely for one organization — either on-premises or hosted exclusively by a third party
Public cloud
Cloud infrastructure owned by a third-party provider and shared across multiple organizations
Hybrid cloud
A mixed cloud environment that combines private and public cloud resources
Object storage
Stores data as discrete objects with metadata — ideal for unstructured data at scale (e.g., AWS S3)
File storage
Organizes data in a hierarchical folder/file structure — common in NAS (Network-attached Storage) systems
Block storage
Splits data into fixed-size blocks — used for databases and high-performance applications (e.g., SAN)
Local storage
Storage physically connected to or located on a single machine or device
Shared storage
Storage accessible by multiple users or systems simultaneously across a network
Containerization
Packaging an application and all its dependencies into an isolated container for consistent deployment across environments
IDE (Integrated Development Environment)
A software application combining a code editor, debugger, and build tools in one interface
RStudio
An IDE designed specifically for R programming — used for statistical computing and data analysis
VS Code (Visual Studio Code)
A lightweight, extensible code editor by Microsoft that works as an IDE for Python, R, and many other languages through extensions
Tableau
A business intelligence tool used to create interactive dashboards and data visualizations
Power BI
Microsoft's BI tool for building reports and dashboards from multiple data sources
Looker
A Google-owned BI and data exploration platform for creating visualizations and data-driven applications
Anaconda
A Python/R distribution that includes Jupyter Notebook and common data science libraries pre-installed
pandas
A Python library for data manipulation and analysis — provides DataFrame structures for working with tabular data
tidyverse
A collection of R packages (including ggplot2 and dplyr) designed for data science workflows
SAS
A statistical software suite and programming language used for advanced analytics and data management
R
An open-source programming language designed for statistical computing and data visualization
Python
A versatile general-purpose programming language widely used in data analysis, automation, and machine learning
Scala
A programming language often used with Apache Spark for large-scale distributed data processing
SQL Server Management Studio (SSMS)
A Microsoft tool for managing and querying SQL Server databases
MySQL Workbench
A visual GUI tool for designing, managing, and querying MySQL databases
MongoDB Compass
A GUI tool for MongoDB used to explore, query, and visualize NoSQL document data
DBeaver
A free, universal database management tool that supports multiple database types
Toad
A database management tool used primarily with Oracle, with editions for other relational (SQL) databases
Azure Data Studio
A cross-platform database tool by Microsoft for SQL Server, Azure SQL, and other data platforms
Generative AI
AI that can generate new content — such as text, images, or code — based on patterns learned from training data
LLM (Large Language Model)
A type of generative AI model trained on massive text datasets to understand and generate human language
Foundational model (foundation model)
A large-scale AI model trained on broad data that can be fine-tuned and adapted to a wide variety of tasks
Deep learning
A subset of machine learning using multi-layered neural networks to recognize complex patterns in large datasets
NLP (Natural Language Processing)
A branch of AI that enables computers to understand, interpret, and generate human language
RPA (Robotic Process Automation)
Technology that uses software robots to automate repetitive, rule-based business tasks such as automated reporting