Data Concepts (20%) - Domain #1

Last updated 1:27 AM on 6/23/26

74 Terms

Relational database

A database that stores data in structured tables with rows and columns, using keys to define relationships between tables
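
A minimal sketch of this idea using Python's built-in sqlite3 module. The customers and orders tables and their columns are hypothetical, chosen only to show a primary key, a foreign key, and a join that follows the relationship.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total REAL NOT NULL
    )
    """
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 25.0), (11, 1, 40.0)])

# The key relationship lets a query combine rows from both tables.
rows = conn.execute(
    """
    SELECT c.name, o.order_id, o.total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()
print(rows)

# An order that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (12, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```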

Non-relational database

A NoSQL database that stores data in formats other than tables — such as documents, key-value pairs, or graphs

.csv

Comma-separated values; a plain text file format for storing tabular data where each value is separated by a comma
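
As a rough illustration, the sketch below parses a small CSV text with Python's standard csv module. The sample rows are made up; note that every parsed value comes back as a string, and that a value containing a comma has to be quoted.

```python
import csv
import io

# Hypothetical CSV content; the first line is the header row.
text = (
    "id,name,city\n"
    "1,Ada,London\n"
    "2,Linus,Helsinki\n"
    '3,"Hopper, Grace",Arlington\n'  # quotes protect the comma inside the value
)

# DictReader maps each data row to the column names from the header.
records = list(csv.DictReader(io.StringIO(text)))

for record in records:
    print(record["id"], record["name"], record["city"])
```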

.xlsx

Microsoft Excel spreadsheet file format used for storing structured data with formulas and formatting

.json

JavaScript Object Notation; a lightweight semi-structured file format that stores data in key-value pairs

.txt

A plain text file with no special formatting; stores raw character data

.jpg

JPEG image file; a compressed raster image format commonly used for photographs

.dat

A generic data file used to store raw binary or text data; format depends on the application

Structured data

Data organized in a predefined format of rows and columns — typically stored in relational databases or spreadsheets

Semi-structured data

Data that does not follow a strict table format but has some organizational properties — such as JSON or XML

Unstructured data

Data with no predefined format or structure — such as images, videos, emails, and social media posts

Fact table

The central table in a star schema that stores measurable, quantitative data (metrics or events) about business transactions, along with foreign keys to the surrounding dimension tables

Dimensional table

A table in a star or snowflake schema that stores descriptive attributes (dimensions), such as product, customer, or date, used to provide context for facts; also called a dimension table
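
To make the fact/dimension split concrete, here is a small sketch of a star-schema style query with sqlite3. The fact_sales and dim_product tables, their columns, and the sample rows are all hypothetical; the point is only that measures live in the fact table while the descriptive category comes from the dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension: descriptive attributes (illustrative names)
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        category TEXT
    );
    -- Fact: numeric measures plus a key pointing at the dimension
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        amount REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Hardware'),
        (2, 'Mouse', 'Hardware'),
        (3, 'Antivirus', 'Software');
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 900.0),
        (2, 2, 3, 60.0),
        (3, 3, 2, 80.0),
        (4, 1, 1, 950.0);
    """
)

# Aggregate the measure from the fact table, grouped by a dimension attribute.
totals = conn.execute(
    """
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY d.category
    """
).fetchall()
print(totals)  # [('Hardware', 1910.0), ('Software', 80.0)]
```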

Slowly changing dimension

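A dimension whose attribute values change infrequently and unpredictably over time, such as a customer's address or job title; SCD types (for example, Type 1 overwrites the old value, Type 2 adds a new row) determine how much history is kept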
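
One common way to handle such a change is usually called a Type 2 slowly changing dimension: the current row is closed and a new row is added, so history is preserved. The sketch below is a plain-Python illustration of that idea; the row layout (valid_from, valid_to, is_current) and the function name are assumptions, not a standard API.

```python
from datetime import date

# Hypothetical customer dimension with one current row.
dim_customer = [
    {
        "customer_id": 42,
        "city": "Austin",
        "valid_from": date(2020, 1, 1),
        "valid_to": None,
        "is_current": True,
    },
]


def apply_type2_change(rows, customer_id, new_city, change_date):
    """Close the customer's current row and append a new current row."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            if row["city"] == new_city:
                return rows  # attribute did not change; keep history as-is
            row["valid_to"] = change_date
            row["is_current"] = False
    rows.append(
        {
            "customer_id": customer_id,
            "city": new_city,
            "valid_from": change_date,
            "valid_to": None,
            "is_current": True,
        }
    )
    return rows


apply_type2_change(dim_customer, 42, "Denver", date(2024, 6, 1))
for row in dim_customer:
    print(row["city"], row["valid_from"], row["valid_to"], row["is_current"])
```

A Type 1 change, by contrast, would simply overwrite the city on the existing row and keep no history.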

Bridge table

A table used to resolve many-to-many relationships between fact tables and dimension tables
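
A small sketch of a bridge table, assuming a hypothetical banking model in which one account can have several owners and one customer can own several accounts. All table and column names here are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_account (account_key INTEGER PRIMARY KEY, balance REAL);
    -- Bridge: one row per (account, customer) pairing
    CREATE TABLE bridge_account_customer (
        account_key INTEGER REFERENCES fact_account(account_key),
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        PRIMARY KEY (account_key, customer_key)
    );
    INSERT INTO dim_customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO fact_account VALUES (100, 500.0), (101, 75.0);
    INSERT INTO bridge_account_customer VALUES (100, 1), (100, 2), (101, 1);
    """
)

# Joining through the bridge lists every owner of every account.
owners = conn.execute(
    """
    SELECT f.account_key, d.name
    FROM fact_account AS f
    JOIN bridge_account_customer AS b ON b.account_key = f.account_key
    JOIN dim_customer AS d ON d.customer_key = b.customer_key
    ORDER BY f.account_key, d.name
    """
).fetchall()
print(owners)  # [(100, 'Ada'), (100, 'Grace'), (101, 'Ada')]
```

Because a joint account appears once per owner after the join, summing balances across this result would double count account 100; that is a typical thing to watch for when querying through a bridge.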

Schema

The logical structure or blueprint that defines how a database is organized, including tables, columns, and relationships
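
As an illustration, a database can usually report its own schema. The sketch below creates a hypothetical employees table in SQLite and reads back its column names and declared types with the PRAGMA table_info statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        hired_on TEXT
    )
    """
)

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employees)")]
print(columns)  # [('employee_id', 'INTEGER'), ('name', 'TEXT'), ('hired_on', 'TEXT')]
```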

JSON (JavaScript Object Notation)

A semi-structured, human-readable data format using key-value pairs that supports nested structures

Nested structure

A data structure within JSON where objects or arrays are contained inside other objects or arrays
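
For example, the hypothetical order document below nests an object inside an object and holds an array of objects. Python's standard json module parses it into dictionaries and lists, which are then reached one level at a time.

```python
import json

# Made-up JSON document with nested objects and an array of objects.
doc = """
{
  "order_id": 10,
  "customer": {"name": "Ada", "address": {"city": "London"}},
  "items": [
    {"sku": "A1", "qty": 2},
    {"sku": "B7", "qty": 1}
  ]
}
"""

order = json.loads(doc)

# Walk into the nested object: customer -> address -> city.
city = order["customer"]["address"]["city"]

# Iterate over the nested array of item objects.
total_qty = sum(item["qty"] for item in order["items"])

print(city, total_qty)  # London 3
```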

char

A fixed-length string data type that always occupies the defined number of characters, with shorter values padded (typically with trailing spaces); for example, a 2-character state code such as "CA"

varchar

A variable-length string data type — uses only the space needed up to a defined maximum length

nvarchar

A variable-length Unicode string type — supports characters from multiple languages and character sets

Null value

Represents the complete absence of a value — not zero, not empty string — truly unknown or missing data
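
The difference between NULL and zero shows up in queries. In this sketch (the readings table and its data are invented), one row stores 0.0 and another stores NULL: aggregates skip the NULL, and it can only be found with IS NULL, not with an equals comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
# Python's None is stored as SQL NULL; 0.0 is a real, known value.
conn.executemany(
    "INSERT INTO readings (id, value) VALUES (?, ?)",
    [(1, 10.0), (2, 0.0), (3, None)],
)

# COUNT(*) counts rows; COUNT(value) and AVG(value) ignore NULLs.
count_all, count_values, average = conn.execute(
    "SELECT COUNT(*), COUNT(value), AVG(value) FROM readings"
).fetchone()
print(count_all, count_values, average)  # 3 2 5.0

# IS NULL finds the missing value; "= NULL" never matches anything.
null_ids = [row[0] for row in conn.execute("SELECT id FROM readings WHERE value IS NULL")]
equals_null = conn.execute("SELECT id FROM readings WHERE value = NULL").fetchall()
print(null_ids, equals_null)  # [3] []
```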

Spatial data type

A data type used to store geometric or geographic data such as coordinates, shapes, and map locations

Boolean data type

A data type that holds only two values — TRUE or FALSE

Integer

A whole number data type with no decimal places (e.g., 1, 42, -7)

Decimal

An exact numeric data type with a fixed number of decimal places — commonly used for financial data

Float

An approximate numeric data type that stores numbers with decimal points using floating-point representation
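
The practical difference between approximate and exact numeric types can be seen with Python's float and its standard decimal module, used here as stand-ins for the database types. The amounts are arbitrary.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2
print(float_sum)         # 0.30000000000000004
print(float_sum == 0.3)  # False

# Decimal keeps exact base-10 digits, which is why it suits money.
decimal_sum = Decimal("0.10") + Decimal("0.20")
print(decimal_sum)                     # 0.30
print(decimal_sum == Decimal("0.30"))  # True
```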

Timestamp

A datetime value that records a specific moment in time — often includes date, time, and timezone
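
A short sketch with Python's datetime module, using an arbitrary example moment. It shows the same instant written with two different UTC offsets; the fixed offset of minus five hours is only an illustration and ignores daylight saving rules.

```python
from datetime import datetime, timedelta, timezone

# A timezone-aware timestamp in UTC (the date and time are arbitrary).
moment = datetime(2024, 3, 15, 14, 30, 0, tzinfo=timezone.utc)
iso_text = moment.isoformat()
print(iso_text)  # 2024-03-15T14:30:00+00:00

# The same instant expressed with a fixed UTC-05:00 offset.
local = moment.astimezone(timezone(timedelta(hours=-5)))
print(local.isoformat())  # 2024-03-15T09:30:00-05:00

# Aware datetimes compare by instant, not by wall-clock digits.
print(local == moment)  # True
```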

BLOB (Binary Large Object)

A data type for storing large binary data — such as images, audio, video, or documents — in a database
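
A minimal sketch of storing and reading back binary data in a BLOB column with sqlite3. The attachments table, the filename, and the six placeholder bytes (standing in for real file contents) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments (id INTEGER PRIMARY KEY, filename TEXT, content BLOB)"
)

# Arbitrary bytes standing in for the contents of a binary file.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
conn.execute(
    "INSERT INTO attachments (filename, content) VALUES (?, ?)",
    ("logo.png", payload),
)

# The BLOB comes back as a Python bytes object, byte for byte.
stored = conn.execute(
    "SELECT content FROM attachments WHERE filename = ?", ("logo.png",)
).fetchone()[0]
print(type(stored).__name__, len(stored))  # bytes 6
```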

CLOB (Character Large Object)

A data type for storing large amounts of text or character data in a database

GUID / UUID

A 128-bit identifier that is globally or universally unique — used to identify records across different systems
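
For illustration, Python's standard uuid module can generate such an identifier. The sketch below creates a random (version 4) UUID and checks its size and its usual 36-character text form.

```python
import uuid

# Random version-4 UUID; the value differs on every run.
record_id = uuid.uuid4()

text_form = str(record_id)           # 32 hex digits plus 4 hyphens
bit_length = len(record_id.bytes) * 8

print(text_form)
print(bit_length)      # 128
print(len(text_form))  # 36

# The text form can be parsed back into the same identifier.
round_trip = uuid.UUID(text_form)
print(round_trip == record_id)  # True
```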

API (Application Programming Interface)

A set of rules that allows different software systems to communicate, request, and share data with each other

Data lake

A centralized repository that stores large volumes of raw, unprocessed data in its native format, whether structured, semi-structured, or unstructured

Data lakehouse

A hybrid architecture combining the raw storage scalability of a data lake with the structure and query performance of a warehouse

Data mart

A subset of a data warehouse focused on a specific business unit or department — such as sales or finance

Data silo

Isolated data stored separately from the rest of an organization — inaccessible to other teams or systems

Data warehouse

A centralized repository of structured, historical data optimized for querying, reporting, and business intelligence

AWS (Amazon Web Services)

A leading public cloud provider offering computing, storage, databases, and analytics services

Azure

Microsoft's cloud computing platform providing infrastructure, AI, analytics, and database services

Private cloud

Cloud infrastructure operated solely for one organization — either on-premises or hosted exclusively by a third party

Public cloud

Cloud infrastructure owned by a third-party provider and shared across multiple organizations

Hybrid cloud

A mixed cloud environment that combines private and public cloud resources

Object storage

Stores data as discrete objects, each with its own metadata and a unique identifier; ideal for unstructured data at scale (e.g., Amazon S3)

File storage

Organizes data in a hierarchical folder/file structure — common in NAS (Network-attached Storage) systems

Block storage

Splits data into fixed-size blocks — used for databases and high-performance applications (e.g., SAN)

Local storage

Storage physically connected to or located on a single machine or device

Shared storage

Storage accessible by multiple users or systems simultaneously across a network

Containerization

Packaging an application and all its dependencies into an isolated container for consistent deployment across environments

IDE (Integrated Development Environment)

A software application combining a code editor, debugger, and build tools in one interface

RStudio

An IDE designed specifically for R programming — used for statistical computing and data analysis

VS Code (Visual Studio Code)

A lightweight, extensible source-code editor by Microsoft, commonly used as an IDE through extensions, that supports Python, R, and many other languages

Tableau

A business intelligence tool used to create interactive dashboards and data visualizations

Power BI

Microsoft's BI tool for building reports and dashboards from multiple data sources

Looker

A Google-owned BI and data exploration platform for creating visualizations and data-driven applications

Anaconda

A Python/R distribution that includes Jupyter Notebook and common data science libraries pre-installed

pandas

A Python library for data manipulation and analysis — provides DataFrame structures for working with tabular data

tidyverse

A collection of R packages (including ggplot2 and dplyr) designed for data science workflows

SAS

A statistical software suite and programming language used for advanced analytics and data management

R

An open-source programming language designed for statistical computing and data visualization

Python

A versatile general-purpose programming language widely used in data analysis, automation, and machine learning

Scala

A programming language often used with Apache Spark for large-scale distributed data processing

SQL Server Management Studio (SSMS)

A Microsoft tool for managing and querying SQL Server databases

MySQL Workbench

A visual GUI tool for designing, managing, and querying MySQL databases

MongoDB Compass

A GUI tool for MongoDB used to explore, query, and visualize NoSQL document data

DBeaver

A free, universal database management tool that supports multiple database types

Toad

A database management and development tool from Quest Software, best known for Oracle and also available for other relational databases such as SQL Server and MySQL

Azure Data Studio

A cross-platform database tool by Microsoft for SQL Server, Azure SQL, and other data platforms

Generative AI

AI that can generate new content — such as text, images, or code — based on patterns learned from training data

LLM (Large Language Model)

A type of generative AI model trained on massive text datasets to understand and generate human language

Foundational model (foundation model)

A large-scale AI model trained on broad data that can be fine-tuned and adapted to a wide variety of tasks

Deep learning

A subset of machine learning using multi-layered neural networks to recognize complex patterns in large datasets

NLP (Natural Language Processing)

A branch of AI that enables computers to understand, interpret, and generate human language

RPA (Robotic Process Automation)

Technology that uses software robots to automate repetitive, rule-based business tasks such as automated reporting
