DP-900

63 Terms

1

Data

a collection of facts such as numbers, descriptions, and observations used to record information

2

Structured Data

data that adheres to a fixed schema with the same fields or properties, often stored in databases using a relational model

3

Semi-Structured Data

information with some structure but allows variation, commonly represented in formats like JSON

4

Unstructured Data

data without a specific structure, including documents, images, audio, and video files

5

File Storage

the ability to store data in files, often done in local systems or shared file storage systems in the cloud

6

Delimited Text Files

data stored in plain text format with specific field delimiters, common formats include CSV, TSV, and space-delimited
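
As a minimal sketch of how a delimited file is read, the following parses a small CSV sample with Python's standard csv module. The sample rows, field names, and the helper name parse_delimited are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: a header row followed by comma-delimited records.
SAMPLE_CSV = (
    "FirstName,LastName,Email\n"
    "Joe,Jones,joe@example.com\n"
    "Samir,Nadoy,samir@example.com\n"
)

def parse_delimited(text, delimiter=","):
    """Return each record as a dict keyed by the header row's field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]

records = parse_delimited(SAMPLE_CSV)
```

Passing delimiter="\t" to the same helper reads tab-separated (TSV) text.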

7

JSON

a format with a hierarchical document schema used to define data entities, flexible for structured and semi-structured data
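
To illustrate the flexibility described above, here is a small sketch in which two JSON entities share a hierarchy but not an identical set of fields. The customer data and the helper phone_of are hypothetical.

```python
import json

# Hypothetical documents: the second customer has a field the first lacks.
SAMPLE_JSON = """
[
  {"name": "Joe", "contact": {"email": "joe@example.com"}},
  {"name": "Samir", "contact": {"email": "samir@example.com", "phone": "555-0100"}}
]
"""

customers = json.loads(SAMPLE_JSON)

def phone_of(customer):
    """Return the nested phone number, or None when the field is absent."""
    return customer.get("contact", {}).get("phone")
```

Because fields may be missing, code that reads semi-structured data usually has to tolerate absent keys, as phone_of does here.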

8

XML

a human-readable data format using tags enclosed in angle brackets to define elements and attributes
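
A short sketch of elements and attributes, parsed with Python's standard xml.etree.ElementTree module. The Customer document and the helper contact_email are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one Customer element with a nested ContactDetails element.
SAMPLE_XML = (
    '<Customer name="Joe">'
    '<ContactDetails email="joe@example.com"/>'
    '</Customer>'
)

root = ET.fromstring(SAMPLE_XML)

def contact_email(customer):
    """Return the email attribute of the nested ContactDetails element, if any."""
    details = customer.find("ContactDetails")
    return None if details is None else details.get("email")
```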

9

BLOB

binary data stored in formats that must be interpreted by applications, common for images, video, audio, and binary files

10

OLTP

Online Transaction Processing, systems optimized for mixed read and write operations to support transactional workloads

11

OLAP

Online Analytical Processing, systems optimized for analytical workloads, aggregating data for reporting and visualization

12

Data Lakes

used in large-scale data analytical processing scenarios to collect and analyze file-based data

13

Data Warehouses

store data in a relational schema optimized for read operations to support reporting and data visualization

14

Data Lakehouses

combine the storage of a data lake with the querying semantics of a data warehouse; table schemas may require some denormalization of the source data

15

Database Administrators

manage databases, ensure availability, performance, security, and backup and recovery plans

16

Data Engineers

manage data integration, cleansing, transformation, and pipelines across the organization

17

Data Analysts

explore and analyze data to create visualizations and insights for informed decision-making

18

Azure SQL

a family of relational database solutions on Microsoft Azure, including Azure SQL Database, Managed Instance, and SQL VM

19

SQL Server on Azure Virtual Machines

A member of the Azure SQL family of SQL Server-based database services: SQL Server running on a virtual machine in Azure, suitable for migrations and applications requiring access to operating system features.

20

Relational Database

Models collections of entities as tables, each row representing an instance of an entity, and each column storing data of a specific datatype.

21

Normalization

A schema design process that minimizes data duplication, enforces data integrity, and involves separating entities into tables, attributes into columns, and using primary and foreign keys.
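
The idea can be sketched with Python's built-in sqlite3 module: customer details live in one table, orders in another, and a foreign key links them. The table names, columns, and rows are hypothetical, and SQLite stands in here for any relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL
);
CREATE TABLE SalesOrder (
    OrderID INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
    Amount REAL NOT NULL
);
INSERT INTO Customer VALUES (1, 'Joe Jones');
INSERT INTO SalesOrder VALUES (100, 1, 25.0), (101, 1, 40.0);
""")

def order_totals(connection):
    """Join the two tables to total order amounts per customer name."""
    rows = connection.execute("""
        SELECT c.Name, SUM(o.Amount)
        FROM Customer AS c
        JOIN SalesOrder AS o ON o.CustomerID = c.CustomerID
        GROUP BY c.Name
    """)
    return rows.fetchall()
```

The customer's name is stored once, however many orders refer to it, and the foreign key rejects an order that points at a customer that does not exist.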

22

SQL

Structured Query Language, used to communicate with relational databases, with common statements like SELECT, INSERT, UPDATE, and DELETE, and dialects like T-SQL, pgSQL, and PL/SQL.

23

Views

Virtual tables based on the results of a SELECT query, which can themselves be queried and filtered much like tables.
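
A minimal sketch using Python's built-in sqlite3 module: a view is defined over a SELECT query and then filtered like an ordinary table. The Product table, the PremiumProduct view, and the price threshold are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT, Price REAL);
INSERT INTO Product VALUES (1, 'Widget', 2.5), (2, 'Gadget', 12.0), (3, 'Gizmo', 30.0);
-- The view stores the query, not a copy of the data.
CREATE VIEW PremiumProduct AS
    SELECT ProductID, Name, Price FROM Product WHERE Price >= 10;
""")

def premium_names_under(connection, max_price):
    """Filter the view with a WHERE clause, exactly as one would a table."""
    rows = connection.execute(
        "SELECT Name FROM PremiumProduct WHERE Price < ? ORDER BY Name",
        (max_price,),
    )
    return [name for (name,) in rows]
```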

24

Stored Procedures

Define SQL statements that can be run on command, encapsulating programmatic logic in the database for actions that applications need to perform when working with data.

25

Indexes

Structures that help search for data in tables by containing sorted data copies with pointers to corresponding rows, enabling quicker data retrieval.

26

Virtual Machine (VM)

In the Azure SQL context, SQL Server on an Azure virtual machine: lets you develop and test traditional applications with full administrative rights over the DBMS and operating system, suitable for organizations with existing IT resources.

27

Azure SQL Managed Instance

A platform-as-a-service (PaaS) option providing near-100% compatibility with on-premises SQL Server instances, automating software updates, backups, and maintenance tasks.

28

Azure SQL Database

A fully managed, highly scalable PaaS database service designed for the cloud, available as Single Database or Elastic Pool options.

29

MySQL

An open-source DBMS; Azure Database for MySQL is a PaaS implementation in Azure with high availability, scalability, and automatic backups.

30

MariaDB

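An open-source DBMS created by the original developers of MySQL; Azure Database for MariaDB is a fully managed implementation controlled by Azure, offering high availability, predictable performance, and secure data storage.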

31

PostgreSQL

A hybrid object-relational database, supported in Azure with the Flexible Server deployment option for high availability and server configuration customizations.

32

Azure Blob Storage

Enables storing unstructured data as blobs in the cloud, with containers for grouping related blobs and three types of blobs: Block Blobs, Page Blobs, and Append Blobs.

33

Access Tiers

Hot Tier for frequently accessed blobs, Cool Tier for infrequently accessed data, and Archive Tier for historical data with low storage cost.

34

Lifecycle Management Policy

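Automatically moves blobs between access tiers (for example, from Hot to Cool and then to Archive) based on age, such as the number of days since last modification, to optimize storage costs and performance.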
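
The age-based rule can be sketched as a plain Python function. This is only a simulation of the decision, not the Azure policy format, and the 30-day and 180-day thresholds are assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical thresholds; a real policy defines its own rules.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180

def target_tier(last_modified, today):
    """Pick an access tier from the number of days since the blob was modified."""
    age_days = (today - last_modified).days
    if age_days >= ARCHIVE_AFTER_DAYS:
        return "Archive"
    if age_days >= COOL_AFTER_DAYS:
        return "Cool"
    return "Hot"
```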

35

Azure Data Lake Store (Gen1)

A service for hierarchical data storage for analytical data lakes, used by big data analytical solutions for structured, semi-structured, and unstructured data

36

Azure Data Lake Storage Gen2

An integrated service in Azure Storage combining scalability of blob storage, cost-control of storage tiers, hierarchical file system capabilities, and compatibility with major analytics systems

37

Azure Files

Cloud-based network shares for storing and sharing files in Azure, eliminating hardware costs, providing high availability, and scalable cloud storage

38

Azure Table Storage

NoSQL storage solution using tables with key/value data items, each represented by a row with columns for data fields, enabling storage of semi-structured data

39

Partitioning

Mechanism in Azure Table Storage for grouping related rows based on a common property or partition key to improve scalability, performance, and data organization
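
A toy in-memory model of the idea, assuming each row is addressed by a partition key plus a row key. The class TableSketch and its methods are hypothetical and are not the Azure SDK; they only show how grouping by partition key narrows a lookup.

```python
class TableSketch:
    """Rows grouped by partition key, then addressed by row key."""

    def __init__(self):
        self._partitions = {}

    def upsert(self, partition_key, row_key, **properties):
        # Rows in the same table may carry different property sets.
        self._partitions.setdefault(partition_key, {})[row_key] = properties

    def get(self, partition_key, row_key):
        # A point lookup touches only one partition.
        return self._partitions.get(partition_key, {}).get(row_key)

    def query_partition(self, partition_key):
        # Return the partition's rows ordered by row key.
        return sorted(self._partitions.get(partition_key, {}).items())

table = TableSketch()
table.upsert("UK", "002", name="Amira", email="amira@example.com")
table.upsert("UK", "001", name="Joe")
table.upsert("US", "001", name="Samir", phone="555-0100")
```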

40

Azure Cosmos DB

Highly scalable DBMS supporting multiple APIs for relational and non-relational workloads, providing fast read and write performance and enabling multi-region writes

41

Data Warehousing Architecture

Involves data ingestion and processing, an analytical data store, an analytical data model, and data visualization for large-scale data analytics

42

Data Ingestion Pipelines

Orchestrate ETL processes for large-scale data ingestion, can be created and run using Azure Data Factory, Azure Synapse Analytics, or Microsoft Fabric

43

Analytical Data Stores

Include data warehouses (relational databases optimized for data analytics) and file-system-based data lakes for large-scale analytics.

44

Star Schema

A schema where numeric values from a transactional store are stored in central fact tables related to dimension tables, forming a star-like structure for data aggregation.
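
A minimal star schema sketched with Python's built-in sqlite3 module: one fact table of revenue values keyed to two dimension tables. All table names, keys, and figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);
-- The fact table holds the numeric measure plus a key into each dimension.
CREATE TABLE FactSales (
    ProductKey INTEGER REFERENCES DimProduct(ProductKey),
    DateKey INTEGER REFERENCES DimDate(DateKey),
    Revenue REAL
);
INSERT INTO DimProduct VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO DimDate VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO FactSales VALUES
    (1, 20240101, 500.0), (1, 20240201, 300.0), (2, 20240201, 50.0);
""")

def revenue_by_category(connection):
    """Aggregate the fact table's measure by an attribute of a dimension."""
    rows = connection.execute("""
        SELECT p.Category, SUM(f.Revenue)
        FROM FactSales AS f
        JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
        GROUP BY p.Category
        ORDER BY p.Category
    """)
    return rows.fetchall()
```

Swapping DimProduct for DimDate in the join would aggregate the same facts by year or month instead.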

45

Snowflake Schema

An extension of a star schema where additional tables are added to represent dimensional hierarchies related to the dimension tables.

46

Data Lakes

A file store on a distributed file system for high-performance data access, supporting structured, semi-structured, and unstructured data for analysis without strict schema enforcement.

47

SQL Pools

In Azure Synapse Analytics, SQL pools include PolyBase, which lets you define external tables based on files in a data lake and query them using SQL.

48

Batch Processing

Processing method where data records are collected and processed together in a single operation, suitable for handling large datasets efficiently.

49

Stream Processing

Real-time data processing method where data is processed as individual units as they arrive, ideal for time-critical operations requiring instant responses.
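
The contrast with batch processing can be sketched in a few lines of Python. Both functions are illustrative assumptions: the batch version needs the whole collection before it produces one result, while the streaming version emits an updated result after every record.

```python
def batch_total(records):
    """Batch: wait for the full collection, then process it in one operation."""
    return sum(records)

def stream_totals(records):
    """Stream: update and emit the running total as each record arrives."""
    running = 0
    for record in records:
        running += record
        yield running
```

For the inputs 3, 4, 5 the batch call yields a single total, whereas the stream yields an intermediate total after each value.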

50

Azure Synapse Analytics

A PaaS solution for large-scale data analytics, combining the data integrity and reliability of a SQL Server-based relational data warehouse with the flexibility of a data lake and Apache Spark.

51

Azure Databricks

Azure's implementation of the Databricks platform, built on Apache Spark for data analytics and data science with workload-optimized Spark clusters.

52

Microsoft Fabric

A unified Software-as-a-Service (SaaS) offering with OneLake architecture for scalable analytics, providing a single environment for data collaboration.

53

Stream Processing Architecture

Involves event data generation, capture, processing, and output, with technologies like Azure Stream Analytics, Spark Structured Streaming, and Azure Data Explorer.

54

Delta Lake

An open-source storage layer that adds support for transactional consistency, schema enforcement, and other data warehousing features to data lake storage.

55

Real-time Analytics

Utilizing streaming data in Spark-based data lakes or analytical data stores for immediate analysis.

56

Stream Processing

Data is processed continually as new data records arrive.

57

Azure Stream Analytics

Service for running streaming jobs that continually capture data from a source such as an IoT Hub, aggregate it over temporal periods, and store results in an output such as Azure SQL Database.
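
Aggregating over temporal periods can be sketched as fixed, non-overlapping (tumbling) windows. This plain Python function is an assumption-laden illustration, not Stream Analytics query syntax; it works on a finite list of (timestamp in seconds, value) pairs rather than a live stream.

```python
from collections import defaultdict

def tumbling_window_averages(events, window_seconds):
    """Average the values of events falling in each fixed-length window.

    events: iterable of (timestamp_seconds, value) pairs.
    Returns a dict mapping each window's start time to its average value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in events:
        # Every timestamp belongs to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}
```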

58

Power BI Tools

Suite of tools and services for building interactive data visualizations for business users to consume.

59

Dimension Tables

Represent the entities by which numeric measures are aggregated in data modeling.

60

Fact Tables

Store numeric measures associated with recorded events in data modeling.

61

Hierarchies

Enable drill-up or drill-down analysis to find aggregated values at different levels in analytical models.
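
Drilling up and down can be sketched as aggregating the same facts at different levels of a year-to-month hierarchy. The sales figures and the function name aggregate are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, month, revenue).
SALES = [(2023, 12, 100.0), (2024, 1, 500.0), (2024, 2, 300.0), (2024, 2, 50.0)]

def aggregate(sales, level):
    """Sum revenue at one level of the hierarchy: 'year' or 'month'."""
    if level not in ("year", "month"):
        raise ValueError("level must be 'year' or 'month'")
    totals = defaultdict(float)
    for year, month, revenue in sales:
        key = (year,) if level == "year" else (year, month)
        totals[key] += revenue
    return dict(totals)
```

Calling aggregate with "year" gives the drilled-up totals; "month" drills down into each year.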

62

Data Visualization

Various types include tables, bar/column charts, line charts, pie charts, scatter plots, and maps for effective communication of data.

63

Power BI Desktop

Tool used to import data from multiple sources, create data models, and design interactive reports for visualization.