Azure Data Fundamentals Certification (DP-900)

66 Terms

1

SaaS (Software as a Service)

a product that is run and managed by the service provider.

2

PaaS (Platform as a Service)

focused on the deployment and management of your apps.

3

IaaS (Infrastructure as a Service)

the building blocks for cloud IT. Provides access to networking features, computers, and storage space.

4

database administrator

configures and maintains a database.

responsibilities:

  • database management

  • manages security, granting user access

  • backups

  • monitors performance

5

data engineer

designs and implements data tasks related to the storage of big data.

responsibilities:

  • data pipelines and processes

  • data ingestion and storage

  • prepare data for analytics

  • prepare data for analytical processing

6

data analyst

analyzes business data to reveal important information.

responsibilities:

  • provides insights into the data

  • visual reporting

  • modeling data for analysis

  • combines data for visualization

7

data

units of information that could be in the form of numbers, text, machine code, images, videos, audio, or a physical form

8

data documents

defines the collective form in which data exists

9

data sets

logical grouping of units of data that are generally closely related and/or share some data structure

10

data structures

structured data

11

data types

a single unit of data that tells a compiler or interpreter how data is supposed to be used

12

batch and streaming data

how do we move our data around?

13

relational and non-relational

how do we access, search, and query our data?

14

data modeling

how do we prepare and design our data?

15

schemas and schemaless

how do we structure our data for search?

16

data integrity and data corruption

how do we trust our data?

17

normalized and de-normalized

how do we trade quality vs. speed?

18

schema

a formal language which describes the data structure of a database

19

schemaless

when the primary “cell” of a database can accept many data types

20

query

a request for data results (reads) or to perform operations like inserting, updating, or deleting data within a database

21

data result

the data returned from a query

22

querying

the act of performing a query

23

query language

a scripting or programming language designed as the format to submit a request or action to a database; used to manage and manipulate data.
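
The query cards above can be tried out with a small sketch. It assumes SQL as the query language and Python's built-in `sqlite3` module as the database; the `cards` table and its columns are made up for illustration.

```python
import sqlite3

# In-memory database; the "cards" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (id INTEGER PRIMARY KEY, term TEXT, mastered INTEGER)")

# Queries that perform operations: insert, update, delete.
conn.execute("INSERT INTO cards (term, mastered) VALUES (?, ?)", ("schema", 0))
conn.execute("INSERT INTO cards (term, mastered) VALUES (?, ?)", ("query", 0))
conn.execute("UPDATE cards SET mastered = 1 WHERE term = ?", ("query",))
conn.execute("DELETE FROM cards WHERE term = ?", ("schema",))

# A query that reads: the rows that come back are the data result.
rows = conn.execute("SELECT term, mastered FROM cards").fetchall()
print(rows)
```

Running the statements one after another is "querying"; `rows` holds the data result of the final read.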

24

batch processing

when data is collected over time and sent to be processed together as a single group (a batch)

25

stream processing

when data is processed as soon as it arrives to enable real-time analytics and immediate responses.
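
One way to picture the difference between batch and stream processing is the minimal sketch below. The function names and the sample readings are assumptions for illustration, not part of any Azure service.

```python
from typing import Iterable, Iterator


def process_batch(readings: list[float]) -> float:
    """Batch: the whole collection is available and processed in one go."""
    return sum(readings) / len(readings)


def process_stream(readings: Iterable[float]) -> Iterator[float]:
    """Stream: emit an updated running average as each reading arrives."""
    total = 0.0
    for count, value in enumerate(readings, start=1):
        total += value
        yield total / count


readings = [10.0, 20.0, 30.0]  # hypothetical sensor values
batch_result = process_batch(readings)
stream_results = list(process_stream(iter(readings)))
print(batch_result, stream_results)
```

The batch function produces one answer after all the data is in, while the stream function produces an answer after every reading.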

26

tables

a logical grouping of rows and columns

27

views

a result set of a stored query on data, held in memory (a virtual table)

28

materialized view

a result set of a stored query on data, stored on disk
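
The contrast between a view and a materialized view can be sketched with `sqlite3`. SQLite has no materialized views, so this sketch assumes a table created from the query (`totals_snapshot`) as a stand-in; all table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("east", 10), ("west", 5)])

# View: only the query is stored; it is evaluated again on every read.
conn.execute(
    "CREATE VIEW totals_view AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)
# Stand-in for a materialized view: the query result itself is stored.
conn.execute(
    "CREATE TABLE totals_snapshot AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)

# New data arrives after both objects were created.
conn.execute("INSERT INTO sales VALUES ('east', 7)")

view_rows = conn.execute(
    "SELECT region, total FROM totals_view ORDER BY region").fetchall()
snapshot_rows = conn.execute(
    "SELECT region, total FROM totals_snapshot ORDER BY region").fetchall()
print(view_rows, snapshot_rows)
```

The view reflects the new sale, while the stored result stays as it was until it is refreshed. That is the usual trade-off: stored results read quickly but can go stale.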

29

indexes

a copy of data that is sorted by one or multiple columns for faster reads at the cost of storage

30

constraints

rules applied to writes that can ensure data integrity
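
As a rough illustration of constraints rejecting bad writes, the sketch below uses `sqlite3` with `NOT NULL`, `UNIQUE`, and `CHECK` constraints. The `accounts` table and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL UNIQUE,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts (email, balance) VALUES ('a@example.com', 100)")

# Each of these writes breaks one rule: duplicate email, negative balance, missing email.
bad_writes = [("a@example.com", 50), ("b@example.com", -1), (None, 10)]
rejected = []
for email, balance in bad_writes:
    try:
        conn.execute("INSERT INTO accounts (email, balance) VALUES (?, ?)", (email, balance))
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))

row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(rejected, row_count)
```

All three bad writes are refused, so the table still holds only the one valid row.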

31

trigger

a function that is triggered on specific database events
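
A trigger can be sketched in `sqlite3` as follows. The `products` and `price_audit` tables and the trigger name `log_price_change` are hypothetical; the database event here is an update of the `price` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE price_audit (product_id INTEGER, old_price INTEGER, new_price INTEGER);

-- Runs automatically after every price update on products.
CREATE TRIGGER log_price_change AFTER UPDATE OF price ON products
BEGIN
    INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
END;

INSERT INTO products (id, price) VALUES (1, 100);
UPDATE products SET price = 120 WHERE id = 1;
""")

audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
print(audit_rows)
```

No code wrote to `price_audit` directly; the audit row appears because the update event fired the trigger.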

32

primary key

one or multiple columns that uniquely identify a row in a table

33

foreign key

a column that holds the primary key from another table, establishing a relationship and maintaining referential integrity in relational databases

34

relational databases

establishes relationships to other tables through foreign keys referencing another table's primary key
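
Primary keys, foreign keys, and referential integrity can be seen together in a small `sqlite3` sketch. The `authors` and `books` tables are assumptions, and note that SQLite only enforces foreign keys once the pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE books ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT,"
    " author_id INTEGER REFERENCES authors(id))"  # foreign key to authors' primary key
)
conn.execute("INSERT INTO authors VALUES (1, 'Ada')")
conn.execute("INSERT INTO books VALUES (1, 'Notes', 1)")  # author 1 exists

try:
    conn.execute("INSERT INTO books VALUES (2, 'Orphan', 99)")  # author 99 does not exist
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(orphan_allowed)
```

The second book is rejected because its `author_id` points at no row in `authors`, which is what referential integrity means in practice.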

35

one-to-one

a type of relationship in relational databases where each row in one table is linked to a single row in another table

36

one-to-many

a type of relationship in relational databases where a row in one table can be related to many rows in another table

37

many-to-many

a type of relationship in relational databases where multiple rows in one table can be related to multiple rows in another table

38

many-to-many (via join/junction table)

A relationship in relational databases where multiple records in one table are associated with multiple records in another table, typically facilitated through a join or junction table.
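
A junction table can be sketched like this in `sqlite3`. The `students`, `courses`, and `enrollments` tables are hypothetical; `enrollments` is the junction table, holding one row per student and course pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);

-- Junction table: each row links one student to one course.
CREATE TABLE enrollments (
    student_id INTEGER REFERENCES students(id),
    course_id INTEGER REFERENCES courses(id),
    PRIMARY KEY (student_id, course_id)
);

INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO courses VALUES (1, 'SQL'), (2, 'Spark');
INSERT INTO enrollments VALUES (1, 1), (1, 2), (2, 1);
""")

pairs = conn.execute("""
    SELECT s.name, c.title
    FROM enrollments e
    JOIN students s ON s.id = e.student_id
    JOIN courses c ON c.id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(pairs)
```

Ana is in two courses and the SQL course has two students, so the relationship runs many ways in both directions while each individual table stays one-to-many with the junction table.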

39

row-store

  • data organized in rows

  • traditional relational databases are row-store

  • good for general purpose databases

  • suited for online transaction processing

  • good when needing all columns in a row

  • not the best at analytics or massive amounts of data

40

column-store

  • data is organized into columns

  • NoSQL or SQL-like databases

  • great for vast amounts of data

  • suited for online analytical processing

  • good when only a few columns needed
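
The row-store and column-store layouts can be compared with plain Python structures. This is only a sketch of the idea with made-up order data, not how any particular engine lays out storage.

```python
# Row-store: each record keeps all of its columns together.
rows = [
    {"id": 1, "region": "east", "amount": 10},
    {"id": 2, "region": "west", "amount": 5},
    {"id": 3, "region": "east", "amount": 7},
]

# Column-store: each column keeps all of its values together.
columns = {
    "id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [10, 5, 7],
}

# Transactional access (all columns of one row) is a single lookup in the row layout.
order_2 = rows[1]

# Analytical access (one column across all rows) touches only that column's values.
total_amount = sum(columns["amount"])

print(order_2, total_amount)
```

Fetching a whole order suits the row layout, while totalling one field over every order suits the column layout because the other columns are never read.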

41

database indexes

a data structure that improves the speed of reads from the database table by storing the same or partial redundant data in a more efficient logical order
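
To see an index change how a read is carried out, the sketch below asks SQLite for its query plan before and after creating one. The `users` table and the index name `idx_users_email` are assumptions, and the exact plan wording varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return SQLite's description of how it would run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


lookup = "SELECT name FROM users WHERE email = 'user42@example.com'"

plan_before = plan(lookup)  # no index on email yet: the table is scanned
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(lookup)   # the sorted copy of email is used for the lookup

print(plan_before)
print(plan_after)
```

The index costs extra storage and extra work on writes, which is the trade-off the card describes.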

42

data integrity

the maintenance and assurance of data accuracy and consistency over its entire life cycle

43

goal of data integrity

ensures data is recorded exactly as intended

44

data corruption

the act or state of data not being in its intended state, which results in data loss or malfunction

45

normalized data

a schema designed to store non-redundant and consistent data

46

denormalized data

a schema that combines data so that accessing data is fast and efficient.
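
The trade-off between normalized and denormalized data can be sketched with plain Python structures. The customer and order records are made up for illustration.

```python
# Normalized: customer details live in one place; orders refer to them by id.
customers = {1: {"name": "Ana", "city": "Lisbon"}}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 30},
    {"order_id": 101, "customer_id": 1, "amount": 45},
]

# Denormalized: customer details are copied into every order row,
# so reading an order needs no lookup (no join).
orders_denormalized = [
    {**order, **customers[order["customer_id"]]} for order in orders
]

# The customer moves. The normalized data needs one update...
customers[1]["city"] = "Porto"

# ...but the copies in the denormalized rows are now stale until each is rewritten.
stale_cities = {row["city"] for row in orders_denormalized}
print(stale_cities)
```

Reads on the denormalized rows are fast and simple, but every copy has to be updated to keep the data consistent.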

47

pivot table

a table that summarizes the data of a more extensive table from a database, spreadsheet, or business intelligence (BI) tool
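
A pivot table can be built by hand in a few lines of Python, which may make the idea clearer. The sales rows are made up; regions become the rows of the summary and quarters become its columns.

```python
from collections import defaultdict

# The more extensive table: one entry per sale (region, quarter, amount).
sales = [
    ("east", "Q1", 10),
    ("east", "Q2", 7),
    ("west", "Q1", 5),
    ("east", "Q1", 3),
]

# Summarize: total amount for each region and quarter combination.
pivot = defaultdict(lambda: defaultdict(int))
for region, quarter, amount in sales:
    pivot[region][quarter] += amount

summary = {region: dict(quarters) for region, quarters in pivot.items()}
print(summary)
```

Four detail rows collapse into one total per region and quarter.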

48

data consistency

  • when data is kept in two different places, whether the copies exactly match or do not match

  • matters when you have duplicates of your data in many places and need to keep them up-to-date

49

strongly consistent

  • every time you request data (query), you can expect consistent data to be returned within a certain amount of time

  • will never return old data, but the query may take longer to return (for example, waiting at least 2 seconds)

50

eventually consistent

  • when you request data, it may be inconsistent for a short window after a write (for example, within 2 seconds)

  • getting back whatever data is currently in the database, could be old or new data, if you wait a little longer it will be up-to-date
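
The difference between strongly and eventually consistent reads can be simulated with a toy store that has a primary copy and a lagging replica. The `ReplicatedStore` class is entirely hypothetical, and an explicit `replicate()` step stands in for the time delay mentioned on the cards.

```python
class ReplicatedStore:
    """Toy model: writes land on the primary and reach the replica later."""

    def __init__(self) -> None:
        self.primary: dict[str, int] = {}
        self.replica: dict[str, int] = {}
        self.pending: list[tuple[str, int]] = []

    def write(self, key: str, value: int) -> None:
        self.primary[key] = value
        self.pending.append((key, value))  # not yet copied to the replica

    def replicate(self) -> None:
        """Copy all outstanding writes to the replica."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read_eventual(self, key: str):
        """Answer immediately from the replica; the value may be old."""
        return self.replica.get(key)

    def read_strong(self, key: str):
        """Wait for replication to finish first, so the value is never old."""
        self.replicate()
        return self.replica.get(key)


store = ReplicatedStore()
store.write("price", 1)
store.replicate()
store.write("price", 2)

stale = store.read_eventual("price")  # replica has not caught up yet
fresh = store.read_strong("price")    # pays the cost of waiting for replication
later = store.read_eventual("price")  # the replica is now up to date
print(stale, fresh, later)
```

The eventual read is fast but can return the old value, while the strong read does the extra work first and always returns the latest one.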

51

synchronous

continuous stream of data that is synchronized by a timer or clock (guarantee of time)

  • guaranteed consistency at time of access, slower access times

52

asynchronous

continuous stream of data separated by start and stop bits (no guarantee of time)

  • faster access times, no guaranteed consistency

53

Non-relational data

a non-tabular form of storing data, optimized for different kinds of data structures

54

data source

where data originally comes from; an analytics tool may be connected to multiple data sources to create a visualization or report

55

data store

a repository for persistently storing and managing collections of unstructured and semi-structured data

56

database

a data store that holds semi-structured and structured data and is used to manage, retrieve, and manipulate data. It often consists of tables that relate to one another.

57

data warehouse

a relational or non-relational database designed for analytical workloads, generally a column-oriented data store optimized for querying and reporting.

58

data mart

allows different teams or departments to have control over their own dataset

  • a subset of a data warehouse

  • typically stores under 100 GB and has a single business focus

59

data lake

a centralized storage repository that holds vast amounts of raw data in its native format until it's needed for analysis, and can store structured, semi-structured, and unstructured data.

60

data lakehouse

combines the best factors of a data warehouse and a data lake to provide a unified analytics platform that supports both structured and unstructured data.

61

data lakehouses compared to data warehouse

  • support video, audio, and text files

  • support data science and ML workloads

  • support for both streaming and ELT

  • work with open-source formats

    • data resides in data lake or blob stores

62

data lakehouses compared to data lake

  • perform BI tasks well

  • easier to set up and maintain

  • has management features to prevent data lake turning into data swamp

  • more performant than a data lake

  • offer better data governance and quality management

63

data structures

data that is organized in a specific storage format, which enables easy access and modification

64

unstructured

a bunch of loose data that has no organization or possible relation

65

semi-structured

data that can be browsed or searched (with limitations)

66

structured

data that can be easily browsed or searched
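
The three kinds of data structure on the last cards can be put side by side in a short sketch. The sample values are made up, and JSON is assumed here as a typical semi-structured format.

```python
import json

# Structured: every row has the same columns in the same order.
structured = [("Ana", 34), ("Ben", 29)]

# Semi-structured: records are labelled, but fields can differ between records.
semi_structured = json.loads(
    '[{"name": "Ana", "age": 34}, {"name": "Ben", "tags": ["new"]}]'
)

# Unstructured: free text with no fields to search by.
unstructured = "Ana called on Tuesday about her invoice; Ben will follow up."

# Searching structured data is direct: the age is always the second column.
structured_ages = [age for _, age in structured]

# Searching semi-structured data works, with limitations: a field may be missing.
semi_structured_ages = [record.get("age") for record in semi_structured]

print(structured_ages, semi_structured_ages)
```

The unstructured sentence mentions both people, but nothing in it can be queried by field without extra processing first.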