Data management :(

Last updated 7:31 PM on 6/12/26
153 Terms

1
New cards
  • Data

  • information

  • knowledge

  • Wisdom

The DIKW pyramid

2
New cards

Raw data

data without context

3
New cards

information

data with context

4
New cards

knowledge

information with meaning

5
New cards

Wisdom

applied knowledge for better decisions

6
New cards
  1. ask question

  2. gather data

  3. prepare data

  4. analyze

  5. make decision

from data to decision

7
New cards

aggregations

translate raw data into summaries that are easier to understand
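
As a minimal sketch of what an aggregation does, the snippet below groups a few raw rows and summarizes each group. The `sales` rows and field names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: one row per individual sale.
sales = [
    {"region": "North", "amount": 120.0},
    {"region": "North", "amount": 80.0},
    {"region": "South", "amount": 200.0},
]

# Group the raw rows by region.
amounts_by_region = defaultdict(list)
for row in sales:
    amounts_by_region[row["region"]].append(row["amount"])

# Aggregate each group into a small, readable summary.
summary = {
    region: {"count": len(a), "total": sum(a), "average": mean(a)}
    for region, a in amounts_by_region.items()
}
print(summary)
```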

8
New cards

data governance

ensure data is consistent, trustworthy and isn’t misused

9
New cards

data quality

ensure data is accurate, valid, complete and consistent
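
A rough sketch of how two of these properties (completeness and validity) could be checked in code. The records, the rules, and the `quality_issues` helper are hypothetical, and real checks depend on the dataset.

```python
# Hypothetical records with two deliberate quality problems.
records = [
    {"id": 1, "email": "ann@example.org", "age": 34},
    {"id": 2, "email": None, "age": 29},
    {"id": 3, "email": "cem@example.org", "age": -5},
]

def quality_issues(row):
    """Return a list of quality problems found in one record."""
    issues = []
    if row["email"] is None:
        issues.append("incomplete: missing email")
    if not 0 <= row["age"] <= 120:
        issues.append("invalid: age out of range")
    return issues

# Keep only the records that have at least one issue.
report = {row["id"]: quality_issues(row) for row in records if quality_issues(row)}
print(report)
```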

10
New cards

data privacy and security

ensure proper data access, use and protection

11
New cards
  1. permission for data collection

  2. transparency about the plan

  3. privacy of data

  4. good intentions

  5. consider the outcome

5 principles of data ethics

12
New cards
  1. planning and collecting

  2. storing and managing

  3. cleaning and processing

  4. analyzing and visualizing

  5. sharing

  6. archiving/destroying

data life cycle

13
New cards

data literacy

the ability to read, work with, analyze, and communicate insights with data

14
New cards
  1. problem statement

  2. data collection

  3. data analysis

  4. communication

  5. action and reflection

5 steps data-driven process

15
New cards

Database management system

DBMS

16
New cards
  • extract

  • transform

  • load

ETL process
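
The three ETL steps can be sketched as three small functions. This is only an illustration: the CSV text, the `products` table, and the cleaning rules are assumptions, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an exported CSV file.
RAW_CSV = "name,price\n widget ,9.50\nGadget,12\n"

def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean names and convert prices to numbers."""
    return [(r["name"].strip().title(), float(r["price"])) for r in rows]

def load(rows, conn):
    """Load: write the cleaned rows into the target database."""
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT name, price FROM products ORDER BY name").fetchall()
print(loaded)
```

In ELT the order of the last two steps is swapped: the raw rows are loaded first and transformed inside the target system.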

17
New cards

Structured Query Language

SQL

18
New cards

Dashboards

receive data from a linked database and provide information at a glance, typically presented in a very visual way

19
New cards

descriptive analytics

  • get to know the data

  • investigate relationships

20
New cards

diagnostic analytics

find root causes of events

21
New cards

drill-down analysis

analysis that moves from a general overview down to the detailed underlying data

22
New cards

Root cause analysis

look beyond the superficial factors that have a direct effect, to the underlying contributing factors

23
New cards

predictive analytics

identify possible outcomes and the probability that they will happen

24
New cards
  • classification-based

  • regression-based

machine learning models

25
New cards

recommendation engine

predicts interest based on past behavior

26
New cards

infographic

condensed summary that represents information simply

27
New cards
  1. Define the event

  2. collect relevant data

  3. determine contributing factors

  4. find root causes

  5. recommend possible solutions

steps root cause analysis

28
New cards

prescriptive analytics

determine the best course of action given the outcome we want to achieve

29
New cards
  1. introduce the visualization

  2. anticipate obvious questions

  3. state the central insight

  4. provide supporting evidence

  5. closing statement

McCandless technique

30
New cards
  • data

  • narrative

  • visuals

components of data storytelling

31
New cards
  • focus

  • structure

  • form

3 keys to communicating effectively

32
New cards
  1. introduction

    1. data problem statement

    2. context

    3. objectives

  2. Body

    1. data

    2. analysis

    3. key findings

  3. conclusions

    1. insights

    2. recommendations

outline storytelling

33
New cards

records

table rows

34
New cards

field

table column

35
New cards

string

a sequence of characters such as letters or punctuation
ex: VARCHAR

36
New cards

Integers

store whole numbers

ex: INT

37
New cards

float

store numbers that include a fractional part

ex: NUMERIC

38
New cards

Keywords

reserved words for operations

ex: SELECT, FROM

39
New cards

View

a virtual table that is the result of a saved SQL SELECT statement
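
A small sketch of a view, using Python's built-in `sqlite3` module as a stand-in for whichever SQL flavor is used. The `films` table and the `recent_films` view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE films (title TEXT, release_year INT);
    INSERT INTO films VALUES ('A', 1999), ('B', 2005), ('C', 2010);
    -- The view stores the SELECT statement, not the rows.
    CREATE VIEW recent_films AS
        SELECT title FROM films WHERE release_year >= 2005;
""")

# A row added later shows up in the view, because the saved
# SELECT is evaluated again each time the view is queried.
conn.execute("INSERT INTO films VALUES ('D', 2020)")
recent = [row[0] for row in conn.execute("SELECT title FROM recent_films ORDER BY title")]
print(recent)
```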

40
New cards

aliasing

rename columns

41
New cards

COUNT()

the number of records with a value in a field

42
New cards

DISTINCT

removes duplicates to return only unique values
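
Aliasing, `COUNT()` and `DISTINCT` can be seen together in one query. This is a sketch on a made-up `people` table, run through Python's `sqlite3` module. Note that `COUNT(country)` skips the record where `country` is NULL, while `COUNT(*)` counts every record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, country TEXT);
    INSERT INTO people VALUES
        ('Ann', 'BE'), ('Bob', 'NL'), ('Cem', 'BE'), ('Dee', NULL);
""")

# AS renames each result column (aliasing).
cursor = conn.execute("""
    SELECT COUNT(country)          AS with_country,
           COUNT(DISTINCT country) AS unique_countries,
           COUNT(*)                AS total_rows
    FROM people
""")
column_names = [d[0] for d in cursor.description]
counts = dict(zip(column_names, cursor.fetchone()))
print(counts)
```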

43
New cards

Remote Dictionary Server

Redis

44
New cards

PostgreSQL

SQL flavor: free and open source

45
New cards

SQL Server

SQL flavor: created by Microsoft

46
New cards
  • AVG()

  • SUM()

  • MIN()

  • MAX()

  • COUNT()

aggregate functions
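
The five aggregate functions in a single query, as a sketch on a hypothetical `orders` table (run through Python's `sqlite3` module):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (amount REAL);
    INSERT INTO orders VALUES (10.0), (20.0), (30.0);
""")

# Each function collapses the whole column into one value.
average, total, smallest, largest, n_orders = conn.execute(
    "SELECT AVG(amount), SUM(amount), MIN(amount), MAX(amount), COUNT(amount) FROM orders"
).fetchone()
print(average, total, smallest, largest, n_orders)
```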

47
New cards

Arithmetic

+, -, *, /

48
New cards
  1. attribute constraints

  2. key constraints

  3. referential Integrity constraints

Integrity constraints

49
New cards

key

attribute(s) that identify a record uniquely

50
New cards

superkey

a key from which attributes can still be removed while it keeps identifying records uniquely

51
New cards

minimal superkey or key

a key where no more attributes can be removed
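
The difference between a superkey and a minimal superkey can be sketched with two helper functions. The records and the helper names are hypothetical. Uniqueness in a small sample only suggests a key, since real keys follow from the rules of the data, not from one snapshot.

```python
from itertools import combinations

# Hypothetical table: each dict is one record.
records = [
    {"student_id": 1, "email": "a@x.org", "city": "Ghent"},
    {"student_id": 2, "email": "b@x.org", "city": "Ghent"},
    {"student_id": 3, "email": "c@x.org", "city": "Bruges"},
]

def is_superkey(attrs, rows):
    """True if the combination of attrs is unique across all rows."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)

def is_minimal(attrs, rows):
    """True if attrs is a superkey and no single attribute can be dropped."""
    if not is_superkey(attrs, rows):
        return False
    return not any(
        is_superkey(subset, rows)
        for subset in combinations(attrs, len(attrs) - 1)
        if subset  # skip the empty combination
    )

print(is_superkey(("student_id", "city"), records))  # superkey, but not minimal
print(is_minimal(("student_id",), records))          # minimal superkey (a key)
```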

52
New cards

primary keys

uniquely identifies records, chosen from candidate keys

53
New cards
  • FK constraint

  • referential integrity

a record referencing another table must refer to an existing record in that table
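
A sketch of a foreign key constraint rejecting a record that points at a non-existent row. The `customers` and `orders` tables are hypothetical. SQLite is used as a stand-in, and it only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    INSERT INTO customers VALUES (1, 'Ann');
""")

conn.execute("INSERT INTO orders VALUES (100, 1)")  # customer 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (101, 99)")  # there is no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```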

54
New cards

Online transaction processing

OLTP

55
New cards

Online analytical processing

OLAP

56
New cards
  • ETL

  • ELT

describing data flows

57
New cards

database models

high-level specifications for database structure

58
New cards

schemas

blueprint of the database

59
New cards
  1. conceptual data model

  2. logical data model

  3. physical data model

3 levels of data modeling

60
New cards

Dimensional modeling

adaptation of the relational model for data warehouse design

61
New cards

star schema

[image]
62
New cards

snowflake schema

[image]
63
New cards

normalization

divides tables into smaller tables and connects them via relationships

64
New cards
  • First normal form (1NF)

  • Second normal form (2NF)

  • Third normal form (3NF)

  • Elementary key normal form (EKNF)

  • Boyce-Codd normal form (BCNF)

  • Fourth normal form (4NF)

  • Essential tuple normal form (ETNF)

  • Fifth normal form (5NF)

  • Domain-key normal form (DKNF)

  • Sixth normal form (6NF)

Normal forms

65
New cards
  1. update anomaly

  2. insertion anomaly

  3. deletion anomaly

data anomalies (normalization)
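
A sketch of an update anomaly and how normalization avoids it, using plain Python structures in place of tables. The course and teacher data are made up.

```python
# Denormalized: the teacher's office is repeated on every course row.
courses_flat = [
    {"course": "SQL", "teacher": "Kim", "office": "B1"},
    {"course": "NoSQL", "teacher": "Kim", "office": "B1"},
]

# Update anomaly: changing only one copy leaves contradictory data.
courses_flat[0]["office"] = "C3"
offices_for_kim = {r["office"] for r in courses_flat if r["teacher"] == "Kim"}

# Normalized: the office is stored once, and courses refer to the teacher.
teachers = {"Kim": {"office": "B1"}}
courses = [
    {"course": "SQL", "teacher": "Kim"},
    {"course": "NoSQL", "teacher": "Kim"},
]
teachers["Kim"]["office"] = "C3"  # one update, no contradiction possible
offices_after = {teachers[c["teacher"]]["office"] for c in courses}

print(offices_for_kim, offices_after)
```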

66
New cards

materialized views

stores the query results on disk
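
SQLite has no materialized views, so the sketch below imitates one with an ordinary table filled from a query. This is an assumption made for illustration. In PostgreSQL the equivalent statements are `CREATE MATERIALIZED VIEW` and `REFRESH MATERIALIZED VIEW`. The point is that stored results go stale until they are refreshed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (amount REAL);
    INSERT INTO orders VALUES (10.0), (20.0);
    -- Stand-in for a materialized view: the query result is stored as a table.
    CREATE TABLE order_totals AS SELECT SUM(amount) AS total FROM orders;
""")

# New data arrives, but the stored result does not change by itself.
conn.execute("INSERT INTO orders VALUES (30.0)")
stale_total = conn.execute("SELECT total FROM order_totals").fetchone()[0]

# "Refresh": recompute the query and store the result again.
conn.executescript("""
    DELETE FROM order_totals;
    INSERT INTO order_totals SELECT SUM(amount) FROM orders;
""")
fresh_total = conn.execute("SELECT total FROM order_totals").fetchone()[0]
print(stale_total, fresh_total)
```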

67
New cards

Directed Acyclic Graphs

keep track of views (managing dependencies)
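
One way to picture the dependency tracking: treat each view as a node that points to what it is built from, then derive a safe refresh order with a topological sort. The view names are hypothetical, and `graphlib` is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter

# Hypothetical views: each key depends on the tables/views in its set.
dependencies = {
    "daily_sales": {"orders"},
    "monthly_sales": {"daily_sales"},
    "sales_dashboard": {"monthly_sales", "customers"},
}

# static_order() lists every node after all of its dependencies.
refresh_order = list(TopologicalSorter(dependencies).static_order())
print(refresh_order)
```

Because the graph is acyclic, such an order always exists. A cycle would make `TopologicalSorter` raise `graphlib.CycleError`.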

68
New cards

partitioning

split table into smaller parts

69
New cards

vertical partitioning

split table by columns

70
New cards

horizontal partitioning

split table by rows

71
New cards

sharding

horizontal partitioning is applied to spread a table over several machines
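
The three ideas (vertical partitioning, horizontal partitioning, sharding) can be sketched on a small list of rows. The data, the id-range split, and the hash-modulo shard assignment are assumptions for illustration. Real systems choose their own partition keys and schemes.

```python
import zlib

# Hypothetical table rows.
rows = [
    {"id": 1, "name": "Ann", "bio": "Likes hiking"},
    {"id": 2, "name": "Bob", "bio": "Plays chess"},
    {"id": 3, "name": "Cem", "bio": "Bakes bread"},
    {"id": 4, "name": "Dee", "bio": "Runs marathons"},
]

# Vertical partitioning: split by columns, keeping the key in both parts.
core = [{"id": r["id"], "name": r["name"]} for r in rows]
details = [{"id": r["id"], "bio": r["bio"]} for r in rows]

# Horizontal partitioning: split by rows, here on an id range.
low_ids = [r for r in rows if r["id"] <= 2]
high_ids = [r for r in rows if r["id"] > 2]

# Sharding: horizontal partitions that live on different machines.
# Assumed scheme: a stable hash of the key, modulo the number of shards.
NUM_SHARDS = 2
shards = {n: [] for n in range(NUM_SHARDS)}
for r in rows:
    shard = zlib.crc32(str(r["id"]).encode()) % NUM_SHARDS
    shards[shard].append(r)

print(len(low_ids), len(high_ids), {n: len(s) for n, s in shards.items()})
```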

72
New cards

Data integration

combines data from different sources, formats, technologies to provide users with a translated and unified view of that data

73
New cards
  • data

  • database schema

  • database engine

DBMS manages

74
New cards

SQL DBMS

Relational DBMS

75
New cards

noSQL DBMS

Document-centered DBMS

76
New cards
  • key-value store

  • document store

  • columnar database

  • graph database

types of NoSQL DBMS

77
New cards

key-value store

[image]
78
New cards

document store

[image]
79
New cards

Columnar database

[image]
80
New cards

Graph database

[image]
81
New cards
  • user profiles and user preferences

  • shopping carts

  • real-time recommendations

  • advertising

suitable cases key-value databases

82
New cards
  • search data by its value

  • related data

unsuitable cases key-value databases
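
Why lookups by key suit a key-value store and searches by value do not can be sketched with a plain Python dict. The dict is only a stand-in for a store such as Redis, and the cart data is made up.

```python
# A plain dict stands in for a key-value store (assumption for illustration).
store = {
    "cart:ann": ["book", "pen"],
    "cart:bob": ["lamp"],
    "cart:cem": ["book"],
}

# Suitable: fetch by key, a single direct lookup.
ann_cart = store["cart:ann"]

# Unsuitable: search by value, which needs a scan over every entry.
carts_with_book = sorted(key for key, items in store.items() if "book" in items)

print(ann_cart, carts_with_book)
```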

83
New cards

Remote Dictionary Server

Redis (abbreviation)

84
New cards

Redis

popular key-value database

85
New cards

atomic operation

the operation is guaranteed to either complete fully or not at all

86
New cards

editoo

small business that uses Redis

87
New cards

JSON

most popular format for document databases

88
New cards

Polymorphic

documents within the same collection don't need to have the same structure
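
A sketch of a polymorphic collection, with Python dicts standing in for JSON documents. The `products` collection and its fields are hypothetical.

```python
import json

# Hypothetical "products" collection: the documents share a
# collection but do not share a structure.
products = [
    {"_id": 1, "name": "Laptop", "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "T-shirt", "sizes": ["S", "M", "L"]},
    {"_id": 3, "name": "E-book"},
]

# Code that reads the collection has to tolerate missing fields.
sizes_by_name = {doc["name"]: doc.get("sizes", []) for doc in products}

# Each document serializes to JSON on its own.
as_json = [json.dumps(doc) for doc in products]
print(sizes_by_name)
```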

89
New cards
  • catalogs

  • event logging

  • user profiles

  • content management systems

  • real-time analytics

suitable cases document databases

90
New cards
  • very structured data

  • always have consistent data

unsuitable cases document database

91
New cards

MongoDB

type of document database

92
New cards

Binary JSON

BSON

93
New cards
  • MQL

  • MongoDB Query Language

language MongoDB

94
New cards
  • Atomicity

  • Consistency

  • Isolation

  • Durability

ACID transactions
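
Atomicity, the "A" in ACID, can be sketched with a transaction that fails halfway. SQLite is used here only as a convenient stand-in, and the `accounts` table and `transfer` helper are hypothetical. The first update succeeds, the second violates a constraint, and the whole transaction is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ann", 50), ("bob", 10)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "ann", "bob", 80)  # would make ann's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```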

95
New cards
  • MongoDB Compass

  • MongoDB Atlas

  • MongoDB Enterprise Advanced

  • MongoDB Atlas Data Lake

  • MongoDB Charts

  • Realm Mobile Database

Products MongoDB

96
New cards

Shutterfly

company that uses MongoDB

97
New cards
  • large volumes of data

  • extreme write speeds

  • event logging

  • content management systems

  • time-series data

suitable cases column family databases

98
New cards
  • prototyping and at the beginning of a project

  • complex queries and joins

  • not dealing with large amounts of data

unsuitable cases column family databases

99
New cards

Apache Cassandra

most popular column family database

100
New cards

Cassandra Query Language (CQL)

Cassandra language