Data
information
knowledge
Wisdom
The DIKW pyramid
Raw data
data without context
information
data with context
knowledge
information with meaning
Wisdom
applied knowledge for better decisions
ask question
gather data
prepare data
analyze
make decision
from data to decision
aggregations
translate raw data into summaries that are easier to understand
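A minimal sketch of an aggregation, assuming a hypothetical list of raw sales rows (the `sales` data and field names are made up for illustration). It turns individual rows into one summary per region:

```python
from statistics import mean

# Hypothetical raw data: one (region, amount) row per sale.
sales = [("north", 120.0), ("south", 80.0), ("north", 60.0), ("south", 100.0)]

# Group the raw amounts by region.
amounts_by_region = {}
for region, amount in sales:
    amounts_by_region.setdefault(region, []).append(amount)

# Aggregate each group into a summary that is easier to read than the raw rows.
summary = {
    region: {"count": len(amounts), "sum": sum(amounts), "avg": mean(amounts)}
    for region, amounts in amounts_by_region.items()
}
print(summary)
```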
data governance
ensure data is consistent, trustworthy and isn’t misused
data quality
ensure data is accurate, valid, complete and consistent
data privacy and security
ensure proper data access, use and protection
permission for data collection
transparency about the plan
privacy of data
good intentions
consider the outcome
5 principles of data ethics
planning and collecting
storing and managing
cleaning and processing
analyzing and visualizing
sharing
archiving/destroying
data life cycle
data literacy
the ability to read, work with, analyze, and communicate insights with data
problem statement
data collection
data analysis
communication
action and reflection
5 steps data-driven process
Database management system
DBMS
extract
transform
load
ETL process
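The three ETL steps could be sketched as follows. This is only an illustration: the CSV text, the `payments` table and the function names are assumptions, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical source data with inconsistent formatting.
RAW_CSV = "name,amount\n alice ,10.5\nBOB,4\n"

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean names and convert amounts to numbers."""
    return [(r["name"].strip().title(), float(r["amount"])) for r in rows]

def load(rows, conn):
    """Load: write the cleaned rows into the target database."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(loaded)
```

In an ELT flow the same raw rows would be loaded first and transformed inside the target system afterwards.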
Structured Query Language
SQL
Dashboards
receive data from a linked database and provide information at a glance, typically in a highly visual form
descriptive analytics
get to know the data
investigate relationships
diagnostic analytics
find root causes of events
drill-down analysis
analysis that moves from a general overview down to the detailed underlying data
Root cause analysis
looking beyond the superficial factors that have a direct effect (the contributing factors) to find the underlying cause
predictive analytics
identify possible outcomes and the probability that they will happen
classification-based
regression-based
machine learning models
recommendation engine
predicts interest based on past behavior
infographic
condensed summary that presents information simply

Define the event
collect relevant data
determine contributing factors
find root causes
recommend possible solutions
steps root cause analysis
prescriptive analytics
determine the best course of action given the outcome we want to achieve
introduce the visualization
anticipate obvious questions
state the central insight
provide supporting evidence
closing statement
McCandless technique
data
narrative
visuals
components of data storytelling
focus
structure
form
3 keys to communicating effectively
introduction
data problem statement
context
objectives
Body
data
analysis
key findings
conclusions
insights
recommendations
outline storytelling
records
table rows
field
table column
string
a sequence of characters such as letters or punctuation
ex: VARCHAR
Integers
store whole numbers
ex: INT
float
store numbers that include a fractional part
ex: NUMERIC
Keywords
reserved words for operations
ex: SELECT, FROM
View
a virtual table that is the result of a saved SQL SELECT statement
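A small sketch of a view, using Python's built-in `sqlite3` module. The `films` table and the `recent_films` view are hypothetical names chosen for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE films (title TEXT, release_year INTEGER);
INSERT INTO films VALUES ('A', 1999), ('B', 2005), ('C', 2010);

-- The view stores the SELECT statement, not the rows it returns.
CREATE VIEW recent_films AS
    SELECT title, release_year FROM films WHERE release_year >= 2000;
""")

recent = conn.execute("SELECT title FROM recent_films ORDER BY title").fetchall()

# Because the view is virtual, new rows in the base table show up automatically.
conn.execute("INSERT INTO films VALUES ('D', 2020)")
recent_after = conn.execute("SELECT title FROM recent_films ORDER BY title").fetchall()
print(recent, recent_after)
```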
aliasing
rename columns
COUNT()
the number of records with a value in a field
DISTINCT
removes duplicates to return only unique values
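Aliasing, COUNT() and DISTINCT can be tried together in a short sketch. The `people` table and its contents are assumptions made for the example, run here against an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (name TEXT, country TEXT);
INSERT INTO people VALUES ('Ann', 'BE'), ('Bo', 'NL'), ('Cy', 'BE'), ('Di', NULL);
""")

# COUNT(field) counts only records that have a value in that field;
# AS renames the result column (aliasing).
cur = conn.execute("SELECT COUNT(country) AS country_count FROM people")
alias = cur.description[0][0]
country_count = cur.fetchone()[0]

# COUNT(*) counts every record, including the one with a missing country.
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]

# DISTINCT returns each value once.
distinct_countries = conn.execute(
    "SELECT DISTINCT country FROM people WHERE country IS NOT NULL ORDER BY country"
).fetchall()
print(alias, country_count, row_count, distinct_countries)
```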
Remote Dictionary Server
Redis
PostgreSQL
SQL flavor: free and open source
SQL Server
SQL flavor: created by Microsoft
AVG()
SUM()
MIN()
MAX()
COUNT()
aggregate functions
Arithmetic
+, -, *, /
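The aggregate functions and arithmetic operators above can be compared in one sketch. The `films` table with `budget` and `gross` columns is a made-up example, queried through Python's `sqlite3` module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE films (title TEXT, budget REAL, gross REAL);
INSERT INTO films VALUES ('A', 10, 30), ('B', 20, 10), ('C', 30, 50);
""")

# Aggregate functions collapse a whole column into a single value.
aggregates = conn.execute(
    "SELECT AVG(budget), SUM(budget), MIN(budget), MAX(budget), COUNT(budget) FROM films"
).fetchone()

# Arithmetic works row by row: one result per record.
profits = conn.execute(
    "SELECT title, gross - budget AS profit FROM films ORDER BY title"
).fetchall()
print(aggregates, profits)
```

The difference to notice is vertical versus horizontal: aggregates summarize down a field, arithmetic combines values across fields within one record.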
attribute constraints
key constraints
referential integrity constraints
Integrity constraints
key
attribute(s) that identify a record uniquely
superkey
a set of attributes that identifies records uniquely, from which attributes can still be removed
minimal superkey or key
a key where no more attributes can be removed
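The difference between a superkey and a minimal superkey can be checked on sample data. This is only a sketch: the rows and the helper names `is_superkey` and `is_minimal_superkey` are hypothetical, and a real key is a property of the schema, not of one particular data set.

```python
# Hypothetical sample records.
rows = [
    {"student_id": 1, "email": "a@example.org", "name": "Ann"},
    {"student_id": 2, "email": "b@example.org", "name": "Ann"},
]

def is_superkey(rows, attrs):
    """True if the combination of attrs is different for every record."""
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return len(seen) == len(rows)

def is_minimal_superkey(rows, attrs):
    """True if attrs is a superkey and no attribute can be removed from it."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, [a for a in attrs if a != dropped]) for dropped in attrs
    )

print(is_superkey(rows, ["student_id", "name"]))          # superkey, but not minimal
print(is_minimal_superkey(rows, ["student_id", "name"]))  # "name" can be removed
print(is_minimal_superkey(rows, ["student_id"]))          # nothing left to remove
```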
primary keys
uniquely identifies records, chosen from candidate keys
FK constraint
referential integrity
a record referencing another table must refer to an existing record in that table
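A foreign key constraint can be seen in action with a short sketch. The `courses` and `enrollments` tables are hypothetical; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which other SQL flavors do not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.executescript("""
CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE enrollments (
    student TEXT,
    course_id INTEGER REFERENCES courses(id)  -- FK constraint
);
INSERT INTO courses VALUES (1, 'Databases');
""")

# Refers to an existing course: accepted.
conn.execute("INSERT INTO enrollments VALUES ('Ann', 1)")

# Refers to a course that does not exist: rejected.
try:
    conn.execute("INSERT INTO enrollments VALUES ('Bo', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT student, course_id FROM enrollments").fetchall()
print(rejected, stored)
```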
Online transaction processing
OLTP
Online analytical processing
OLAP
ETL
ELT
describing data flows
database models
high-level specifications for database structure
schemas
blueprint of the database
conceptual data model
logical data model
physical data model
3 levels of data modeling
Dimensional modeling
adaptation of the relational model for data warehouse design
star schema

snowflake schema

normalization
divides tables into smaller tables and connects them via relationships
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Elementary key normal form (EKNF)
Boyce-Codd normal form (BCNF)
Fourth normal form (4NF)
Essential tuple normal form (ETNF)
Fifth normal form (5NF)
Domain-key normal form (DKNF)
Sixth normal form (6NF)
Normal forms
update anomaly
insertion anomaly
deletion anomaly
data anomalies (normalization)
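An update anomaly, and how normalization avoids it, can be illustrated with plain Python data. All names and values below are made up for the example:

```python
# Not normalized: the teacher of a course is repeated in every enrollment row.
enrollments = [
    {"student": "Ann", "course": "DB", "teacher": "Lee"},
    {"student": "Bo", "course": "DB", "teacher": "Lee"},
]

# Update anomaly: the teacher is changed in one row but not in the other.
enrollments[0]["teacher"] = "Kim"
teachers_for_db = {r["teacher"] for r in enrollments if r["course"] == "DB"}
inconsistent = len(teachers_for_db) > 1

# Normalized: the teacher is stored once and referenced through the course.
courses = {"DB": {"teacher": "Lee"}}
enrollments_normalized = [
    {"student": "Ann", "course": "DB"},
    {"student": "Bo", "course": "DB"},
]
courses["DB"]["teacher"] = "Kim"  # a single update, no way to miss a row
teachers_normalized = {courses[r["course"]]["teacher"] for r in enrollments_normalized}
print(inconsistent, teachers_normalized)
```

Insertion and deletion anomalies follow the same pattern: in the first layout a course cannot be stored without a student, and deleting the last enrollment also deletes the information about the teacher.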
materialized views
stores the query results on disk
Directed Acyclic Graphs
keep track of views (managing the dependencies between them)
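One way to picture this is a refresh order for views that depend on each other. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+); the view and table names are hypothetical:

```python
from graphlib import TopologicalSorter

# Each view maps to the tables or views it depends on.
dependencies = {
    "daily_sales": {"orders"},
    "monthly_sales": {"daily_sales"},
    "dashboard": {"monthly_sales", "customers"},
}

# A directed acyclic graph always has an order in which every
# dependency comes before the view that needs it.
refresh_order = list(TopologicalSorter(dependencies).static_order())
print(refresh_order)
```

If the dependencies contained a cycle, `static_order()` would raise `graphlib.CycleError`, which is why the graph has to be acyclic.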
partitioning
split table into smaller parts
vertical partitioning
split table by columns
horizontal partitioning
split table by rows
sharding
horizontal partitioning applied to spread a table over several machines
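The three ideas can be compared on a tiny in-memory table. This is a sketch with made-up rows; the modulo rule used to pick a shard is just one simple assumption, real systems use their own partitioning schemes.

```python
# Hypothetical table with six rows.
rows = [{"id": i, "name": f"user{i}", "bio": f"bio of user{i}"} for i in range(1, 7)]

# Vertical partitioning: split by columns, keeping the key in both parts.
core = [{"id": r["id"], "name": r["name"]} for r in rows]
extra = [{"id": r["id"], "bio": r["bio"]} for r in rows]

# Horizontal partitioning: split by rows, every part keeps all columns.
low_ids = [r for r in rows if r["id"] <= 3]
high_ids = [r for r in rows if r["id"] > 3]

# Sharding: horizontal partitions placed on different machines.
# Each list below stands in for one machine.
NUM_SHARDS = 2
shards = {n: [] for n in range(NUM_SHARDS)}
for r in rows:
    shards[r["id"] % NUM_SHARDS].append(r)

print([r["id"] for r in shards[0]], [r["id"] for r in shards[1]])
```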
Data integration
combines data from different sources, formats, technologies to provide users with a translated and unified view of that data
data
database schema
database engine
DBMS manages
SQL DBMS
Relational DBMS
NoSQL DBMS
Document-centered DBMS
key-value store
document store
columnar database
graph database
types of NoSQL DBMS
key-value store

document store

Columnar database

Graph database

user profiles and user preferences
shopping carts
real-time recommendations
advertising
suitable cases key-value databases
search data by its value
related data
unsuitable cases key-value databases
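Why key-value stores suit lookups by key but not searches by value can be sketched with a plain Python dictionary standing in for the store. The key names are hypothetical, and this is not Redis itself:

```python
# A dictionary as a stand-in for a key-value store.
store = {}
store["cart:42"] = ["book", "pen"]   # shopping cart
store["user:42:theme"] = "dark"      # user preference
store["user:43:theme"] = "light"

# Suitable: fetch a value directly by its key.
cart = store.get("cart:42")

# Unsuitable: searching by value means scanning every entry.
keys_with_dark_theme = [key for key, value in store.items() if value == "dark"]
print(cart, keys_with_dark_theme)
```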
Remote Dictionary Server
what Redis stands for
Redis
popular key-value database
atomic operation
the operation is guaranteed to either complete fully or not at all
editoo
small business that uses Redis
JSON
most popular format for document databases
Polymorphic
documents within the same collection don't need to have the same structure
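A polymorphic collection can be pictured as a list of JSON documents with different fields. The sketch below uses only Python's `json` module and invented documents; it does not use a real document database:

```python
import json

# Two documents in the same collection with different structures.
collection = [
    {"_id": 1, "name": "Ann", "email": "ann@example.org"},
    {"_id": 2, "name": "Bo", "phones": ["+32 1", "+32 2"], "address": {"city": "Ghent"}},
]

# Documents are typically stored and exchanged as JSON text.
serialized = json.dumps(collection)
restored = json.loads(serialized)

# Queries have to cope with fields that some documents do not have.
names_with_email = [doc["name"] for doc in restored if "email" in doc]
print(names_with_email)
```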
catalogs
event logging
user profiles
content management systems
real-time analytics
suitable cases document databases
very structured data
always have consistent data
unsuitable cases document databases
MongoDB
type of document database
Binary JSON
BSON
MQL
MongoDB Query Language
language MongoDB
Atomicity
Consistency
Isolation
Durability
ACID transactions
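Atomicity, the first of the four properties, can be demonstrated with a small transaction. The sketch uses SQLite through Python's `sqlite3` module as a stand-in (it is not MongoDB); the `accounts` table and the `transfer` helper are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    name TEXT PRIMARY KEY,
    balance INTEGER CHECK (balance >= 0)  -- consistency rule: no negative balance
);
INSERT INTO accounts VALUES ('ann', 100), ('bo', 0);
""")

def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Ann only has 100, so the second update fails and the first is undone.
succeeded = transfer(conn, "ann", "bo", 150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(succeeded, balances)
```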
MongoDB Compass
MongoDB Atlas
MongoDB Enterprise Advanced
MongoDB Atlas Data Lake
MongoDB Charts
Realm Mobile Database
Products MongoDB
Shutterfly
company that uses MongoDB
large volumes of data
extreme write speeds
event logging
content management systems
time-series data
suitable cases column family databases
prototyping and at the beginning of a project
complex queries and joins
not dealing with large amounts of data
unsuitable cases column family databases
Apache Cassandra
most popular column family database
Cassandra Query Language (CQL)
Cassandra language