Data Warehousing and Business Intelligence Flashcards

0.0(0)
studied byStudied by 1 person
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/63

flashcard set

Earn XP

Description and Tags

Flashcards covering key concepts in Data Warehousing (DWH), ETL, DWH Schemas, Dimension Tables, Fact Tables, and Business Intelligence (BI).

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

64 Terms

1
New cards

What is a Data Warehouse (DWH)?

A repository that stores current and historical data from various sources for decision support, analytics, business intelligence, and data mining.

2
New cards

What is Data Warehousing?

The process of building and using a data warehouse to support analytical reports, ad hoc queries, and decision making.

3
New cards

What is an Ad Hoc Query?

A dynamically constructed query meant to answer a specific question at a particular moment, without a predefined structure.

4
New cards

What is an Ad Hoc Report?

Created for one-time use, shared on the spot, often done by non-technical BI users, and designed for a single specific purpose.

5
New cards

What is OLAP (Online Analytical Processing)?

Software used to perform multidimensional analysis at high speed on large volumes of data from a data warehouse or data mart.

6
New cards

What does it mean for a data warehouse to be Subject-Oriented?

Designed to analyze specific subjects like Products, Customers, Sales, Revenues, giving a complete view of a subject.

7
New cards

What does it mean for a data warehouse to be Integrated?

Combines data from various heterogeneous sources such as flat files (CSV) and relational databases (SQL Server, Oracle, DB2).

8
New cards

What does it mean for a data warehouse to be Time-Variant?

Stores historical data over consistent time intervals (e.g., every 3 or 6 months).

9
New cards

What does it mean for a data warehouse to be Non-Volatile?

Old data is not deleted when new data is added.

10
New cards

What is Information Processing in the context of DWH usage?

Produces statistical analysis and reports in tables, charts, graphs, etc.

11
New cards

What is Analytical Processing (OLAP) in the context of DWH usage?

Supports OLAP operations like Slice & Dice, Drill Down, and Pivoting.

12
New cards

What is Data Mining in the context of DWH usage?

Discovers hidden patterns and associations in the data; used to build analytical models.

13
New cards

What is a DWH Schema?

A blueprint or architecture — a logical structure describing how data is organized and related in a data warehouse.

14
New cards

What is a Star Schema?

Named for its star-like shape with a central Fact Table connected to multiple Dimension Tables.

15
New cards

What is a Fact Table?

Stores measurements, metrics, or facts related to a business process (e.g., revenue, sales).

16
New cards

What is a Dimension Table?

Contains descriptive attributes related to facts (e.g., Product details, Salesperson info, Time, Location).

17
New cards

What is a Snowflake Schema?

An extension of the Star Schema where data is normalized into sub-dimension tables to reduce redundancy.

18
New cards

What is a Fact Constellation Schema (Galaxy Schema)?

A collection of star schemas where multiple fact tables share dimension tables.

19
New cards

What is a Data Lake?

All data types (structured, semi-structured, unstructured); schema is applied after storing (Schema-on-Read); ideal for deep analysis & data scientists; uses ELT.

20
New cards

What is a Data Warehouse in comparison to a Data Lake?

Structured data with schema defined before storing (Schema-on-Write); ideal for operational users & BI analysts; uses ETL.

21
New cards

What is OLTP (Online Transaction Processing)?

Day-to-day operations with small, frequent transactions using traditional DBMS.

22
New cards

What is OLAP (Online Analytical Processing) in comparison to OLTP?

Business analysis and reporting with large volumes of historical data using a Multidimensional DB.

23
New cards

What is a Data Warehouse in terms of scope, size and complexity?

Scope is the entire organization, size is 100GB – 1TB+, and design complexity is complex with build time of 1 month – 1 year.

24
New cards

What is a Data Mart in Comparison to a Data Warehouse?

Scope is a specific department, size is < 100GB, and design complexity is simple with build time of a few weeks – few months.

25
New cards

What does ETL stand for?

Extract, Transform, Load – a core process in Data Warehousing.

26
New cards

What is the Extraction phase in ETL?

Pulling raw data from different sources such as relational databases, NoSQL databases, XML, and flat files.

27
New cards

What is a staging area used for in ETL?

A temporary storage area used to ensure safe transformation and rollback, as raw data may be inconsistent or corrupted.

28
New cards

What is the Transformation phase in ETL?

Applying rules or functions to standardize the data, including filtering, cleaning, standardization, deduplication, joining, splitting, and sorting.

29
New cards

What is the Loading phase in ETL?

Loading transformed data into the final Data Warehouse; can be a Full Load or Incremental Load.

30
New cards

What is a Full Load in ETL?

All records are loaded from scratch every time.

31
New cards

What is an Incremental Load in ETL?

Only new or changed data is loaded; efficient and ideal for ongoing updates.

32
New cards

What is ELT (Extract, Load, Transform)?

Extract, Load, Transform – where data is loaded into the DWH first, then transformed within the warehouse.

33
New cards

What are the key characteristics of ETL?

Transformation occurs before loading, staging area is required, less flexibility, faster for small data, lower complexity, mature and stable tools.

34
New cards

What are the key characteristics of ELT?

Transformation occurs after loading, staging area is not required, high flexibility, better for large-scale raw data, higher setup & processing logic, still evolving tools.

35
New cards

Describe the Business Intelligence (BI) Lifecycle.

Data is collected from internal and external sources, converted into usable information using tools, presented in reports, and monitored to predict future outcomes.

36
New cards

What is Business Intelligence (BI)?

Converting raw data into meaningful insights to support business decisions.

37
New cards

What is the Information Gathering stage of BI?

Data is collected from internal sources (CRM, ERP, financial databases) and external sources (surveys, forms, polls).

38
New cards

What is the Analysis stage of BI?

Raw data is converted into usable information using tools such as spreadsheets, custom queries, and visualization tools.

39
New cards

What is the Reporting stage of BI?

Analyzed data is presented in a human-readable format through tables, graphs, dashboards, and executive summaries.

40
New cards

What is the Monitoring stage of BI?

Real-time observation of key metrics through dashboards and KPIs to catch issues and observe performance.

41
New cards

What is the Prediction stage of BI?

Uses current and historical data to predict future outcomes through data mining and predictive modeling.

42
New cards

What are Degenerate Dimensions?

A dimension key without a corresponding dimension table, existing within the fact table.

43
New cards

What are Junk Dimensions?

Combining low-cardinality fields into a single table to reduce clutter in fact tables.

44
New cards

What are Role-Playing Dimensions?

A single dimension appearing multiple times in a fact table, each with a different meaning.

45
New cards

What are Conformed Dimensions?

A dimension that has the same meaning and structure across multiple fact tables, ensuring consistency in reporting.

46
New cards

What are Outrigger Dimensions?

A dimension that references another dimension, creating chained relationships.

47
New cards

What are Snowflake Dimensions?

A normalized dimension table — split into multiple related tables to reflect hierarchy.

48
New cards

What are Slowly Changing Dimensions (SCDs)?

Tracking historical changes in dimension attributes using various techniques (Type 0 to Type 5+).

49
New cards

What are Fast Changing Dimensions?

Attributes that change very frequently; moved to a separate Mini-Dimension to avoid too many rows.

50
New cards

What are Shrunken Rollup Dimensions?

Summarized versions of existing dimensions used for aggregate fact tables.

51
New cards

What are Multi-Valued Dimensions?

One fact row relates to multiple dimension values.

52
New cards

What are Swappable Dimensions?

A dimension that has alternate versions of itself, selected at query time, each with different structure and attributes.

53
New cards

What are Heterogeneous Dimensions?

Used when a business serves multiple product types with different attributes but a shared customer base.

54
New cards

What is a Fact Table?

Stores quantitative data (facts or measures), describes business events or processes, and contains foreign keys to dimension tables.

55
New cards

What are the steps to design a Fact Table?

Choose the business process, identify the grain, identify the dimension tables, and define the facts.

56
New cards

What is a Transaction Fact Table?

One row per transaction, grows quickly, captures fine-grained data.

57
New cards

What is a Periodic (Aggregated) Fact Table?

One row per time period (e.g., daily, monthly) used for summary reporting.

58
New cards

What is an Accumulated Snapshot Fact Table?

One row represents an entire process lifecycle, updated over time as milestones are completed.

59
New cards

What are Additive Facts?

Can be summed across all dimensions; the most common type.

60
New cards

What are Semi-Additive Facts?

Can be summed across some dimensions but not across time.

61
New cards

What are Non-Additive Facts?

Cannot be summed across any dimension; usually ratios or percentages.

62
New cards

What are Derived Facts (Calculated Facts)?

Created from other facts; may or may not be stored physically.

63
New cards

What are Textual Facts?

Text data such as indicators or flags; should generally be avoided in fact tables.

64
New cards

What are Factless Fact Tables?

No numerical measures, only foreign keys used to record events or associations.