BUSINESS INTELLIGENCE (Class 2-3)

Description and Tags

Vocabulary flashcards covering key concepts from the 'BUSINESS INTELLIGENCE' lecture notes, including data cubes, OLAP operations, types of databases, data mining patterns and techniques, and data cleaning methods.

41 Terms

Data Cube

A multi-dimensional data structure used in business intelligence to represent data along dimensions like Time, Location, and Product.

OLAP Operations

A set of analytical operations used on data cubes, including Roll-Up, Drill-Down, Slice and Dice, and Pivot.

Roll-Up

An OLAP operation that aggregates data in a cube, either by climbing up a concept hierarchy for a dimension (e.g., from city to country) or by removing a dimension.

Drill-Down

An OLAP operation, the reverse of Roll-Up, that starts with high-level information and subdivides it into more detailed levels (e.g., year into months, weeks, or days).

Slice and Dice

OLAP operations that analyze a small, specific part of a data cube: Slice fixes a single value on one dimension, while Dice selects values on two or more dimensions to produce a sub-cube.

Pivot (Rotate)

An OLAP operation used for visualization that rotates the data axes to provide an alternative view on the data.
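
The OLAP operations on the cards above can be tried out on a tiny in-memory cube. This is only a study sketch: the sales figures, the dimension names, and the helper functions (`roll_up`, `dice`, `slice_`, `pivot`) are all made up for illustration and are not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, city, product) -> units sold.
SALES = {
    (2023, "Q1", "Toronto", "Laptop"): 10,
    (2023, "Q1", "Toronto", "Phone"): 7,
    (2023, "Q2", "Toronto", "Laptop"): 12,
    (2023, "Q1", "Ottawa", "Laptop"): 4,
    (2023, "Q2", "Ottawa", "Phone"): 6,
    (2024, "Q1", "Toronto", "Phone"): 9,
}
DIMS = ("year", "quarter", "city", "product")


def roll_up(cube, keep):
    """Aggregate the cube, keeping only the dimensions named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, units in cube.items():
        out[tuple(key[i] for i in idx)] += units
    return dict(out)


def dice(cube, **allowed):
    """Keep cells whose value on each named dimension is in the allowed set."""
    return {
        key: units
        for key, units in cube.items()
        if all(key[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def slice_(cube, dim, value):
    """A slice is a dice that fixes a single value on one dimension."""
    return dice(cube, **{dim: {value}})


def pivot(table_2d):
    """Swap the two axes of a 2-D view: (row, col) becomes (col, row)."""
    return {(c, r): v for (r, c), v in table_2d.items()}


by_year = roll_up(SALES, ["year"])                      # roll-up to the year level
by_year_quarter = roll_up(SALES, ["year", "quarter"])   # drill-down: year into quarters
toronto = slice_(SALES, "city", "Toronto")              # slice on one dimension
laptops_2023 = dice(SALES, year={2023}, product={"Laptop"})  # dice on two dimensions
city_by_product = roll_up(SALES, ["city", "product"])
product_by_city = pivot(city_by_product)                # same numbers, rotated axes
```

Drill-down is shown here simply as a roll-up that keeps one more level of detail, which is how the two operations mirror each other.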

Traditional Databases (Relational Databases)

Also known as relational databases and managed by a DBMS, which handles database structure definition, data storage, concurrent access, and consistent information storage; they are typically queried with SQL and store data in normalized form.

DBMS

An acronym for Database Management System: the software used to define database structure and to store and access data. The term is often used loosely as a synonym for relational databases.

Object-Oriented Databases

Databases based on object-oriented programming, where each entity is an object containing variables, messages for communication, and methods to return values.

Object-Relational Databases

An extension of relational databases with object-oriented features, providing rich data types to handle complex objects, class hierarchies, and object inheritance.

Spatial Databases

Databases that contain spatial-related information, such as geographical data, chip designs, medical images, or satellite images, with capabilities to display and analyze this data.

Temporal & Time-Series Databases

Databases that store time-related or time-evolving attributes, such as stock trades, used to calculate trends or track how objects evolve over time.

Text & Multimedia Databases

Databases capable of storing word descriptions, reports, notes (can be unstructured/semi-structured), and multimedia assets like images, music, and video.

Heterogeneous Databases

A system made up of interconnected, autonomous component databases, which may be of different types.

Legacy Databases

Older databases kept in operation to support legacy data, often with a long lifespan due to factors like government regulations.

Descriptive Patterns

Patterns that characterize general properties of data, such as totals, often used in reports.

Predictive Patterns

Patterns learned by performing inference on current data and then used to make predictions about new data, such as assigning it to a class.

Classification and Prediction

The process of finding a model or function that describes and distinguishes data classes, learned from a training dataset and often expressed as a decision tree, and then using it to classify new data.

Decision Tree

A flowchart-like model with an "If-Then" structure, where each internal node tests an attribute, each branch represents an outcome of the test, and each leaf represents a class or class description.
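
A decision tree's "If-Then" structure maps directly onto nested conditionals. The tree below is a made-up example, not one from the lecture: the attributes (`age`, `is_student`, `credit_rating`) and the class labels are assumptions chosen only to show nodes, branches, and leaves.

```python
def classify_customer(age, is_student, credit_rating):
    """Hypothetical tree: each `if` is a node testing an attribute,
    each branch is an outcome, and each `return` is a leaf (a class)."""
    if age < 30:                      # root node tests `age`
        if is_student:                # inner node tests `is_student`
            return "buys"
        return "does_not_buy"
    elif age <= 40:                   # middle branch leads straight to a leaf
        return "buys"
    else:                             # inner node tests `credit_rating`
        if credit_rating == "fair":
            return "buys"
        return "does_not_buy"


young_student = classify_customer(25, True, "fair")
middle_aged = classify_customer(35, False, "excellent")
senior = classify_customer(50, False, "excellent")
```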

Characterization (Concept/Class Description)

A summary of data being studied, describing general properties (e.g., "Over the course of last year we have sold 20 software titles").

Discrimination (Concept/Class Description)

A comparison of the general features of a target class of data against one or more contrasting classes (e.g., growth or decline between periods); the data under comparison must be comparable and there must be a valid business reason for comparing it.

Cluster Analysis

A statistical method that groups similar data points, without predefined class labels, by identifying the characteristics they share, such as location or product.
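
One common way to group similar data points is k-means, which the notes do not name; it is used here only as an illustration. The sketch works on single numbers, and the spending values and starting centers are invented for the example.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on numbers: assign each value to its nearest center,
    then move each center to the mean of the values assigned to it."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer; two groups are visible by eye.
spend = [12, 15, 14, 80, 85, 90, 13, 88]
centers, clusters = kmeans_1d(spend, centers=[0, 100])
```

No class labels are supplied; the two groups emerge from the values alone, which is what separates clustering from classification.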

Outliers

Objects that do not comply with the general behavior or model of the data; they are often detected using statistical tests, and analyzing them can help detect fraud.
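
As one example of a statistical test for outliers, a z-score check flags values that lie far from the mean in units of standard deviation. The threshold of 2.0 and the transaction amounts are assumptions for illustration; the notes do not say which test the lecture used.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:          # all values identical: nothing can be an outlier
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical card transactions; one amount is far from the rest.
amounts = [20, 22, 19, 21, 23, 20, 500]
flagged = zscore_outliers(amounts)
```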

Data Warehousing

Architectures and tools for business executives to systematically organize, understand, and use data to make strategic decisions.

Noisy Data

Data that contains random errors, outliers, or inconsistencies, often requiring smoothing methods such as binning for cleaning.

Binning

A technique for handling noisy data by sorting the values, partitioning them into smaller groups or "bins," and then smoothing each value using the other values in its bin.

Mean

The "average" value in a dataset: the sum of the values divided by the number of values.

Median

The "middle" value in a dataset when ordered; with an even number of values, the average of the two middle values.

Mode

The value that occurs most often in a dataset.

Range

The difference between the highest and lowest value in a dataset.
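
The four summary statistics above (mean, median, mode, range) can be checked with Python's standard `statistics` module. The sample values are made up for practice.

```python
import statistics

# Hypothetical sample, already sorted for readability.
data = [4, 8, 15, 21, 21, 24, 25, 28, 34]

mean = statistics.mean(data)          # sum of values / number of values
median = statistics.median(data)      # middle value of the ordered data
mode = statistics.mode(data)          # most frequent value
value_range = max(data) - min(data)   # highest minus lowest

print(mean, median, mode, value_range)
```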

Partition into Equi-depth (Equal-Frequency) Bins

A binning method where the sorted values are divided so that each bin or bucket holds the same number of values (or as close to it as possible).
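
A minimal sketch of this partitioning, assuming a fixed bin depth of 3 and made-up price values; the helper name `equidepth_bins` is hypothetical.

```python
def equidepth_bins(values, depth):
    """Sort the values, then cut them into consecutive bins of `depth` items.
    The last bin is shorter if the count is not a multiple of `depth`."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


prices = [21, 4, 28, 15, 8, 34, 24, 21, 25]
bins = equidepth_bins(prices, depth=3)
print(bins)
```

Sorting first matters: the bins group neighbouring values, which is what makes the later smoothing step meaningful.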

Smoothing by Bin Mean

A data cleaning method where each value in a bin is replaced with the mean (average) of that bin.

Smoothing by Bin Median

A data cleaning method where each value in a bin is replaced with the median (middle value) of that bin.

Smoothing by Bin Mode

A data cleaning method where each value in a bin is replaced with the mode (most frequent value) of that bin.

Smoothing by Bin Range

A data cleaning method where each value in a bin is replaced with the range (difference between highest and lowest value) of that bin.

Smoothing by Bin Boundaries

A data cleaning method where each value in a bin is replaced with whichever bin boundary (the bin's minimum or maximum) is closer to it.
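
Three of the smoothing methods above (bin mean, bin median, bin boundaries) are sketched below on bins that are assumed to be already sorted and partitioned. The bin contents and the `smooth` helper are illustrative; sending ties to the lower boundary is an assumption, since the notes do not say how ties are handled.

```python
import statistics


def smooth(bins, how):
    """Replace every value in each bin according to the chosen method."""
    if how == "mean":
        return [[statistics.mean(b)] * len(b) for b in bins]
    if how == "median":
        return [[statistics.median(b)] * len(b) for b in bins]
    if how == "boundaries":
        # Each value moves to the nearer of the bin's min and max
        # (ties go to the lower boundary: an assumption).
        return [
            [min(b) if v - min(b) <= max(b) - v else max(b) for v in b]
            for b in bins
        ]
    raise ValueError(f"unknown smoothing method: {how}")


bins = [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
by_mean = smooth(bins, "mean")
by_median = smooth(bins, "median")
by_boundaries = smooth(bins, "boundaries")
```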

AND Decision

A logical decision rule where the output is true only if all input conditions are true.

OR Decision

A logical decision rule where the output is true if at least one input condition is true.

XOR Decision

A logical decision rule where the output is true if exactly one input condition is true (exclusive OR).
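
The AND, OR, and XOR rules are easiest to compare side by side in a truth table. The sketch below covers the two-input case only; the function names are made up for the example.

```python
def and_decision(a, b):
    return a and b          # true only if both inputs are true


def or_decision(a, b):
    return a or b           # true if at least one input is true


def xor_decision(a, b):
    return a != b           # true if exactly one input is true


# Truth table: (a, b) -> (AND, OR, XOR)
truth_table = {
    (a, b): (and_decision(a, b), or_decision(a, b), xor_decision(a, b))
    for a in (False, True)
    for b in (False, True)
}

for inputs, outputs in truth_table.items():
    print(inputs, outputs)
```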

Information Gain

A measure used in decision tree learning to judge how effectively an attribute classifies data: the reduction in entropy achieved by splitting on that attribute. A higher gain indicates a more effective attribute.
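
Information gain is usually computed as the entropy of the class labels minus the weighted entropy after splitting on an attribute. The sketch below assumes that standard entropy-based definition; the four training rows and the attribute names are invented so that one attribute is perfectly informative and the other is useless.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of `target` minus the weighted entropy after splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical training data.
rows = [
    {"student": "yes", "income": "high", "buys": "yes"},
    {"student": "yes", "income": "low", "buys": "yes"},
    {"student": "no", "income": "high", "buys": "no"},
    {"student": "no", "income": "low", "buys": "no"},
]
gain_student = information_gain(rows, "student", "buys")  # splits classes perfectly
gain_income = information_gain(rows, "income", "buys")    # tells us nothing
```

A tree learner would pick `student` as the first attribute to test here, because it has the higher gain.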

Granular Data

Data that is detailed and specific, allowing for more precise analysis and insights.