W1: Introduction to Business Intelligence

Business Intelligence (BI)

A collection of processes, technologies, skills, and applications used to collect, integrate, analyze, and present business information.

BI transforms data into actionable insights that inform an organization’s strategic and tactical business decisions.

Why Business Intelligence?

Helps generate insights from data to support business decisions and improve profitability and efficiency.

Strategic Decision Making: By bridging the gap between raw data and actionable knowledge, BI enables companies to make informed strategic decisions that can lead to competitive advantages.
Enhanced Efficiency: Properly implemented BI can significantly enhance operational efficiency by automating and optimizing processes based on insights drawn from data.
Improved Responsiveness: With actionable insights, businesses can respond more quickly to market changes and customer needs, enhancing agility.

Business Intelligence Process

Data Understanding

Data Preparation

Data Understanding

Through preliminary analysis, data exploration provides a high-level overview of each variable in the dataset and of the interactions between variables.

If a variable represents a characteristic measured in numbers, it is called a numeric variable.
If a variable consists of a set of categories, the variable is called a categorical variable.
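
The distinction between numeric and categorical variables can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the records, the column names (`age`, `income`, `region`), and the helper `variable_type` are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical records; the column names and values are illustrative only.
records = [
    {"age": 34, "income": 52000.0, "region": "North"},
    {"age": 51, "income": 61000.0, "region": "South"},
    {"age": 29, "income": 48000.0, "region": "North"},
]

def variable_type(values):
    """Label a column 'numeric' if every value is an int or float, else 'categorical'."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "numeric" if is_numeric else "categorical"

# Pivot the rows into columns so each variable can be inspected on its own.
columns = {name: [row[name] for row in records] for name in records[0]}
types = {name: variable_type(values) for name, values in columns.items()}

# A high-level overview: the mean for numeric variables, counts for categorical ones.
for name, values in columns.items():
    if types[name] == "numeric":
        print(name, "is numeric, mean =", mean(values))
    else:
        print(name, "is categorical, counts =", dict(Counter(values)))
```

In practice a numeric code (such as a postcode) may still be categorical in meaning, so a type check like this is only a first pass.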

Data Preparation

Before applying any data mining algorithms, it's crucial to prepare the dataset to ensure the integrity and accuracy of the analysis.

This involves addressing various anomalies that may affect the results:

Outlier Detection
Handling Missing Values
Removing Duplicates
Addressing Multicollinearity
Data Cleaning and Transformation

Data Preparation: Outlier Detection

Identify unusual entries, such as patients listed with ages over 120 years, which could skew analysis results.
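
A domain rule like the age limit above can be checked directly. The sketch below assumes a hypothetical list of patient ages and a hypothetical helper `find_implausible_ages`; statistical rules are another common option and are not shown here.

```python
# Hypothetical patient ages; 130 is the deliberately implausible entry.
ages = [34, 51, 29, 130, 47, 62]

MAX_PLAUSIBLE_AGE = 120  # threshold taken from the example in the card

def find_implausible_ages(values, upper=MAX_PLAUSIBLE_AGE):
    """Return (index, value) pairs for ages outside the range 0..upper."""
    return [(i, v) for i, v in enumerate(values) if v < 0 or v > upper]

outliers = find_implausible_ages(ages)
print(outliers)
```

Returning the index along with the value makes it easier to review each flagged record before deciding whether to correct or drop it.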

Data Preparation: Handling Missing Values

Address gaps in data, which may lead to data sparsity, impacting the reliability of the mining process.
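
One simple way to address gaps is to count them and fill them with the mean of the observed values. The card does not name a specific technique, so mean imputation is an assumption here, as is the hypothetical `incomes` column.

```python
from statistics import mean

# Hypothetical income column; None marks a missing value.
incomes = [52000.0, None, 48000.0, 61000.0, None]

# Measure how sparse the column is before deciding how to treat it.
missing_count = sum(1 for v in incomes if v is None)

# Mean imputation: replace each gap with the average of the observed values.
observed = [v for v in incomes if v is not None]
fill_value = mean(observed)
imputed = [fill_value if v is None else v for v in incomes]

print("missing:", missing_count, "of", len(incomes))
print(imputed)
```

Dropping incomplete rows, or imputing with the median, are alternatives; which one fits depends on how much data is missing and why.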

Data Preparation: Removing Duplicates

Eliminate duplicate records to prevent biased statistical results.
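
Removing exact duplicates can be sketched as follows. The customer records and the helper `drop_duplicates` are hypothetical, and the sketch treats two rows as duplicates only when every field matches.

```python
# Hypothetical customer records: (customer_id, name, age).
records = [
    ("C001", "Alice", 34),
    ("C002", "Bob", 51),
    ("C001", "Alice", 34),  # exact duplicate of the first record
    ("C003", "Chen", 29),
]

def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

deduplicated = drop_duplicates(records)
print(len(records), "->", len(deduplicated), "records")
```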

Data Preparation: Addressing Multicollinearity

Resolve highly correlated variables, such as age and date of birth, to improve the validity of regression models.
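
A common first check for multicollinearity is the Pearson correlation between pairs of variables. The sketch below uses the age and date-of-birth example from the card; the birth years, the reference year, and the 0.9 cutoff are all assumptions chosen for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

REFERENCE_YEAR = 2024  # hypothetical year in which the ages were recorded
birth_years = [1990, 1973, 1995, 1962, 1984]
ages = [REFERENCE_YEAR - year for year in birth_years]

r = pearson(ages, birth_years)

THRESHOLD = 0.9  # assumed cutoff; the card does not specify one
redundant = abs(r) > THRESHOLD
print("correlation:", round(r, 3), "redundant:", redundant)
```

Because age is computed directly from birth year, the two variables carry the same information, and keeping only one of them in a regression model is the usual remedy.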

Data Preparation: Data Cleaning and Transformation

Cleanse the data by fixing or removing errors and inconsistencies, and transform data into a suitable format for analysis.
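
As a rough sketch of both steps, the example below cleans inconsistent category labels and then transforms text amounts into scaled numbers. The raw values and the helpers `clean_region`, `parse_amount`, and `min_max_scale` are hypothetical, and min-max scaling is just one of several possible transformations.

```python
# Hypothetical raw values with inconsistent casing, whitespace, and formatting.
raw_regions = [" north", "North ", "SOUTH", "south", "N/A"]
raw_amounts = ["1,200", "950", "3,400"]

def clean_region(value):
    """Trim whitespace and normalize casing; treat 'N/A' or blank as missing."""
    text = value.strip()
    if text.upper() in {"N/A", ""}:
        return None
    return text.title()

def parse_amount(value):
    """Convert a string such as '1,200' to the float 1200.0."""
    return float(value.replace(",", ""))

def min_max_scale(values):
    """Rescale values to the range 0..1 (assumes the values are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

regions = [clean_region(v) for v in raw_regions]
amounts = [parse_amount(v) for v in raw_amounts]
scaled = min_max_scale(amounts)

print(regions)
print(amounts)
print(scaled)
```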

Data Mining Tasks

Supervised Learning

Unsupervised Learning

Supervised Learning

A machine learning method that trains a model on a dataset containing both the input variables (X) and the output variable (Y).

The objective is to develop a mapping function that accurately predicts the output from the input variables.

"Supervised" does not imply human guidance; it refers to the known output values in the training data, which allow the learning algorithm to evaluate its accuracy and adjust.
A predictive model, as used in supervised learning, is designed for tasks that require predicting a specific value from other data points within the dataset.

Classification & Numeric Prediction

Supervised Learning: Classification

Classification involves predicting categorical labels. The output, or target variable, is a category rather than a continuous value.

Examples of classification problems include:

Determining whether an email message is spam or not spam.
Diagnosing whether a person has cancer based on medical test results.
Predicting whether a football team will win or lose a match.
Assessing if an applicant will default on a loan based on their financial history.

In each of these cases, the model is trained to assign discrete categories to the input data, making it a fundamental tool for decision-making across various fields.
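
To make the loan-default example concrete, here is a minimal sketch of a k-nearest-neighbours classifier. The card does not name an algorithm, so this choice is an assumption, as are the two features, the training examples, and the helper `classify`.

```python
from collections import Counter
from math import dist

# Hypothetical training data: (debt_to_income_ratio, missed_payments) -> label.
training = [
    ((0.10, 0), "repaid"),
    ((0.15, 1), "repaid"),
    ((0.20, 0), "repaid"),
    ((0.55, 4), "default"),
    ((0.60, 3), "default"),
    ((0.70, 5), "default"),
]

def classify(point, examples, k=3):
    """Predict the majority label among the k training examples nearest to point."""
    nearest = sorted(examples, key=lambda ex: dist(point, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# The known labels in the training data are what make this supervised learning.
risky_applicant = classify((0.65, 4), training)
safe_applicant = classify((0.12, 0), training)
print(risky_applicant, safe_applicant)
```

The features here are on different scales, so the missed-payment count dominates the distance; a real model would normally rescale the inputs first.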

Supervised Learning: Numeric Prediction

Numeric prediction involves forecasting a continuous quantity. This type of prediction is crucial for tasks such as estimating future sales figures or predicting stock prices.

For example, if a company has experienced steady monthly sales growth over the past few years, a linear analysis can be conducted using the historical monthly sales data.
This analysis helps the company forecast sales for upcoming months, providing valuable insights for strategic planning and resource allocation.
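
One way to read "linear analysis" is an ordinary least-squares line fitted to the sales history; that interpretation is an assumption. The sales figures and the helper `fit_line` below are hypothetical.

```python
# Hypothetical monthly sales for six consecutive months.
monthly_sales = [100.0, 112.0, 119.0, 131.0, 140.0, 152.0]
months = list(range(1, len(monthly_sales) + 1))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(months, monthly_sales)

# Extrapolate the fitted trend one month beyond the observed data.
next_month = len(monthly_sales) + 1
forecast = slope * next_month + intercept
print("growth per month:", round(slope, 2), "forecast:", round(forecast, 2))
```

The slope estimates average growth per month, and the forecast is only as reliable as the assumption that the trend stays linear.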

Unsupervised Learning

Machine learning approach that utilizes input data (X) without any associated labels.

Used to analyze and cluster unlabeled datasets to identify hidden patterns or natural groupings within the data, all without the guidance of a specific target outcome or human supervision.

Algorithms autonomously learn the underlying structure of the data by identifying features and patterns independently.
This process is crucial for discovering insights that are not immediately obvious, providing a foundational technique for exploratory data analysis and complex problem solving.

Clustering & Association Rule Mining

Clustering

Clustering involves grouping a set of objects such that objects within the same group are more similar to each other than to those in different groups. This technique is particularly useful in applications like customer segmentation.

Retailers may use clustering to segment customers based on their spending patterns and sensitivity to price changes.
Key variables for such segmentation might include total expenditure, the value of discounts received, and the number of items purchased at a discount.
By understanding these dynamics, businesses can tailor marketing strategies and product offerings to better meet the needs and preferences of distinct customer groups.
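
The segmentation idea can be sketched with a small k-means routine. k-means is an assumption, since the card does not name a clustering algorithm; the customer data, the two features, the starting centroids, and the helper `k_means` are all hypothetical.

```python
from math import dist

# Hypothetical customers: (total_expenditure, items_purchased_at_discount).
customers = [
    (200.0, 1), (220.0, 2), (210.0, 0),   # higher spend, few discounted items
    (80.0, 9), (95.0, 11), (70.0, 10),    # lower spend, many discounted items
]

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Fixed starting centroids keep the sketch deterministic.
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
for centroid, members in zip(centroids, clusters):
    print("segment centre:", centroid, "size:", len(members))
```

No labels are supplied; the two segments emerge from the data alone, which is what makes this unsupervised. Expenditure dominates the distance here, so rescaling the features would be a sensible next step.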

Association Rule Mining

A technique used in unsupervised learning to discover interesting relationships hidden in large datasets. It identifies rules that describe which items often occur together.

It uncovers patterns such as "customers who purchase item X also frequently buy item Y."

This insight can be used for effective cross-selling and upselling strategies, optimizing store layouts, and enhancing promotional campaigns aimed at increasing the sale of related products.
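
Rules of this kind are usually scored with support and confidence. The card does not mention these measures, so their use here is an assumption, and the baskets and helper functions are hypothetical.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Among baskets containing antecedent, the fraction that also contain consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

# Score the rule "customers who buy bread also buy butter".
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print("support:", rule_support, "confidence:", round(rule_confidence, 2))
```

Support shows how often the pair appears at all, while confidence shows how reliably the second item follows the first; a rule is usually kept only if both clear chosen thresholds.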