IDSC 3001 Topics 11 and 12

Last updated 11:57 PM on 4/13/26

92 Terms

1

Business Analytics (BA)

includes software tools and applications used to build models and simulations to create scenarios, understand current events, and predict future states.

- Often includes data mining, predictive analytics, applied analytics and statistics.

2

Business Intelligence (BI)

includes a variety of tools, platforms, and methodologies that enable organizations to collect data from internal systems and external sources, prepare it for analysis, develop and run queries, and create reports, dashboards, and data visualizations for decision-makers.

- Only effective if it is trusted and used to guide human decisions

- Ex: Tableau

3

Canned Reports

provide regular summaries of information in a predetermined format (can be difficult to alter).

- Often developed by information systems staff

- Ex: if you buy a CRM, these _______ come out of the box

4

Ad Hoc Reporting Tools

put users in control so that they can create custom reports on an as-needed basis by selecting fields, ranges, summary conditions, and other parameters.

- Not programming, but drag-and-drop

5

Dashboards

heads-up display of critical indicators that allows managers to get a graphical glance at key performance metrics.

- Ex: sales department __________ would allow you to see a dip in sales

6

Online Analytical Processing (OLAP)

a method of querying that takes data from standard relational databases, calculates and summarizes the data, and then stores the data in a special database called a data cube.

- Similar to pivot tables
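
A minimal sketch of the OLAP idea in Python: take flat relational-style rows, summarize them ahead of time, and store the totals in a cube-like lookup. The `sales` rows, the `build_cube` helper, and the field names are hypothetical, made up for illustration; real OLAP products do this at much larger scale.

```python
from collections import defaultdict

# Hypothetical rows, as they might come out of a relational sales table.
sales = [
    {"region": "East", "quarter": "Q1", "amount": 100},
    {"region": "East", "quarter": "Q2", "amount": 150},
    {"region": "West", "quarter": "Q1", "amount": 80},
    {"region": "West", "quarter": "Q1", "amount": 20},
    {"region": "West", "quarter": "Q2", "amount": 120},
]

def build_cube(rows, dims, measure):
    """Pre-compute the total of `measure` for every combination of `dims`."""
    cube = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dims)
        cube[key] += row[measure]
    return dict(cube)

cube = build_cube(sales, ("region", "quarter"), "amount")

# Lookups against the stored summary behave like cells in a pivot table.
west_q1 = cube[("West", "Q1")]
east_total = sum(v for (region, _), v in cube.items() if region == "East")
```

Because the totals are calculated once and stored, later questions read from the summary instead of rescanning every row.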

7

Data Cube

special database used to store data in OLAP reporting.

8

Anatomy of a Basic Report

- Headers

- Report Level Filters

- Row Data

- Column Data

- Row Totals

- Column Totals

- Grand Total

- Footer

9

Query Tools

a tool to interrogate a data source or multiple sources and return a subset of data, possibly summarized, based on a set of criteria.

- Returns raw data we can use

- Less formatted than ad hoc and canned reports

10

Python

a general purpose programming language that is also popular for data analytics

- Not specifically a data query and analysis tool, but has many add-ons

11

R

a programming language created specifically for analytics, statistical computing, and graphics.

12

Graphical Query Tools

allow a user to create a query through a point-and-click or drag-and-drop interface, rather than requiring programming knowledge.

- Allows users to specify how data should be combined and summarized

- Easier-to-use products are often less flexible

13

Importance of Data Visualization

well-designed ____ ____________:

- Puts data in context

- Provides perspective

- Saves time

- Reveals trends

- Tells a story

14

Four Types of Business Analytics

1. Descriptive

2. Diagnostic

3. Predictive

4. Prescriptive

15

Descriptive (Business Analytics)

- Explain the "what"

- Ex: reports, dashboards

- Least complex

16

Diagnostic (Business Analytics)

- Explain the "why" (i.e. why trends or events occurred)

- Ex: statistical modeling, OLAP, data mining

- Second-least complex

17

Predictive (Business Analytics)

- Predicts

- Ex: basic AI, forecasting, pattern recognition, recommender engine

- Second-most complex

18

Prescriptive (Business Analytics)

- Optimizes

- Ex: advanced AI, simulations, econometrics, price optimization, autonomous decision making

- Most complex

- Tells us what to do

- Ex: taking the data of the last 3 years, analyzing sales, and determining staffing needs for certain days

19

Palantir

US tech firm that got its start in Iraq and Afghanistan. Now provides software to police departments, other public agencies, and corporations

- After Google walked away from the Pentagon's Project Maven, which applies AI to analyzing military drone footage, ________ took over.

- Worth $200.26 B on 3/28/2025

- Opponents say it's a pro-military organization filled with former military, intelligence, and gov insiders that has won multiple government/military contracts w/o competitive bids (very profitable). They also point to its "predictive policing" capabilities and raise concerns that personal data is collected in anticipation of crimes.

- Supporters point to their work uncovering human trafficking rings, finding exploited children, and solving sophisticated financial crimes.

20

Process of Analyzing Data

1. Sources

2. Data Integration

3. Data Hubs (Big Data)

4. Business Intelligence

21

A/B Testing

a firm releases two versions of a website; half of its customers see one version, half see the other, and the company analyzes the results

- Uses real data and research
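
A hedged sketch of how the results of an A/B test might be analyzed in Python. It compares the two conversion rates with a two-proportion z-test, which is one common choice and not necessarily the method used in the course. The visitor and conversion counts are made up, and `ab_test` is a hypothetical helper name.

```python
import math

def ab_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of versions A and B with a two-proportion z-test."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the conversion rate if both versions performed the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Made-up example: 5,000 visitors saw each version.
rate_a, rate_b, z, p_value = ab_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
```

A small p-value suggests the difference between versions is unlikely to be noise alone, which ties back to the challenge of mistaking noise for true insight.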

22

Business Intelligence Criteria

- Accuracy

- Timeliness

- Valuable Insights

- Actionable

23

Fundamental Methods of Business Analytics

- Clustering

- Classifying

- Estimating and Predicting

- Affinity Grouping

24

Clustering

recognizing distinct groupings or sub-categories within the data.

- Ex: how do we treat customers differently?

25

Classifying

taking a lot of data and using its data points to sort each record into a category.

- Ex: examining a customer as credit worthy or credit unworthy.

- Ex: email inbox rating emails based on how likely they are to be a scam

26

Estimating and Predicting

two similar activities that normally yield a numerical measure as the result. From the set of existing customers we may estimate the overall indebtedness of the candidate customer.

- Ex: in general, if you look at large populations, what happens to salary as people get older? It typically increases

- Ex: forecasting as a company

27

Affinity Grouping

a special kind of clustering that identifies events or transactions that occur simultaneously/together.

- Looks at temporal data and clustering within those events

- Ex: a well-known example is market basket analysis. When people go to the store, what kinds of items do they buy together (e.g., bacon and eggs)?
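
A minimal Python sketch of the market basket idea: count how often each pair of items shows up in the same transaction. The `baskets` data and the `pair_support` helper are hypothetical; production tools use more elaborate association-rule algorithms.

```python
from collections import Counter
from itertools import combinations

# Made-up transactions: each set is one shopper's basket.
baskets = [
    {"bacon", "eggs", "bread"},
    {"bacon", "eggs"},
    {"eggs", "milk"},
    {"bacon", "eggs", "milk"},
    {"bread", "milk"},
]

def pair_support(baskets):
    """Return the share of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bacon", "eggs").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(baskets)
top_pair = max(support, key=support.get)
```

Here bacon and eggs appear together in 3 of 5 baskets, so they come out as the strongest pair.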

28

Business Analytics Best Practices

- Know the objective for using __.

- Define your business use case and the goal ahead of time.

- Define your criteria for success and failure.

- Select your methodology and be sure you know the data and relevant internal and external factors

- Validate models using your predefined success and failure criteria

29

Business Analytics Challenges

- Risk of spending lots of money and time chasing poorly defined problems or opportunities.

- Trying to implement in fast-moving markets. Fixing a plane while it is flying.

- Mistaking noise for true insight.

- Trying to make BA perfect the first try.

- Not accessing the correct data.

- Time and effort to clean the data.

- Potential for data privacy issues

30

Banking (Use Case of Analytics)

- The banking industry is data-intensive with typically massive graveyards of unused and unappreciated ATM and credit processing data. As banks face increasing pressure to stay profitable, understanding customer needs/preferences becomes a critical success factor.

- New models of proactive risk management are being increasingly adopted by major banks and financial institutions (e.g., for assessing credit risk)

- By using data mining and advanced analytics techniques, banks are better equipped to manage market uncertainty, minimize fraud, and control risk.

- Banks can gain insights that encompass all types of customer behavior, including channel transactions, account opening and closing, default, fraud and customer departure (all very rich for data analytics).

31

Predicting Customer Churn (Use Case of Analytics)

Valpak, one of North America's leading direct marketing companies, recognized an opportunity to boost customer retention using insights from historical customer behavior data. However, the Valpak team was unsure if their data could predict which customers are likely to leave and why, and they lacked the data science expertise necessary to extract those insights.

- Challenges: define the use case (reducing ________ _____, feasible based on historical company data), identify key data to produce the most valuable insights, select various data science tools and techniques, develop a low-risk proof-of-concept.

- Process: data scientists had to prepare the company's data and perform exploratory analysis to summarize trends, identify which data could predict customer churn, and determine which features would help improve the model's performance. Now Valpak can quantify the value of each customer and determine whether or not customer retention tactics are necessary.

32

Data Mining

the process of using computers to identify hidden patterns in, and to build models from, large datasets.

- The process of discovering meaningful correlations, patterns and trends in nonobvious relationships.

- Key areas of leverage: customer segmentation, marketing and promotion targeting, market basket analysis, collaborative filtering, customer churn, fraud detection, financial modeling, hiring and promotion

- Ex: accounts payable of companies

- Ex: hiring

33

Prerequisites for Data Mining

- Org must have clean, consistent data

- Events in that data should reflect trends

34

Problems with Using Bad Data

- Can give wrong estimates, thus exposing the firm to risk.

- When the market does not behave as it has in the past, computer-driven investment models are not effective.

- Possible to over-engineer

35

Over-Engineering

building a model with so many variables that the solution arrived at might only work on the subset of data used to create it.

- Ex: looking at relationships between data in the past, and expecting it to be exactly the same in the future

36

Critical Skills for Data Mining and Business Analytics Team

- Information Technology

- Statistics

- Business Knowledge

37

Walmart Data Mining Use Case

discovered that customers stock up when hurricanes are predicted. Beer is a top pre-storm seller and Pop-Tarts spike seven-fold before the storm hits.

- Data mining helps optimize operational forecasts: predicting things like how many cashiers are needed at a given store at various times throughout the day.

- Maintains what it calls its Data Café, a 40-petabyte Hadoop-based data mart.

- Included in this analytic center are massive amounts of social media information from Twitter, Facebook, and other unstructured data. They use this data to gain insights into product offerings, sales leads, pricing, etc.

38

Artificial Intelligence (AI)

computer software that can mimic or improve upon functions that would otherwise require human intelligence.

- Data mining has its roots in __.

- An explosion of tools is fueling the current spread (new generation of hardware chips, cloud resources, open-source algorithms, software development kits, data-capture tools)

- Traditional computing works well with CPUs, __ works well with GPUs

39

Graphics Processing Unit (GPU)

processor for graphics, originally used for video games and now also used for AI.

40

Machine Learning (ML)

software that contains the ability to learn or improve without being explicitly programmed

- This is where most work and investment is going on in terms of AI

- A type of AI that leverages massive amounts of data so that computers can improve the accuracy of actions and predictions on their own without additional programming

41

Deep Learning

a type of machine learning that uses multiple layers of interconnections among data to identify patterns and improve predicted results.

- Most often uses a set of techniques known as neural networks and is popularly applied in tasks like speech recognition, image recognition, and computer vision.

- Takes data, finds patterns, and uses those patterns to find more patterns, and so on. Uses these patterns of patterns to make decisions

- The issue is that you can't explain how it got to its answer (business audits don't always allow this)

42

Supervised Learning

a type of machine learning where algorithms are trained by providing explicit examples of results sought, like defective vs. error-free, or stock price. Uses training data to learn the relationship of a given input to a given output

- Not necessarily involving a human

- Need previous data to build rules to predict future data

- Ex: teaching a machine to differentiate between pugs and chocolate chip cookies: (1) give the algorithm categorized/labeled input data to learn from, (2) give the algorithm new, unlabeled data and see if it labels the new data correctly.

- Ex: circuit board manufacturers finding defects

- Ex: reading x-rays and CAT scans

43

Unsupervised/Self-Supervised Learning

systems build pattern-recognizing algorithms using data that has not been pre-classified. Uses strong statistical models (i.e. regression)

- Getting a dataset, finding a pattern in regression, and predicting the outcome of a variable from another (using regression)

- No need to train the ML model before it starts learning

- Some form of this is usually used to train LLMs, like OpenAI's GPT models, and researchers at Google used this in a robot that "taught" itself to walk.

44

Semi-Supervised Learning

a type of machine learning where the data used to build models includes explicit classifications, but the system is also free to develop its own additional classifications that may further enhance result accuracy.

45

Neural Networks

examine data and hunt down and expose patterns in order to build models that exploit the findings.

- Build multilayered relationships that humans can't detect on their own

- Adds weights and mappings to a combo of inputs
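
A minimal Python sketch of what "adds weights and mappings to a combo of inputs" can look like for a single artificial neuron. The weights, bias, and inputs are made-up numbers, and `neuron` is a hypothetical helper; a real network stacks many of these in layers and learns the weights from data.

```python
import math

def sigmoid(z):
    """Mapping that squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weight each input, add them up with a bias, then apply the mapping."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)

# Made-up example: 2 * 0.5 + 4 * (-0.25) + 0 = 0, and sigmoid(0) = 0.5.
output = neuron(inputs=[2.0, 4.0], weights=[0.5, -0.25], bias=0.0)
```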

46

Expert Systems

AI systems that leverage programmed rules or examples to perform a task in a way that mimics applied human expertise

- Used in tasks ranging from medical diagnoses to product configuration

47

Genetic Algorithms

model building techniques where computers examine many potential solutions to a problem, iteratively modifying various math models for a best alternative function.

- AI technologies that seek an optimal model by transforming or "mutating" an algorithm (vs. neural networks, which add weights and mappings to a combination of inputs)—iteratively testing the result and choosing the best outcome.
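
A toy Python sketch of the mutate, test, and keep-the-best loop. As a simplifying assumption it mutates a single candidate number rather than a whole model or algorithm, and it searches for the value that maximizes a made-up `fitness` function whose best answer is 3. The names `fitness` and `evolve` and all settings are hypothetical.

```python
import random

def fitness(x):
    """Made-up score to maximize; the best possible candidate is x = 3."""
    return -(x - 3.0) ** 2

def evolve(generations=100, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Start with random candidate solutions.
    population = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        # Test every candidate and keep the better half.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # "Mutate" each survivor slightly to create new candidates.
        children = [x + rng.gauss(0, 0.5) for x in survivors]
        population = survivors + children
    return max(population, key=fitness)

best = evolve()
```

Because the survivors are always carried forward, the best candidate never gets worse from one generation to the next.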

48

Machine Intelligence Layers

- Deep Learning (innermost layer)

- Machine Learning

- Artificial Intelligence

- Machine Intelligence

49

CAPTCHA

an acronym standing for Completely Automated Public Turing Test to tell Computers and Humans Apart.

- Used for public Turing tests to distinguish bots from humans

50

Turing Test

conceived by Alan Turing, a test of software's ability to exhibit behavior equivalent to, or indistinguishable from, a human being.

51

OCR (Optical Character Recognition)

software that can scan images and identify text within them.

52

Generative AI

a combination of ML techniques which, when combined in a system, create human-like text, images, or other media in response to a prompt.

- These systems use large language models or massive samples of media as learning models to generate responses. They often generate large numbers of responses then use a second step to compare to human-generated work to find the most human-like response.

- This technology "learns" the patterns and structure of data used during training, and can then generate new output based on the characteristics of this input.

- Results can often be refined by further prompt entry.

- Ex: ChatGPT, DALL-E 2, and Bing AI.

- There are concerns about bias, errors, and proper attribution of these systems.

53

Parameters

values that are used to determine text elements and relationships and that are further refined during training. Created by the foundation model

- The more __________ that are in a model, the more complex and comprehensive the resulting LLM will be. This also makes it more expensive

- Still not "thinking" or "intelligent"

54

Corpus

in AI, this refers to the data used to train a model before it can be used.

- If you give it data (e.g., company trade secrets), it will incorporate that data into the model

- A system's ability to produce results is limited by the corpus used during training.

55

Prompt

a request made to a generative AI system, usually in the form of written or spoken text.

56

Prompt Engineering

the practice of designing inputs for generative AI tools that will produce optimal outputs.

- Learning how to talk to AI to get desired results

- Can be a job itself

- Many GenAI systems also allow you to refine results through entering additional information through subsequent inputs.

57

Hallucination

an incorrect answer provided by generative AI that is otherwise presented as correct. Not based on facts

- GenAI does not know the difference between real and fake

- In some cases, AI will be confidently wrong based on statistics

- Appears to be made-up info

- This happens not because AI is sinister or trying to lie, but because systems struggle to find a set of related terms and concepts. The AI strings together a set of words that are a best match mathematically, but that don't result in facts.

- Ex: some lawyers could lose their license due to GenAI making up cases while writing a brief

- Ex: LLMs are not usually good at math

58

Artificial General Intelligence (AGI)

refers to software that's capable of learning and reasoning on any task or subject, including developing reasoning about topics not presented through a training corpus.

59

Agentic AI

autonomous systems that can break down complex problems and take actions with minimal human intervention.

- Capabilities: perceives environment through data or sensors, sophisticated reasoning, executing actions autonomously, adapts based on outcomes

- Ex: customer service, inventory management, security threat detection, code generation (initial drafts)

60

Advertising Reach

how many people a given ad will get to

61

Regression

predicts continuous values such as age, price, salary, etc. based on a second parameter.

- Creates a line that we can use to make predictions

- Ex: a ___________ line would show us that salary generally increases as age increases

- AI uses ___________ and allows us to do things that are not always precise
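
A minimal Python sketch of fitting a line with ordinary least squares, one standard way to build the kind of line described above. The ages and salaries are made-up numbers, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: salary generally rises with age.
ages = [25, 30, 35, 40, 45]
salaries = [40000, 48000, 55000, 63000, 69000]

slope, intercept = fit_line(ages, salaries)
predicted_at_50 = slope * 50 + intercept  # prediction from the fitted line
```

The prediction is an estimate from the trend, not an exact figure, which matches the point that these results are not always precise.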

62

Classification (AI)

predicts discrete values such as true/false, spam/not spam, credit worthy/not creditworthy, etc.

- Must be able to show that AI is auditable, so we cannot use deep learning or neural networks for these tasks

- Plots the data and gets a mathematical formula, still occasionally giving us false positives and negatives (AI is not 100% accurate)
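
A small Python sketch of turning scores into discrete labels and counting false positives and false negatives. The spam scores, the 0.5 cutoff, and the helper names `classify` and `confusion_counts` are assumptions made for illustration.

```python
def classify(scores, threshold=0.5):
    """Turn continuous spam scores into discrete spam / not-spam labels."""
    return [s >= threshold for s in scores]

def confusion_counts(actual, predicted):
    """Count correct and incorrect predictions of each kind."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["tp"] += 1   # spam correctly flagged
        elif p and not a:
            counts["fp"] += 1   # false positive: real email flagged as spam
        elif a:
            counts["fn"] += 1   # false negative: spam that slipped through
        else:
            counts["tn"] += 1   # real email correctly left alone
    return counts

# Made-up scores from some model, plus the true answers.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
actual = [True, True, False, True, False, False]

predicted = classify(scores)
counts = confusion_counts(actual, predicted)
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
```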

63

Correlation

a connection or mutual relationship between two or more variables.

- __________ does not equal causation. Other mutual relationships may be in play.

- Weak __________ are less accurate but still useful

64

Negative Correlation

a relationship between two or more variables that move in opposite directions

65

Correlation Strength (Correlation Coefficient)

a measure of how strongly variables are related

- Ex strong positive: age and salary (highly related but not perfectly)
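
A minimal Python sketch of the Pearson correlation coefficient, a standard measure that runs from -1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). The sample numbers are made up, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # moves in the same direction
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # moves in opposite directions
r_none = pearson_r(xs, [3, 1, 4, 1, 3])        # no linear relationship
```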

66

No Correlation

there is no relationship between the variables

67

Unsupervised Cluster Algorithm

lets the AI find clusters using some statistical tool, and can find clusters in hundreds of thousands of dimensions.

- Allows us to do anomaly detection

- There would be no way to create these groups with classical computing (i.e. without using statistics)

68

Reasons Why Clustering is Used

- Market Segmentation

- Anomaly Detection

- Search Results Analysis

- Group Documents by Topic

69

K-Means Clustering Process

finds k clusters using the following process:

1. Choose k points at random as the starting cluster centers (two points when k = 2)

2. Assign each data point to the cluster of its nearest center

3. Calculate cluster centers

4. Reassign data points to the nearest center

5. Recalculate cluster centers

6. Reassign data points to the nearest center

7. Repeat until the assignments stop changing

- Example of unsupervised AI (no training data)
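
The process above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the 2-D `points` are made up, `kmeans` and `dist2` are hypothetical helper names, and the random start uses a fixed seed so the run is repeatable.

```python
import random

def dist2(p, q):
    """Squared straight-line distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # choose k starting points at random
    assignment = None
    for _ in range(max_iter):
        # Assign every point to the cluster of its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: dist2(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # nothing changed, so the clusters are stable
        assignment = new_assignment
        # Recalculate each center as the average of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, assignment

# Made-up data with two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
centers, assignment = kmeans(points, k=2)
```

No labels are supplied anywhere, which is why this counts as unsupervised: the groups come only from the distances between points.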

70

Issues and Concerns with AI

- Data quality, inconsistent data, or the inability to integrate data sources into a single dataset capable of input into machine learning systems can all stifle efforts.

- Not enough data.

- Technical staff may require training in developing and maintaining such systems, and such skills are rare.

- Involves "change management" that goes hand-in-hand with many IS projects.

- Some types of ML may be legally prohibited because of the data used or the inability to identify how a model works and whether or not it might be discriminatory.

- The negative unintended consequences of data misuse might also lead to regulation that limits techniques currently used

- Many workers are startled to find that in the US, just about anything done on organizational networks or using a firm's computer hardware can be monitored.

- Firms that gain an early lead and benefit from scale may be in a position to collect more data than competitors, fueling a virtuous cycle where early winners generate more data, have stronger predictive capabilities, and can have an edge in entering new markets, offering new services, attracting customers, and cutting prices.

- Being implemented so quickly that firms are not thinking about cybersecurity, bias, and ethics

71

Change Management

anytime companies implement technical change (AI or not AI), the hardest part is getting employees to change as well.

- Without helping people and having support, they will resist and push back

72

Three Levels of AI-Driven Change

1. Tasks and Occupations (ex: use of machine vision to ID cancer cells)

2. Business Processes (ex: reinvention of workflow and layout of Amazon centers, after introducing robots and ML optimization algorithms)

3. Business Model (ex: ML recommending music/movies in a personalized way)

73

Autonomous Trucks (Use Case of AI)

mining sites doing digging, dumping into trucks, moving it elsewhere to process it. Companies are implementing self-driving vehicles that run themselves. There might be a human sitting inside, but the driving is using AI.

- Normal process is manually intensive, must be choreographed by experienced people. If people are sick or need a break, it can mess up the whole situation. Very dangerous on a huge mine with heavy equipment

- Orchestration System: looks at the whole line and coordinates the equipment so that each piece is in the right place at the right time

- Safety increases because fewer people are needed. Saves money because you're not paying people to stand around

- Difficult when humans and AI vehicles interact. It will be easier when all cars are autonomous

74

Intelligent Trading Systems (Use Case of AI)

many big trading houses have been experimenting with AI and implementing it. Speed and efficiency are critical in the trading industry

- Ex: whenever a company releases an earnings statement (sharing how the company did that quarter), the market reacts fast to good or bad news. AI can analyze the statement the second it is released: it analyzes the words and does sentiment analysis (positive or negative sentences, looking for surprises), then immediately invests or shorts within seconds of the release

- Algorithmic, high-frequency trading

- There have been 5 or 6 times when the market has crashed because of this. Many investment firms use this AI. When one company has a glitch and starts selling off, all the other AI start jumping on that. Creates a downward spiral. There is nothing to warrant these downward drops. There have been times where they needed to stop the markets, tell the firms to fix their algorithms, potentially reverse trades, and start the market back up

75

Customer Segmentation

figuring out which customers are likely to be the most valuable to a firm.

76

Marketing and Promotion Targeting

identifying which customers will respond to which offers at which price at what time.

77

Market Basket Analysis

determining which products customers buy together, and how an organization can use this information to cross-sell more products or services.

78

Collaborative Filtering

personalizing an individual customer's experience based on the trends and preferences identified across similar customers.

79

Customer Churn

determining which customers are likely to leave, and what tactics can help the firm avoid unwanted defections.

80

Fraud Detection

uncovering patterns consistent with criminal activity.

81

Financial Modeling

building trading systems to capitalize on historical trends.

82

Hiring and Promotion

identifying characteristics consistent with employee success in the firm's various roles.

83

Black Swans

events so extreme and unusual that they never showed up in the data used to build a model

84

Omnichannel

providing customers with a unified experience across customer channels, which may include online, mobile, catalog, phone, and retail. Pricing, recommendations, and incentives should reflect a data-driven, accurate, single view of the customer.

- Ex: implemented by L.L. Bean

85

The AI Winter

a period of reduced interest in and funding for AI projects. This resulted from a failure of AI to deliver on much-hyped initial promises. Advances in fast/cheap processor and storage tech, cloud computing and access to tremendous amounts of parallel computing, high-speed broadband, and the massive amount of input data from the Internet and other sources have enabled several advances in AI over the past decade-plus.

- Advances in generative AI have further pushed interest in AI to an all-time high in terms of investment and use.

86

Large Language Model (LLM)

a type of AI that is used for general-purpose language understanding and generation. Trained by putting a corpus of training data through a foundation model.

- A type of neural network

- Forms the basis of most generative AI

87

Natural Language Processing (NLP)

using an LLM to interpret language

88

Foundation Model

the model or base technology used to train a large language model.

- Ex: OpenAI's GPT-3 is the base model used to train the LLM that users eventually interact with to ask and answer questions.

- Good at breaking apart text and creating multitiered relationships between the individual elements of text (words, word fragments, punctuation, numbers, etc). But it won't know about language until it looks at lots of text, breaks out the components of text, and uses these to identify complex, multitiered logical relationships as well as concepts like style and context

89

Reinforcement Learning from Human Feedback (RLHF)

a machine learning training technique that uses a reward model and human evaluators that will provide feedback to continually tune results. These models usually include some sort of reward function, and human ratings and feedback can guide training to favor certain outcomes and avoid others.

90

Constitutional AI

a method for providing alignment and safety in an AI by incorporating a set of specific rules or guidelines that an AI must follow as machine learning takes place.

91

Transformer ("T" in GPT)

allows all words in a given body of text to be analyzed in parallel rather than sequentially. Allows GenAI to consider lots of relationships (words, concepts, style, context) together. Also allows models to be trained far more quickly on a much larger amount of data, and to generate results that consider complex relationships much more quickly. The important pieces of text are identified as worthy of attention (hence the name of the paper that introduced the technique, "Attention Is All You Need").

- Lets GPT and other GenAI produce sophisticated, human-like answers that go beyond the simple trick of auto-complete.

- Also used to create AI solutions for audio, video, computer vision, and more.
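
A heavily simplified Python sketch of the attention step: every word is scored against every other word, and the scores become weights that decide how much each word contributes. The tiny 2-number word vectors are made up, and `attention` and `softmax` are hypothetical helper names. Real transformers add learned projections, many attention heads, and GPU matrix math, none of which is shown here.

```python
import math

def softmax(xs):
    """Turn raw scores into positive weights that add up to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over small lists of vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    # Each query is handled independently, so in practice all of them
    # can be processed in parallel rather than one after another.
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Made-up vectors standing in for three words.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(words, words, words)
```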

92

How LLMs Are Created

1. Foundation model is built (e.g., GPT-3)

2. Feed huge amounts of data to foundation model

3. Trained model is what users interact with