C207 Master Story Guide: Data-Driven Decision Making

Description and Tags

Flashcards based on the C207 Master Story Guide, covering essential definitions and keywords for Data-Driven Decision Making.

Last updated 12:42 AM on 7/1/26

96 Terms

Big Data

Data so large and complex that it is difficult to process using traditional database and spreadsheet tools, often including both structured and unstructured data.

Big Data Warehouse

A storage environment for Big Data often utilizing cloud storage, third-party storage, or multiple servers.

Data Mining

The process of discovering useful patterns in large datasets.

Organizations' Motivation for Data Collection

To transform data into useful information to make better business decisions and encourage specific buying behavior.

Structured Data

Data in a fixed, preformatted format that is easy to classify, count, or filter, such as multiple choice survey responses or check boxes.

Unstructured Data

Data without a fixed format that often requires interpretation or theme analysis, such as medical notes, emails, or free-text reviews.

Quantitative Data

Numerical data that can be counted or measured, such as price, revenue, or temperature.

Qualitative Data

Categorical data that describes qualities, labels, or categories, such as car color or vehicle type.

Discrete Data

Quantitative data that uses whole-number counts where decimals are not possible, such as the number of children in a household.

Continuous Data

Quantitative data that is measured and can include decimals or intervals, such as height, weight, or distance.

Nominal Level

Categorical data with no natural order or sequence, such as colors or yes/no responses.

Ordinal Level

Categorical data with a meaningful sequence or ranking, but unequal intervals between categories, such as economy, business, and first class.

Interval Level

Numerical data with equal intervals between values but no true zero: zero is only a point on the scale and does not mean absence, such as temperature in Fahrenheit.

Ratio Level

Numerical data where zero means the absolute absence of what is being measured, such as money, price, or distance.

Reliability

The consistency and repeatability of a measurement instrument's results.

Validity

The extent to which data measures the intended concept or target.

Data Quality

The process of cleaning data to ensure it does not contain mistakes, missing values, or impossible entries before analysis.

Out-of-Range Error

A value in a dataset that is impossible or suspicious because it falls outside the expected range, such as a car listed at 188 MPG.

Omission Error

A missing value or blank field within a record in a dataset.

Systematic Error

A consistent, repeated bias that pushes results in the same wrong direction and must be fixed.

Random Error

Error caused by chance or temporary noise that may average out over time.

Observational Study

A study where researchers collect information without applying a treatment to the subjects.

Cohort Study

An observational study focused on a specific group that shares a characteristic, place, or time frame.

Experimental Study

A study where a treatment is applied to a unit to examine the effect on a response.

Blind Study

An experimental setup where participants do not know which treatment they are receiving.

Double-Blind Study

An experimental setup where neither the participant nor the researcher knows which treatment was assigned.

Triple-Blind Study

An experimental setup where the participant, researcher, and data analyst are all unaware of the treatment assignments.

Faulty Operationalization

A flaw in research design where a variable or concept is not clearly defined or is measured with an incorrect target.

Measurement Bias

Bias introduced by the method of sample selection or how data collection is performed.

Information Bias

Bias resulting from inaccurate or distorted information provided by respondents or records after data collection has started.

Response Bias

Bias caused by the presence of an interviewer or the pressure felt by a respondent to answer in a certain way.

Conscious Bias

Bias introduced through the use of leading or persuasive wording in a question.

Association vs. Causation

The principle that two variables moving together (association) does not prove that one directly causes the other (causation).

Alpha Level (Significance Level)

The cutoff used to decide statistical significance, which in this course is set at $0.05$.

p-value rule for Significance

If $p < 0.05$, the result is significant and the null hypothesis is rejected; if $p \geq 0.05$, the result is not significant and we fail to reject the null hypothesis (which is not the same as proving it true).
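
The decision rule on this card can be sketched in a few lines of Python. This is a minimal illustration, assuming the course's alpha of 0.05 and using the conventional wording "fail to reject"; the function name `decide` is hypothetical.

```python
ALPHA = 0.05  # significance level used in this course


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Return the hypothesis-test decision for a given p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    # A p-value at or above alpha is not evidence that the null is true.
    return "fail to reject the null hypothesis"


print(decide(0.03))
print(decide(0.20))
```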

Null Hypothesis

A statement asserting that there is no significant difference or no significant relationship between variables.

Chi-Square Analysis

A statistical test used to compare frequency counts or categories for nominal data.

T-Test

A statistical test used to compare the means or averages of exactly two groups.

ANOVA (Analysis of Variance)

A statistical test used to compare the means or averages of three or more groups.

Independent Variable (X)

The predictor or input variable used in a regression model to predict an outcome.

Dependent Variable (Y)

The outcome or response variable being predicted in a regression model.

Linear Regression

A statistical tool that uses one independent variable to predict one numeric dependent variable.

Multiple Regression

A statistical tool that uses two or more independent variables to predict a single dependent variable.

Logistic Regression

A statistical tool used to predict a dependent variable that is binary or nominal, such as yes/no or pass/fail.

R-Squared

A measure of the goodness of fit for a regression model, indicating the proportion of variation in Y that is explained by X, with values closer to $1$ indicating a stronger fit.
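
To make the idea concrete, here is a small sketch that fits a one-variable least-squares line and computes R-squared as 1 minus the ratio of unexplained to total variation in Y. The data points and the function name `r_squared` are made up for illustration.

```python
def r_squared(x, y):
    """R-squared of the least-squares line that predicts y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the fitted line
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total variation of y around its mean
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: X = ad spend, Y = sales
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(r_squared(x, y), 3))
```

A value near 1 would mean the line explains most of the variation in Y; a value near 0 would mean it explains very little.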

Homoscedasticity

A condition where the scatter plot shows consistent variation or spread of data points around the trend line.

Heteroscedasticity

A condition where the scatter plot shows changing or unequal variation, often forming an ice-cream-cone shape.

Decision Tree Analysis

A tool used to choose between alternatives under risky or uncertain conditions by identifying the highest expected value.

Expected Value

A weighted outcome value calculated by combining payoffs and their respective probabilities.
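
As a sketch of how expected value drives a decision tree choice, the snippet below multiplies each payoff by its probability, sums the results for each alternative, and picks the highest. The alternatives, probabilities, and payoffs are hypothetical.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs for one alternative."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * payoff for p, payoff in outcomes)


# Hypothetical alternatives in a decision tree
alternatives = {
    "launch now": [(0.6, 100_000), (0.4, -50_000)],
    "wait a year": [(1.0, 20_000)],
}
values = {name: expected_value(o) for name, o in alternatives.items()}
best = max(values, key=values.get)
print(values, best)
```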

Linear Programming

A mathematical method used to find an optimal solution (maximize or minimize) while meeting specific constraints.

Break-Even Analysis

A method used to determine the point where total revenue equals total cost and profit begins.
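
The card does not give a formula, so the sketch below assumes the standard one: break-even units equal fixed cost divided by (price per unit minus variable cost per unit). All figures are hypothetical.

```python
def break_even_units(fixed_cost, price_per_unit, variable_cost_per_unit):
    """Units at which total revenue equals total cost."""
    contribution_margin = price_per_unit - variable_cost_per_unit
    if contribution_margin <= 0:
        raise ValueError("price must exceed variable cost to break even")
    return fixed_cost / contribution_margin


# Hypothetical product: $10,000 fixed cost, $25 price, $15 variable cost
units = break_even_units(10_000, 25, 15)
print(units)
```

At that volume, revenue and total cost are equal; every unit sold beyond it contributes to profit.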

Cross-Over Analysis

A method used to compare the total cost of alternatives across usage volumes to find the cross-over point at which one option becomes cheaper than the other.
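
One common way to set this up, assumed here rather than taken from the card, is to give each alternative a fixed cost and a variable cost per unit and solve for the volume at which the two total costs are equal. The option names and costs are hypothetical.

```python
def total_cost(fixed_cost, variable_cost_per_unit, volume):
    """Total cost of one alternative at a given usage volume."""
    return fixed_cost + variable_cost_per_unit * volume


def crossover_volume(fixed_a, variable_a, fixed_b, variable_b):
    """Volume at which alternatives A and B cost the same."""
    if variable_a == variable_b:
        raise ValueError("equal variable costs: the cost lines never cross")
    return (fixed_b - fixed_a) / (variable_a - variable_b)


# Hypothetical options: A is cheap to set up, B is cheap to run
volume = crossover_volume(1_000, 5, 4_000, 2)
print(volume)
```

Below the cross-over volume the option with the lower fixed cost is cheaper; above it, the option with the lower variable cost wins.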

Cluster Analysis

A method used to group similar observations or customers together, often used for market segmentation.

Monte Carlo Simulation

A simulation technique that uses many random outcomes to model uncertainty and forecast possible results.
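
A minimal sketch of the idea: draw many random scenarios, compute the outcome for each, and summarize the results. The demand distribution, margin range, and fixed cost below are invented for illustration, and the seed is fixed only so the run is repeatable.

```python
import random
import statistics


def simulate_profit(n_trials=10_000, seed=42):
    """Simulate profit under uncertain demand and unit margin."""
    rng = random.Random(seed)
    profits = []
    for _ in range(n_trials):
        demand = rng.gauss(1000, 200)    # hypothetical: mean 1000, sd 200
        unit_margin = rng.uniform(8, 12)  # hypothetical: $8 to $12 per unit
        profits.append(demand * unit_margin - 9_000)  # hypothetical fixed cost
    return profits


profits = simulate_profit()
mean_profit = statistics.mean(profits)
loss_probability = sum(p < 0 for p in profits) / len(profits)
print(round(mean_profit), round(loss_probability, 2))
```

The spread of simulated profits, not just the average, is what the forecast is based on: here the share of trials with a loss estimates the risk of losing money.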

Probability

The chance that an event occurs, calculated as favorable outcomes divided by total possible outcomes.

Complement

The probability that an event does not occur, calculated as $1 - P(A)$.

Intersection

The probability that two events happen together (AND); for independent events, it is calculated by multiplying their individual probabilities.

Union

The probability that one event or the other occurs (OR), calculated by adding the individual probabilities and subtracting any overlap.
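
The complement, intersection, and union rules can be sketched together. The intersection helper assumes the two events are independent, which is what makes simple multiplication valid; the probabilities are hypothetical.

```python
def complement(p_a):
    """P(not A) = 1 - P(A)."""
    return 1 - p_a


def intersection_independent(p_a, p_b):
    """P(A and B), valid only when A and B are independent."""
    return p_a * p_b


def union(p_a, p_b, p_a_and_b):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


# Hypothetical independent events
p_a, p_b = 0.5, 0.4
p_both = intersection_independent(p_a, p_b)
p_either = union(p_a, p_b, p_both)
print(complement(p_a), p_both, round(p_either, 2))
```

Subtracting the overlap in `union` prevents outcomes where both events occur from being counted twice.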

Conditional Probability (Bayes' Theorem)

The probability of an event occurring given that another event is already known to have happened.
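
Bayes' theorem computes a conditional probability as P(A | B) = P(B | A) × P(A) / P(B). The sketch below applies it to a hypothetical inspection example; the prior and test accuracy figures are invented.

```python
def bayes_posterior(prior, p_evidence_given_true, p_evidence_given_false):
    """P(A | B) from P(A), P(B | A), and P(B | not A)."""
    # Total probability of seeing the evidence B
    p_evidence = (p_evidence_given_true * prior
                  + p_evidence_given_false * (1 - prior))
    return p_evidence_given_true * prior / p_evidence


# Hypothetical: 1% of parts are defective; the test flags 90% of defective
# parts and 5% of good parts.
posterior = bayes_posterior(0.01, 0.90, 0.05)
print(round(posterior, 3))
```

Even with a fairly accurate test, the probability that a flagged part is truly defective stays low here because defects are rare to begin with.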

Combination

A counting technique used to determine how many possible groups can be formed from a set of items where order does not matter.
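
For a quick check of the counting, Python's standard library can both count and list combinations. The team names below are hypothetical.

```python
import math
from itertools import combinations

team = ["Ana", "Ben", "Cy", "Dee", "Eli"]  # hypothetical pool of 5 people

# Number of 2-person groups, where order does not matter
n_groups = math.comb(len(team), 2)

# The groups themselves; ("Ana", "Ben") and ("Ben", "Ana") count once
groups = list(combinations(team, 2))

print(n_groups, len(groups))
```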

Mean

The arithmetic average calculated by adding all values and dividing by the count; it is sensitive to outliers.

Median

The middle value in a sorted dataset, representing the $50^{\text{th}}$ percentile.

Mode

The value that appears most frequently in a dataset.
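
The three measures of center can be compared on one small dataset using Python's `statistics` module. The values are hypothetical and include one outlier to show how it affects the mean.

```python
import statistics

data = [2, 3, 3, 5, 7, 100]  # hypothetical values; 100 is an outlier

mean = statistics.mean(data)      # pulled upward by the outlier
median = statistics.median(data)  # average of the two middle values
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

The outlier drags the mean far above most of the data, while the median and mode stay close to the typical values.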

Standard Deviation

A measure of data spread or volatility, calculated as the square root of the variance.

Empirical Rule

A rule for normal distributions stating that approximately $68\%$, $95\%$, and $99.7\%$ of data fall within $1$, $2$, and $3$ standard deviations of the mean, respectively.

Z-Score

A standardized score that indicates how many standard deviations a specific value is above or below the mean.
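
A short sketch ties the z-score to the empirical rule: the z-score is (value − mean) / standard deviation, and the empirical-rule bands are the mean plus or minus 1, 2, and 3 standard deviations. The mean, standard deviation, and score are hypothetical.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


def empirical_ranges(mean, std_dev):
    """Ranges covering roughly 68%, 95%, and 99.7% of normal data."""
    return {k: (mean - k * std_dev, mean + k * std_dev) for k in (1, 2, 3)}


# Hypothetical exam: mean 70, standard deviation 10
print(z_score(85, 70, 10))
print(empirical_ranges(70, 10))
```

A positive z-score means the value is above the mean and a negative one means it is below.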

Interquartile Range (IQR)

The distance between the third quartile ($Q3$) and the first quartile ($Q1$), representing the middle $50\%$ of data.

Box Plot

A chart displaying the five-number summary (minimum, $Q1$, median, $Q3$, and maximum), along with potential outliers.
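
The numbers behind a box plot, and the IQR, can be computed with Python's `statistics.quantiles`. This sketch uses the "inclusive" quartile method as an assumption; textbooks and tools use several quartile conventions, so Q1 and Q3 may differ slightly elsewhere. The data is hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # "inclusive" treats the data as the full population; other
    # conventions can give slightly different quartiles.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


data = [2, 4, 6, 8, 10, 12, 14, 16, 18]  # hypothetical sample
minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1  # spread of the middle 50% of the data
print(minimum, q1, median, q3, maximum, iqr)
```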

PDCA (Deming Cycle)

A four-stage systematic cycle for solving quality problems: Plan (investigate), Do (test/pilot), Check (measure), and Act (standardize).

SIPOC

A high-level process map identifying the Supplier, Input, Process, Output, and Customer.

Quality Assurance (QA)

Proactive activities focused on preventing defects before they occur, such as employee training.

Quality Control (QC)

Reactive activities focused on identifying and fixing defects after they occur, such as inspection and repair.

Run Chart

A line graph showing performance or data points over a specific period of time.

Control Chart

A run chart that includes upper and lower control limits to determine if process variation is within an acceptable range.

Common Cause Variation

Normal, routine noise or variation that occurs within expected control limits.

Special Cause Variation

Unusual or outlier variation that falls outside of expected control limits.

Cause and Effect Diagram

Also known as a fishbone or Ishikawa diagram; used to brainstorm the 'why' behind a problem.

Flow Chart

A diagram showing the step-by-step sequence of a process to identify 'where' a failure might be occurring.

Check Sheet

A simple tool used to collect and tally frequency data.

Histogram

A chart showing the distribution of numerical data across specific ranges or bins.

Pareto Chart

A bar graph that ranks categories from highest to lowest frequency to help prioritize problem-solving efforts.

Scatter Diagram

A visual tool used to show the relationship between two variables using dots on an X-Y chart.

Lean

A quality management program focused primarily on the reduction of waste and improvement of efficiency.

Six Sigma

A quality management program focused on reducing process variation and defects.

Just-in-Time (JIT)

An operational approach focused on reducing inventory levels by receiving materials only when needed.

Results-Based Management (RBM)

An ongoing monitoring framework focused on achieving intended results, outcomes, and long-term impact, often used in nonprofits.

Index Number

A comparison of a current value relative to a base period, often expressed as $\frac{\text{current}}{\text{base}} \times 100$.
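
The formula on this card translates directly into code. The prices below are hypothetical.

```python
def index_number(current, base):
    """Current value relative to the base period, with base = 100."""
    if base == 0:
        raise ValueError("base value must be non-zero")
    return current / base * 100


# Hypothetical prices: $80 in the base year, $100 now
print(index_number(100, 80))
```

An index above 100 means the value has risen relative to the base period; below 100 means it has fallen.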

Incidence

A metric representing the number of new cases of a disease or event within a specific time period.

Prevalence

A metric representing the total number of existing cases in a population at a specific point in time.

Observed Score

A score calculated as the sum of the True Score plus any random and systematic error.

Criterion-Referenced Score

A score compared against a fixed standard or 'cut score' rather than against other individuals.

Key Performance Indicator (KPI)

A single, specific metric used to measure progress toward an organizational goal.

SMART Goal

A goal that is Specific, Measurable, Achievable, Relevant, and Time-bound.

Dashboard

A visual display that allows managers to monitor multiple important metrics and KPIs at a glance in one place.

Balanced Scorecard

A strategic framework viewing performance from four perspectives: Financial, Customer, Internal Process, and Learning and Growth.

Net Promoter Score (NPS)

A metric used to measure customer loyalty based on their willingness to recommend a company to others.
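
The card does not state how the score is calculated. The sketch below assumes the widely used convention: on a 0 to 10 "how likely are you to recommend us" scale, respondents scoring 9 or 10 are promoters, 0 to 6 are detractors, and NPS is the percentage of promoters minus the percentage of detractors. The survey responses are hypothetical.

```python
def net_promoter_score(scores):
    """NPS from 0-10 survey scores: % promoters minus % detractors."""
    if not scores:
        raise ValueError("at least one response is required")
    promoters = sum(1 for s in scores if s >= 9)   # assumed cutoff: 9 or 10
    detractors = sum(1 for s in scores if s <= 6)  # assumed cutoff: 0 to 6
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical survey responses
responses = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]
print(net_promoter_score(responses))
```

Scores of 7 and 8 count toward the total number of responses but are neither promoters nor detractors, so the result can range from -100 to 100.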