These flashcards cover key terms and definitions related to data analysis and modeling concepts.
Data Cleaning
The process of detecting and fixing errors, inconsistencies, duplicates, and missing values in raw, messy data so that it ends up in a structured, usable format.
Exploratory Data Analysis (EDA)
The process of summarizing and visualizing a dataset to understand its main characteristics before formal modeling.
Imputation
A technique used to fill in missing data with a substitute value, typically the mean or median for numeric data and the mode for categorical data.
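As a minimal sketch of simple imputation, the snippet below fills missing entries (represented here as `None`) using only Python's standard library. The helper name `impute` and the sample lists are illustrative assumptions, not part of the flashcards; the mode strategy is shown on a text column because it is most often applied to categorical values.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Return a copy of `values` with None replaced by a statistic
    computed from the observed (non-missing) entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    # Pick the statistic by name; `impute` is a hypothetical helper.
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

ages = [30, None, 40, 50, None]
colors = ["red", "blue", None, "red"]

print(impute(ages, "mean"))    # [30, 40, 40, 50, 40]
print(impute(ages, "median"))  # [30, 40, 40, 50, 40]
print(impute(colors, "mode"))  # ['red', 'blue', 'red', 'red']
```

The mean is sensitive to outliers, so the median is often the safer choice for skewed numeric data.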
Univariate Analysis
Examines one variable at a time, such as computing mean or median.
Bivariate Analysis
Explores the relationship between two variables, such as correlation or scatter plots.
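To make the univariate versus bivariate distinction concrete, here is a small sketch with made-up data (`hours` and `scores` are assumptions for illustration). The univariate step summarizes one variable on its own; the bivariate step computes a Pearson correlation between the two, written out by hand so it runs on any recent Python version.

```python
from math import sqrt
from statistics import mean, median

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours = [1, 2, 3, 4, 5]          # hypothetical study hours
scores = [52, 58, 61, 70, 74]    # hypothetical test scores

# Univariate: describe one variable at a time.
print(mean(scores), median(scores))  # 63 61

# Bivariate: measure how two variables move together.
r = pearson(hours, scores)
print(round(r, 3))
```

A value of `r` close to 1 suggests a strong positive linear relationship, which a scatter plot of the same two columns would show visually.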
Multivariate Analysis
Involves three or more variables to understand complex interactions.
Descriptive Analytics
Explains what has happened, such as summarizing historical sales.
Predictive Analytics
Forecasts what could happen using historical data and models, like sales forecasting.
Prescriptive Analytics
Recommends actions based on predictions, like pricing strategies to maximize profit.
SQL JOIN
Combines rows from two or more tables based on a related column.
INNER JOIN
Returns rows with matching values in both tables.
LEFT JOIN
Returns all rows from the left table and the matched rows from the right table; where there is no match, the right table's columns are NULL.
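The difference between the two join types is easiest to see on a tiny dataset. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN: every customer; order columns are NULL (None) when unmatched.
left_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

print(inner_rows)  # [('Ada', 25.0), ('Ada', 40.0), ('Ben', 15.0)]
print(left_rows)   # [('Ada', 25.0), ('Ada', 40.0), ('Ben', 15.0), ('Cy', None)]
```

Customer `Cy` has no orders, so the row appears only in the LEFT JOIN result, with `None` standing in for SQL NULL.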
VLOOKUP
A spreadsheet function (Excel, Google Sheets) that searches for a value in the first column of a table range and returns the value in the same row from a specified column.
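For readers who think in code, the lookup logic can be approximated in a few lines of Python. This is only an analogue of the exact-match mode, not the spreadsheet function itself; the `vlookup` helper and the `price_table` data are hypothetical.

```python
def vlookup(lookup_value, table, col_index):
    """Exact-match analogue of a spreadsheet vertical lookup.

    `table` is a list of rows; `col_index` is 1-based, mirroring the
    spreadsheet convention. Returns None when nothing matches
    (a spreadsheet would show an #N/A error instead).
    """
    for row in table:
        if row[0] == lookup_value:      # search the first column only
            return row[col_index - 1]   # return the value from the same row
    return None

price_table = [
    ["SKU-1", "Widget", 2.50],
    ["SKU-2", "Gadget", 7.00],
    ["SKU-3", "Gizmo", 4.25],
]

print(vlookup("SKU-2", price_table, 3))  # 7.0
print(vlookup("SKU-9", price_table, 2))  # None
```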
Pivot Table
A tool in Excel and other spreadsheet programs that summarizes large datasets by grouping rows and aggregating values (for example, sums, counts, or averages).
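The group-and-aggregate idea behind a pivot table can be sketched in plain Python. The example below sums a hypothetical `amount` field by region (rows) and product (columns); the `sales` records are invented for illustration.

```python
from collections import defaultdict

sales = [
    {"region": "East", "product": "A", "amount": 100},
    {"region": "East", "product": "B", "amount": 50},
    {"region": "West", "product": "A", "amount": 70},
    {"region": "East", "product": "A", "amount": 30},
]

# pivot[region][product] accumulates the summed amount for that cell.
pivot = defaultdict(lambda: defaultdict(int))
for row in sales:
    pivot[row["region"]][row["product"]] += row["amount"]

for region in sorted(pivot):
    print(region, dict(pivot[region]))
# East {'A': 130, 'B': 50}
# West {'A': 70}
```

Swapping the `+=` accumulation for a count or a running list of values gives the other aggregations a spreadsheet pivot table offers.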
Subquery
A query nested within another SQL query, used in SELECT, FROM, or WHERE clauses.
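As a small illustration, the sketch below runs a subquery in a WHERE clause through Python's built-in `sqlite3` module: the inner query computes the average order amount and the outer query keeps only rows above it. The `orders` table and its values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO orders VALUES (1, 10.0), (2, 20.0), (3, 60.0);
""")

# The parenthesized SELECT is the subquery; it is evaluated to a single
# value (the average, 30.0 here) that the outer WHERE compares against.
above_avg = conn.execute("""
    SELECT id, amount
    FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
""").fetchall()

print(above_avg)  # [(3, 60.0)]
```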
Stored Procedures
Named routines of SQL code stored in the database (precompiled in many database systems), used to encapsulate complex logic so it can be reused by calling the procedure.
List Comprehension
A concise way to create lists in Python using a single expression that combines a loop and an optional filter.
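A short example makes the syntax clearer. The snippet below builds the same list twice, once with a list comprehension and once with the equivalent explicit loop; the variable names are arbitrary.

```python
numbers = [1, 2, 3, 4, 5, 6]

# List comprehension: expression, loop, and optional filter in one place.
even_squares = [n * n for n in numbers if n % 2 == 0]

# Equivalent explicit loop, for comparison.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

print(even_squares)                       # [4, 16, 36]
print(even_squares == even_squares_loop)  # True
```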
DataFrame
A two-dimensional labeled data structure in pandas, with named columns and an index for rows, akin to a spreadsheet or SQL table.
Key Performance Indicators (KPIs)
Quantifiable metrics used to evaluate the success of an organization or project.