Flashcards covering key vocabulary and concepts in data analytics fundamentals.
Data
A collection of facts.
Quantitative Data
Numerical data that is either discrete (counted) or continuous (measured).
Qualitative Data
Data that is descriptive, such as names or colors.
Triple Bottom Line (TBL)
A framework with a focus on people, planet, and profit.
Raw Data
Unprocessed data such as Airbnb host IDs, neighborhood data, or review data.
Data Preprocessing
Preparing raw data for analysis through integration, transformation, and loading.
Label Encoding
Encoding categorical classes as integers; use with caution when the classes have no order relationship, because the integers imply a ranking.
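
A minimal sketch of label encoding in plain Python. The helper name `label_encode` and the color values are made up for illustration, and assigning integers in sorted order of the class names is an assumption, not something the card specifies.

```python
def label_encode(values):
    """Map each distinct class to an integer, in sorted order of the class names."""
    classes = sorted(set(values))
    mapping = {cls: idx for idx, cls in enumerate(classes)}
    return [mapping[v] for v in values], mapping


# Hypothetical example data.
colors = ["red", "green", "blue", "green"]
codes, mapping = label_encode(colors)
# The integers suggest blue < green < red, an order the colors do not have.
# That implied ranking is the reason for the caution on this card.
```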
One-Hot Encoding
Encoding each class as its own binary (0/1) column; recommended for classes with no clear ranking.
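
One-hot encoding can be sketched in plain Python as follows. The helper name `one_hot_encode` and the sample values are illustrative assumptions; the column order (sorted class names) is also just a choice made here.

```python
def one_hot_encode(values):
    """Return one 0/1 column per class, so no class outranks another."""
    classes = sorted(set(values))
    rows = [[1 if v == cls else 0 for cls in classes] for v in values]
    return rows, classes


# Hypothetical example data.
rows, classes = one_hot_encode(["red", "green", "blue", "green"])
# Each row has exactly one 1, in the column of that row's class.
```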
Scaling & Normalizing
Scaling adjusts data to a specific range (e.g., min-max scaling); normalizing adjusts the shape of the distribution.
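
A small sketch of min-max scaling, the example named on this card. The function name `min_max_scale`, the default target range of 0 to 1, and the price values are assumptions for illustration.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


# Hypothetical nightly prices.
prices = [50, 100, 150, 250]
scaled = min_max_scale(prices)
```

Min-max scaling changes the range but keeps the relative spacing of the values, so it does not by itself change the shape of the distribution.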
K-means
A clustering method that groups points into K clusters by minimizing the sum of the squared distances between each point and its cluster centroid.
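
The sum-of-squared-distances idea can be sketched with a bare-bones version of the usual assign-then-update loop. All function names, the starting centroids, and the sample points are assumptions for illustration; practical implementations also choose starting centroids more carefully.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep an empty cluster's centroid in place
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids


def inertia(points, labels, centroids):
    """Sum of squared distances between each point and its cluster centroid."""
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))


def k_means(points, centroids, max_iter=100):
    for _ in range(max_iter):
        new_centroids = update(points, assign(points, centroids), centroids)
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    return assign(points, centroids), centroids


# Hypothetical 2-D points and starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
labels, centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
total = inertia(points, labels, centroids)
```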
Regularization
A technique to reduce overfitting by penalizing model complexity, such as large coefficients.
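
As one possible illustration, the sketch below uses an L2 (ridge) penalty on a one-feature linear model with no intercept. The card does not name a specific method, so the choice of ridge, the function name `ridge_slope`, and the data are all assumptions. The slope minimizes sum((y - w*x)^2) + penalty * w^2.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form slope w minimizing sum((y - w*x)**2) + penalty * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
unregularized = ridge_slope(xs, ys, penalty=0.0)
regularized = ridge_slope(xs, ys, penalty=14.0)
# A larger penalty shrinks the slope toward zero, trading some fit
# on the training data for a simpler model.
```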
K-fold Cross Validation
A process that splits the data into K equal folds, trains on K-1 folds and tests on the remaining one, and repeats K times so that each fold serves as the test set once.
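
The splitting step can be sketched as a generator of index lists. The name `k_fold_splits` is made up for illustration, and the sketch assumes no shuffling; when the sample count is not divisible by K, the first folds get one extra sample.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Six samples, three folds: each sample is tested exactly once.
splits = list(k_fold_splits(6, 3))
```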
Descriptive Analysis
Summarizes what the data shows, including total units, distributions, and price statistics.
Diagnostic Analysis
Examines why something happened; involves steps such as removing outliers.
Predictive Analysis
Using regression or classification to make predictions.
Recommendation Systems
Systems that use an item matrix and a user matrix to find the preferences of users with similar tastes.
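
A toy sketch of the idea, assuming each user and each item is described by a short vector of latent factors. The user names, listing names, factor values, and all function names are hypothetical. A predicted preference is taken here to be the dot product of a user row and an item row, and users with similar taste are found with cosine similarity.

```python
import math

# Hypothetical factor matrices: one row per user and one row per item.
user_matrix = {"ana": [1.0, 0.0], "ben": [0.9, 0.1], "cy": [0.0, 1.0]}
item_matrix = {"loft": [5.0, 1.0], "cabin": [1.0, 5.0]}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def predicted_score(user, item):
    """Estimated preference of a user for an item."""
    return dot(user_matrix[user], item_matrix[item])


def cosine_similarity(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def most_similar_user(user):
    """The other user whose factor vector points in the closest direction."""
    others = [o for o in user_matrix if o != user]
    return max(others, key=lambda o: cosine_similarity(user_matrix[user], user_matrix[o]))


def top_item(user):
    """The item with the highest predicted score for this user."""
    return max(item_matrix, key=lambda item: predicted_score(user, item))
```

With these made-up values, `most_similar_user("ana")` picks `"ben"`, and both of them get `"loft"` as their top item.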