Data Literacy
Understanding and interpreting data effectively.
Causality
Establishing a cause-and-effect relationship.
Association
Identifying correlations between variables.
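One common way to quantify association is the Pearson correlation coefficient. The sketch below computes it by hand; the function name `pearson_r` and the sample numbers are made up for illustration. A value near 1 or -1 indicates a strong linear association, but it says nothing about which variable, if either, causes the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Illustrative, invented data: two variables that tend to rise together
x = [0, 1, 2, 3, 4, 5]
y = [50, 52, 55, 54, 58, 60]
r = pearson_r(x, y)
print(f"r = {r:.3f}")
```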
Observational Studies
Research that observes and analyzes data without the researcher assigning or manipulating treatments.
Chocolate and Health
Example of association in health studies.
Death Penalty and Murder Rates
Example of potential causal analysis.
Public Data
Freely available datasets for experimentation.
Existing Product Data
User interaction data from current products.
Human-in-the-Loop Systems
Combining automation with human oversight.
Brute Force Collection
Costly data gathering methods for unique datasets.
Purchased Data
Acquired datasets from third-party vendors.
Filtering Impurities
Managing errors in raw data for quality.
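As a minimal sketch of filtering impurities, the snippet below drops records whose `age` field is missing or implausible. The field name, the 0 to 120 range, and the sample records are assumptions chosen for illustration; real cleaning rules depend on the dataset.

```python
def is_clean(record):
    """Keep a record only if 'age' is present and plausible (hypothetical rule)."""
    age = record.get("age")
    return isinstance(age, (int, float)) and 0 <= age <= 120

# Invented raw records containing typical impurities
raw = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},  # missing value
    {"id": 3, "age": -5},    # impossible value
    {"id": 4, "age": 61},
]

clean = [r for r in raw if is_clean(r)]
print([r["id"] for r in clean])
```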
Merging Diverse Data Sources
Integrating datasets from different origins.
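Merging usually means joining records from different origins on a shared key. The sketch below performs an inner join in plain Python; the `user_id` key and both sample datasets are hypothetical.

```python
# Two invented sources that share a 'user_id' key
users = [
    {"user_id": 1, "name": "Ana"},
    {"user_id": 2, "name": "Ben"},
]
orders = [
    {"user_id": 1, "total": 20.0},
    {"user_id": 1, "total": 5.5},
    {"user_id": 3, "total": 9.0},  # no matching user, dropped by the inner join
]

# Index one source by key, then attach its fields to each matching row
by_id = {u["user_id"]: u for u in users}
merged = [{**by_id[o["user_id"]], **o} for o in orders if o["user_id"] in by_id]

for row in merged:
    print(row)
```

Rows without a match on either side are dropped here; keeping them instead (an outer join) is a design choice that depends on how missing data should be handled downstream.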
Data Labeling
Annotating data for machine learning context.
External Services
Platforms for scalable data annotation tasks.
Internal Teams
In-house capabilities for data annotation.
User-Generated Labels
User contributions to data labeling processes.
Annotation Acceleration Tools
Technologies enhancing data annotation efficiency.
Data Science
Field combining statistics, computing, and domain knowledge to draw insight from data.
Collaboration in Data Science
Teamwork essential for solving complex data problems.
Skills of Data Scientists
Mix of statistics, programming, and domain expertise.
5 C's of Data Ethics
Consent, clarity, consistency, control, and consequences.
IBM Data Estimate
2.5 quintillion bytes of data generated daily.
Prediction in Data Science
Forecasting events based on data analysis.
Productivity Paradox
Technological shifts may delay visible economic benefits.
Big Data
Large datasets requiring responsible usage for impact.
Human Limitations
Memory and objectivity constraints in data interpretation.
Python Basics
Fundamental functions for data manipulation in Python.
NumPy
Library for numerical operations in Python.
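A short sketch of what "numerical operations" looks like in practice, assuming NumPy is installed: arithmetic applies element-wise to whole arrays, so no explicit loop is needed. The sample values are arbitrary.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

scaled = values * 10      # element-wise multiplication
mean = values.mean()      # arithmetic mean of all elements
centered = values - mean  # subtract the mean from every element

print(scaled, mean, centered)
```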
Pandas
Data manipulation library for structured data.
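As an illustration of structured-data manipulation, the sketch below builds a small `DataFrame`, filters rows, and aggregates by group. It assumes pandas is installed; the column names and numbers are invented.

```python
import pandas as pd

# Invented table: one row per sale
df = pd.DataFrame({
    "city": ["A", "B", "A", "B"],
    "sales": [10, 20, 30, 40],
})

high = df[df["sales"] > 15]                     # boolean-mask row filter
totals = df.groupby("city")["sales"].sum()      # total sales per city

print(high)
print(totals)
```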
PyPlot
Matplotlib module for creating visualizations.
Seaborn
Statistical data visualization library based on Matplotlib.
Linear Regression
Predicting continuous variables using linear relationships.
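A minimal sketch of simple (one-variable) linear regression fitted by ordinary least squares, written without libraries so the arithmetic is visible. The helper name `fit_line` and the data points are assumptions for illustration; in practice a library such as scikit-learn or statsmodels would typically be used.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of deviation products / sum of squared x deviations
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented points that lie exactly on y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept  # predicted y for a new x = 5
print(slope, intercept, prediction)
```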