data analytics
The process of analyzing data to extract insights
descriptive analytics
A type of data analytics that summarizes past data
predictive analytics
The process of predicting future outcomes
prescriptive analytics
The process of recommending actions based on data
ETL
Extract, Transform, Load
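The three ETL stages can be sketched in a few lines of Python. This is only an illustration: the sample CSV text, the function names `extract`, `transform`, and `load`, and the in-memory `warehouse` list are all hypothetical stand-ins for a real source and target.

```python
import csv
import io

# Hypothetical raw input; a real pipeline would read from a file or database.
RAW_CSV = "name,amount\nalice,10\nbob,\ncarol,7\n"

def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount, cast types, tidy names."""
    return [
        {"name": row["name"].title(), "amount": int(row["amount"])}
        for row in rows
        if row["amount"]
    ]

def load(rows, target):
    """Load: append the cleaned rows to the target store (a list here)."""
    target.extend(rows)
    return target

warehouse = load(transform(extract(RAW_CSV)), [])
print(warehouse)
```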
big data
Data that is too large and complex to be processed by traditional data-processing software
data mining
The process of extracting useful patterns and information from large datasets
unstructured data
Data with no predefined format or schema, for example a collection of social media posts
data warehouse
A system used for reporting and data analysis
regression analysis
A common technique used in predictive analytics
structured data
It is stored in a fixed format
data mart
A subset of a data warehouse that provides data for a specific business function
data visualization tool
An example is Tableau
data analytics process
The first step is data collection
data lake
A storage repository that holds a vast amount of raw data in its native format
descriptive analytics goal
To summarize past data
data cleaning technique
Data transformation
data analyst role
To analyze data and extract insights
business intelligence tool
An example is Power BI
data governance goal
To ensure data quality and compliance
cloud-based analytics benefit
Reduced data storage costs
data dashboard purpose
To provide a visual representation of key performance indicators
machine learning algorithm
An example is linear regression
real-time analytics advantage
Faster decision-making
missing data handling method
Replacing missing values with the mean or median
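A minimal sketch of mean and median imputation, assuming missing values are represented as `None`. The function name `impute` and the sample `ages` list are made up for illustration.

```python
from statistics import mean, median

def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

ages = [20, None, 30, 100, None]
mean_filled = impute(ages, "mean")
median_filled = impute(ages, "median")
print(mean_filled)
print(median_filled)
```

The median is less sensitive to the outlier (100) than the mean, which is one reason it is often preferred for skewed data.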
hypothesis test purpose
To make inferences about a population based on sample data
categorical variable example
Gender
exploratory data analysis goal
To summarize the main characteristics of the data
statistical test for means
T-test
p-value indication
The probability of observing data at least as extreme as the observed data, assuming the null hypothesis is true
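To make the p-value idea concrete, the sketch below uses an exact permutation test rather than a t-test (the t-distribution is not in Python's standard library). It counts how many ways of splitting the pooled data give a difference in means at least as large as the one observed. The function name and sample data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(a, b):
    """Two-sided exact permutation p-value for a difference in means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    indices = range(len(pooled))
    extreme = total = 0
    # Enumerate every way of relabelling len(a) of the pooled values as group A.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

p = permutation_p_value([1, 2, 3], [7, 8, 9])
print(p)
```

Full enumeration is only practical for small samples; larger samples would normally use random resampling or a parametric test instead.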
measure of central tendency
Mean
correlation coefficient purpose
To measure the strength and direction of the relationship between two variables
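As an illustration, the Pearson correlation coefficient can be computed directly from its definition. The function name `pearson_r` is an assumption, and the sketch assumes neither variable is constant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly increasing
negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly decreasing
print(positive, negative)
```

The sign gives the direction of the relationship and the magnitude (between 0 and 1) gives its strength.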
continuous variable example
Temperature
regression analysis purpose
To predict the value of a dependent variable based on one or more independent variables
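A minimal sketch of simple linear regression with one independent variable, fitted by ordinary least squares. The names `fit_line` and `predict` and the toy data are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict the dependent variable for a new value of x."""
    return slope * x + intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, predict(5, slope, intercept))
```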
Summarizing past data
To summarize past data
Predicting dependent variable
To predict the value of a dependent variable based on one or more independent variables
Cleaning and organizing data
To clean and organize data
Storing data securely
To store data securely
Healthcare application of data science
Predicting patient outcomes
Recommendation system in e-commerce
To predict customer preferences and suggest products
Finance application of data science
Predicting stock prices
Sentiment analysis in social media
To analyze and interpret the emotions expressed in text
Marketing application of data science
Predicting customer churn
Natural language processing (NLP)
To analyze and interpret human language
Transportation application of data science
Predicting traffic patterns
Anomaly detection in cybersecurity
To identify unusual patterns that may indicate a security breach
Retail application of data science
Predicting inventory needs
Machine learning in data science
To develop algorithms that can learn from and make predictions on data
Key principle of effective data visualization
Keeping the visualization simple and clear
Purpose of a scatter plot
To show the relationship between two numeric variables
Categorical data visualization
Bar chart
Advantage of using a dashboard
It allows for interactive exploration of data
Data storytelling
The practice of using data visualizations to convey a narrative
Purpose of a heat map
To display data density or intensity
Common tool for creating data visualizations
Tableau
Goal of data communication
To effectively convey insights and findings from data analysis
Example of a time series visualization
Line chart
Purpose of a pie chart
To display the proportion of categories within a whole
Type of supervised learning
Regression
Goal of unsupervised learning
To find hidden patterns or intrinsic structures in data
Example of a classification algorithm
Decision tree
Purpose of a confusion matrix
To evaluate the performance of a classification model
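For a binary classifier, a confusion matrix is just four counts. The sketch below assumes labels are encoded as 1 (positive) and 0 (negative); the function names and sample labels are hypothetical.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(1, 1)],  # actual 1, predicted 1
        "fn": counts[(1, 0)],  # actual 1, predicted 0
        "fp": counts[(0, 1)],  # actual 0, predicted 1
        "tn": counts[(0, 0)],  # actual 0, predicted 0
    }

def accuracy(cm):
    """Share of all predictions that were correct."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())

cm = confusion_matrix([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(cm, accuracy(cm))
```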
Common technique in dimensionality reduction
Principal component analysis (PCA)
Advantage of ensemble methods
They improve the accuracy and robustness of predictions
Example of a clustering algorithm
K-means
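A toy sketch of the K-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its cluster), restricted to one-dimensional data for brevity. The starting centroids are supplied by the caller here, which is an assumption; real implementations usually pick them randomly or with a seeding heuristic.

```python
def k_means_1d(points, centroids, iterations=10):
    """Run a fixed number of K-means iterations on 1-D points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = k_means_1d([1, 2, 3, 10, 11, 12], [1, 12])
print(centroids, clusters)
```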
Purpose of cross-validation
To evaluate the performance of a model
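One common form is k-fold cross-validation, where each fold takes a turn as the held-out test set. The sketch below only generates the index splits (without shuffling); the function name `k_fold_indices` is an assumption.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds

folds = k_fold_indices(5, 2)
for train, test in folds:
    print("train:", train, "test:", test)
```

A model would be trained on each `train` set and scored on the matching `test` set, and the scores averaged.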
Characteristic of a neural network
It is loosely inspired by the structure and function of the human brain
Goal of reinforcement learning
To learn optimal actions through trial and error
Common evaluation metric for regression models
Mean Squared Error (MSE)
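MSE is the average of the squared differences between actual and predicted values. A short sketch, with a hypothetical function name and sample values:

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

mse = mean_squared_error([3, 5, 7], [2, 5, 9])  # errors: 1, 0, -2
print(mse)
```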
Purpose of feature scaling
To normalize the range of independent variables
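Two widely used scaling schemes are min-max scaling (to the range 0 to 1) and standardization (zero mean, unit variance). The sketch below shows both; the function names are assumptions, and constant columns are mapped to zeros to avoid dividing by zero.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

scaled = min_max_scale([10, 20, 30])
z_scores = standardize([10, 20, 30])
print(scaled, z_scores)
```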
Example of a supervised learning algorithm
Support vector machine (SVM)
Predict continuous values
To predict continuous values
Classify data
To classify data into distinct categories
Reduce dimensionality
To reduce the dimensionality of data
Visualize data
To visualize data
Prevent overfitting
Common method for preventing overfitting in machine learning models
Learning rate
To control how much the model's weights are adjusted with respect to the loss gradient
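The effect of the learning rate can be seen with plain gradient descent on a one-parameter loss. The loss `(w - 3) ** 2` and the function name are chosen purely for illustration.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize loss(w) = (w - 3) ** 2 with fixed-step gradient descent."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3)          # derivative of the loss at w
        w -= learning_rate * gradient   # step size is scaled by the learning rate
    return w

converged = gradient_descent(0.1)   # small steps approach the minimum at w = 3
diverged = gradient_descent(1.1)    # steps that are too large overshoot and blow up
print(converged, diverged)
```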
Unsupervised learning algorithm
Example of an unsupervised learning algorithm is K-means clustering
Random forest advantage
It reduces the risk of overfitting by averaging multiple decision trees
Activation function
A common activation function used in neural networks is the sigmoid
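The sigmoid maps any real number into the interval from 0 to 1. The sketch below uses a branch on the sign of the input so that very large negative inputs do not overflow `exp`.

```python
from math import exp

def sigmoid(x):
    """Numerically stable logistic sigmoid: 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)  # safe: x is negative, so exp(x) cannot overflow
    return z / (1.0 + z)

print(sigmoid(0), sigmoid(2), sigmoid(-2))
```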
Validation set purpose
To tune the model's hyperparameters
Feature selection technique
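A technique often cited for feature selection is principal component analysis (PCA), although PCA strictly performs feature extraction: it builds new combined components rather than selecting a subset of the original features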
Convolutional neural network goal
To analyze and interpret images
Reinforcement learning algorithm
Example of a reinforcement learning algorithm is Q-learning
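The core of tabular Q-learning is a single update rule applied after each step. In this sketch the state and action names, the nested-dict Q-table, and the default `alpha` (learning rate) and `gamma` (discount factor) values are all hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * best value of next_state."""
    best_next = max(q[next_state].values())
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]

# Hypothetical two-state Q-table with two actions per state.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
updated = q_update(q_table, "s0", "right", reward=1.0, next_state="s1")
print(updated)
```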
Dropout purpose
To prevent overfitting by randomly dropping units during training
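A minimal sketch of "inverted" dropout on a list of activations: each unit is zeroed with probability `p_drop` during training, and the survivors are scaled up so the expected sum is unchanged. The function signature is an assumption, not any framework's API.

```python
import random

def dropout(activations, p_drop, rng, training=True):
    """Randomly zero units during training; pass through unchanged otherwise."""
    if not training or p_drop == 0:
        return list(activations)
    keep = 1.0 - p_drop
    # Kept units are divided by the keep probability (inverted dropout).
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded for repeatability
train_out = dropout([1.0] * 1000, 0.5, rng)
eval_out = dropout([1.0, 2.0, 3.0], 0.5, rng, training=False)
print(sum(v != 0 for v in train_out), eval_out)
```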
Evaluation metric for classification
A common evaluation metric for classification models is accuracy
Support vector machine advantage
It works well with high-dimensional data
Handling imbalanced datasets
A common technique for handling imbalanced datasets is oversampling the minority class
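Random oversampling can be sketched as duplicating randomly chosen minority-class rows until every class matches the size of the largest one. The function name and sample data below are hypothetical.

```python
import random
from collections import Counter

def oversample(rows, labels, rng):
    """Duplicate random rows of smaller classes until all classes are equal in size."""
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))  # sample with replacement
            out_labels.append(label)
    return out_rows, out_labels

rows = ["a", "b", "c", "d", "e"]
labels = [0, 0, 0, 0, 1]
new_rows, new_labels = oversample(rows, labels, random.Random(0))
print(Counter(new_labels))
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.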
Deep learning framework
Example of a deep learning framework is TensorFlow
Generative adversarial network goal
To generate new data samples that are similar to the training data
Model selection technique
A common technique for model selection in machine learning is cross-validation
Learning curve purpose
To show how a model's performance changes as training progresses or as the training set grows
Boosting algorithm example
Example of a boosting algorithm is AdaBoost
Hyperparameter tuning goal
To optimize the performance of a model
Reducing overfitting technique
A common technique for reducing overfitting in neural networks is dropout
ROC curve purpose
To evaluate the performance of a classification model
Regularization technique example
Example of a regularization technique is Lasso regression
Convolutional neural network advantage
It is specifically designed to process grid-like data such as images
Evaluation metric for clustering
Common evaluation metric for clustering algorithms is Silhouette score