Vocabulary flashcards for key terms and definitions from the Introduction to Data Science course book.
Data Science
The combination of business, analytical, and programming skills that are used to extract meaningful insights from raw data.
Deep Learning
The application of computational networks (with cascading layers of units) to learning tasks.
Artificial Intelligence
A set of approaches to enable a computer to emulate and thus automate cognitive processes — often based on learning from data.
Machine Learning
A subset of artificial intelligence where mathematical models are developed to perform given tasks based on provided training examples.
Data Mining
The process of discovering patterns in large datasets.
Business Intelligence
A collection of routines used to analyze and deliver business performance metrics.
Training Set
The dataset from which the machine learning model learns its desired task.
Testing Set
The dataset used to measure the performance of the developed machine learning model.
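A minimal sketch of how a dataset might be divided into a training set and a testing set. The helper name `train_test_split`, the 25% test fraction, and the fixed seed are illustrative assumptions, not something the course book prescribes.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into (training, testing) lists.

    Hypothetical helper for illustration; the fraction and seed are assumptions.
    """
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(20))                  # stand-in for 20 data records
train_set, test_set = train_test_split(records)
print(len(train_set), len(test_set))
```

Keeping the two sets disjoint is the point: performance measured on records the model has already seen would be overly optimistic.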
Outlier
A data record which is seen as exceptional and outside the distribution of the normal input data.
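One common heuristic for flagging outliers is the interquartile range (IQR) rule, sketched below. This is an assumed example technique, not the course book's definition, and the sample values and the conventional 1.5 multiplier are illustrative.

```python
import statistics

# Illustrative measurements; 95 is far from the rest of the values.
values = [10, 12, 11, 13, 12, 11, 95]

# Quartiles from the standard library (default "exclusive" method).
q1, _median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Conventional IQR fences: anything beyond 1.5 * IQR from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(outliers)
```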
Data Cleansing
The process of removing redundant data, handling missing data entries and removing, or at least alleviating, other data quality issues.
Feature
An observable measure of the data. Other terms such as property, attribute, or characteristic are also used instead of feature.
Dimensionality Reduction
The process of reducing a dataset to fewer dimensions while ensuring that it still conveys similar information.
Feature Selection
The process of selecting relevant features of the provided dataset.
Machine Learning
Algorithms or mathematical models that use information extracted from data in order to achieve a desired task or function.
Supervised Learning
The subset of Machine Learning that is based on labeled data. It can be further divided into regression and classification.
Unsupervised Learning
The subset of Machine Learning that is based on unlabeled data. Typical unsupervised learning tasks are clustering and dimensionality reduction.
Deep Learning
The application of networks of computational units, arranged in cascading layers of information processing, to learning tasks.
Decision Model
A model that assesses the relationships between the elements of provided data to recommend a possible decision for a given situation.
Regression
A forecasting technique that estimates the functional dependence between input and output variables.
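As a minimal sketch, the simplest case of regression is fitting a straight line y = slope * x + intercept by ordinary least squares. The data points below are made up for illustration, and the linear form is an assumption; regression in general covers other functional forms too.

```python
# Illustrative input/output pairs that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least squares for one input variable:
# slope = covariance(x, y) / variance(x)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(x):
    """Estimate the output for a new input using the fitted line."""
    return slope * x + intercept


print(slope, intercept, predict(5))
```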
Cluster Analysis
A type of unsupervised learning used to partition a set of data records into clusters. Records in a cluster are more similar to each other than to those in other clusters.
Classification
A machine learning approach to categorize entities into predefined classes.
Probability
Quantification of how likely it is that a certain event occurs, or the degree of belief in a given proposition.
Standard Deviation
A measure of how spread out the data values are.
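A short worked example, assuming the population form of the standard deviation (the square root of the mean squared deviation from the mean). The sample values are illustrative.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)                               # 5.0
variance = sum((x - mean) ** 2 for x in values) / len(values)  # 32 / 8 = 4.0
std_dev = math.sqrt(variance)                                  # 2.0

# The standard library's population standard deviation agrees.
print(std_dev, statistics.pstdev(values))
```

The sample standard deviation (`statistics.stdev`) divides by n - 1 instead of n and would give a slightly larger value for the same data.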
Type I Error
A false positive: a case that is actually negative but has been predicted as positive.
Type II Error
A false negative: a case that is actually positive but has been predicted as negative.
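A minimal sketch of counting both error types for a binary classifier. The labels below are made up for illustration, with 1 standing for positive and 0 for negative (an assumed encoding).

```python
# Illustrative ground truth and predictions (1 = positive, 0 = negative).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))

# Type I error: actually negative, predicted positive (false positive).
type_i_errors = sum(1 for a, p in pairs if a == 0 and p == 1)

# Type II error: actually positive, predicted negative (false negative).
type_ii_errors = sum(1 for a, p in pairs if a == 1 and p == 0)

print(type_i_errors, type_ii_errors)  # prints "2 1"
```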