A set of flashcards designed to help students prepare for their Data Science exam, covering key concepts and definitions.
What is Data Science?
A multidisciplinary field that uses various techniques, algorithms, processes, and systems to extract insights and knowledge from structured and unstructured data.
What are the main types of data?
Quantitative data (measurable quantities) and Qualitative data (characteristics and descriptors that cannot be easily measured).
What is the purpose of data preprocessing?
To clean, transform, and integrate data to make it suitable for analysis, ensuring high data quality.
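One common cleaning step is imputing missing values. A minimal sketch in plain Python, using a hypothetical list with gaps (the data and the mean-imputation choice are illustrative, not the only approach):

```python
# Hypothetical raw column with missing entries (None).
raw = [3.0, None, 5.0, None, 7.0]

# Impute each missing value with the mean of the observed values.
observed = [v for v in raw if v is not None]
mean = sum(observed) / len(observed)
cleaned = [v if v is not None else mean for v in raw]
```

In practice, libraries such as pandas provide this directly, but the idea is the same: replace gaps with a value derived from the data so downstream analysis does not fail on missing entries.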
Define supervised learning.
A type of machine learning where algorithms learn from labeled input data to make predictions or decisions.
What is the difference between accuracy and precision in model evaluation?
Accuracy is the ratio of correctly predicted observations to total observations, while precision is the ratio of correctly predicted positive observations to all predicted positives.
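The two ratios can be computed directly from a toy set of predictions (the labels below are made up for illustration):

```python
# Hypothetical ground truth and model predictions for a binary task.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

accuracy = correct / len(y_true)   # all correct / all predictions
precision = tp / (tp + fp)         # true positives / predicted positives
```

Here accuracy is 5/8 = 0.625 while precision is 3/5 = 0.6, showing the two metrics answer different questions.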
What is the purpose of a confusion matrix?
To visualize the performance of a classification algorithm by showing true positives, true negatives, false positives, and false negatives.
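The four cells can be counted by hand for a small example (same hypothetical labels as above, arranged in the usual [[TN, FP], [FN, TP]] layout):

```python
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

# Rows = actual class, columns = predicted class.
matrix = [[tn, fp],
          [fn, tp]]
```

This matches the layout returned by `sklearn.metrics.confusion_matrix` for binary labels.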
Explain exploratory data analysis (EDA).
A process to analyze datasets to summarize their main characteristics, often using visual methods.
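A first EDA step is usually a summary of each column. A minimal sketch with pandas, on a hypothetical three-row dataset:

```python
import pandas as pd

# Illustrative toy dataset; column names are made up.
df = pd.DataFrame({"age": [20, 30, 40],
                   "income": [30000, 50000, 70000]})

# describe() reports count, mean, std, min, quartiles, and max per column.
summary = df.describe()
```

Visual EDA (histograms, scatter plots, box plots) typically follows these numeric summaries.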
What is feature scaling?
A technique used to standardize the range of independent variables or features of data, essential for algorithms sensitive to the scale of data.
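The two most common scalings, min-max normalization and standardization (z-scores), can be written out in plain Python on illustrative values:

```python
values = [10.0, 20.0, 30.0, 40.0]

# Min-max scaling: map the feature into the [0, 1] range.
lo, hi = min(values), max(values)
minmax = [(v - lo) / (hi - lo) for v in values]

# Standardization: zero mean, unit (population) standard deviation.
mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
standardized = [(v - mean) / std for v in values]
```

Distance-based algorithms such as k-nearest neighbors and gradient-based methods are the typical cases where scaling matters.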
What is the significance of the F1 score?
The F1 score is the harmonic mean of precision and recall, providing a balance between the two metrics, particularly in imbalanced datasets.
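The harmonic mean is easy to verify by hand; using the same hypothetical counts as the earlier cards (TP = 3, FP = 2, FN = 1):

```python
tp, fp, fn = 3, 2, 1

precision = tp / (tp + fp)   # 0.6
recall = tp / (tp + fn)      # 0.75
f1 = 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot score well on F1 by excelling at only one of precision or recall.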
What is hyperparameter tuning?
The process of choosing a set of optimal hyperparameters for a learning algorithm to improve the model's performance.
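The simplest form is a grid search: try each candidate value, score it on held-out data, and keep the best. A minimal sketch with a hypothetical single hyperparameter (a decision threshold on model scores; all data below is made up):

```python
# Hypothetical validation scores from a model, with true labels.
val_scores = [0.1, 0.3, 0.55, 0.8, 0.65, 0.9]
val_labels = [0, 0, 1, 1, 1, 1]

def accuracy(threshold):
    """Validation accuracy when predicting 1 for scores >= threshold."""
    preds = [int(s >= threshold) for s in val_scores]
    return sum(p == y for p, y in zip(preds, val_labels)) / len(val_labels)

# Grid search: evaluate each candidate and keep the best.
candidates = [0.2, 0.5, 0.7]
best = max(candidates, key=accuracy)
```

Real workflows use the same idea with cross-validation, e.g. `GridSearchCV` or randomized search in scikit-learn.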
What is the role of decision trees in machine learning?
Decision trees are used for classification and regression tasks, providing a model that predicts the value of a target variable based on several decision rules inferred from the data features.
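A short scikit-learn sketch on a hypothetical four-point dataset, where the class is fully determined by the second feature so the tree only needs one decision rule:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy dataset: [feature_1, feature_2] -> class; class follows feature_2.
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

pred = clf.predict([[0, 1]])[0]
```

On this separable data the fitted tree splits on the second feature and classifies the sample as class 1.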
Define principal component analysis (PCA).
A dimensionality reduction technique used to reduce the number of features in a dataset while preserving as much variance as possible.
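The core computation is an eigendecomposition of the covariance matrix of centered data. A sketch with NumPy on a hypothetical 2-D dataset whose variance is concentrated along the first axis:

```python
import numpy as np

# Toy 2-D data: more spread along axis 0 than axis 1 (illustrative).
X = np.array([[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]])

Xc = X - X.mean(axis=0)                 # 1. center the data
cov = Xc.T @ Xc / (len(Xc) - 1)         # 2. sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # 3. eigendecomposition (ascending)
order = np.argsort(eigvals)[::-1]       # 4. sort components by variance
components = eigvecs[:, order]
projected = Xc @ components[:, :1]      # 5. keep the top principal component
```

Each eigenvalue is the variance captured by its component, so keeping the top-k components preserves the largest share of total variance.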