Data Science and Engineering Concepts

Description and Tags

This flashcard set covers key vocabulary and concepts related to data science and engineering, including definitions of important terms and methodologies.

16 Terms

1

Data Science

An interdisciplinary field that uses statistical techniques and algorithms to uncover hidden patterns and trends in large amounts of data.

2

5Vs of Big Data

Volume, Variety, Veracity, Validity, and Velocity: key characteristics that describe the challenges of handling big data. Many sources list Value in place of Validity.

3

Structured Data

Data that is well-defined and typically stored in tabular formats with a clear relationship between rows and columns.

4

Unstructured Data

Raw data in various formats such as images, audio, and text that lacks a pre-defined model.

5

Semi-Structured Data

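Data that contains both structured and unstructured components, such as emails, which pair structured header fields (sender, date, subject) with a free-text body. JSON and XML documents are other common examples.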
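
As a rough illustration, the sketch below flattens JSON-encoded email records into uniform rows. The field names (`from`, `subject`, `body`, `attachments`) and the `flatten_email` helper are hypothetical, chosen only for this example.

```python
import json

# Hypothetical email records: the headers are structured fields,
# the body is free text, and "attachments" is optional.
raw = """
[
  {"from": "ana@example.com", "subject": "Q3 report",
   "body": "Draft attached.", "attachments": ["q3.pdf"]},
  {"from": "li@example.com", "subject": "Lunch?",
   "body": "Are you free at noon?"}
]
"""

def flatten_email(record):
    """Map one loosely structured record onto a fixed set of columns."""
    return {
        "sender": record.get("from"),
        "subject": record.get("subject"),
        "body_length": len(record.get("body", "")),
        "attachment_count": len(record.get("attachments", [])),
    }

rows = [flatten_email(r) for r in json.loads(raw)]
for row in rows:
    print(row)
```

Using `.get` with a default lets records that omit a field still produce a complete row, which is the usual difficulty with this kind of data.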

6

Data Quality

A measure of the condition of a data set, defined by aspects like accuracy, completeness, and consistency.
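
One of these aspects, completeness, can be measured directly. The sketch below computes the share of non-missing values per field; the sample records and the `completeness` helper are made up for illustration, and `None` is assumed to mark a missing value.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "city": "Lisbon"},
    {"id": 2, "age": None, "city": "Porto"},
    {"id": 3, "age": 29, "city": None},
    {"id": 4, "age": 41, "city": "Faro"},
]

def completeness(rows):
    """Return the fraction of non-missing values for each field."""
    fields = rows[0].keys()
    return {
        field: sum(row.get(field) is not None for row in rows) / len(rows)
        for field in fields
    }

scores = completeness(records)
print(scores)
```

Accuracy and consistency usually need a reference source or business rules to check against, so they are harder to score this mechanically.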

7

CRISP-DM Model

The Cross-Industry Standard Process for Data Mining, a data science process model consisting of six phases that are applied iteratively: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.

8

Quantitative Data

Data that can be expressed numerically and is divided into discrete and continuous types.

9

Qualitative Data

Data that describes characteristics or qualities, often represented through categories rather than numbers.
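
The distinction between quantitative and qualitative data shows up in how each is summarized. A minimal sketch, using made-up sample values: numeric data supports arithmetic such as a mean, while categorical data is summarized by counting categories.

```python
from collections import Counter
from statistics import mean

visits = [3, 5, 2, 4]                            # quantitative, discrete (counts)
heights_m = [1.62, 1.75, 1.80, 1.71]             # quantitative, continuous (measurements)
eye_color = ["brown", "blue", "brown", "green"]  # qualitative (categories)

avg_visits = mean(visits)          # arithmetic is meaningful for numbers
avg_height = mean(heights_m)
color_counts = Counter(eye_color)  # categories are counted, not averaged

print(avg_visits)
print(round(avg_height, 2))
print(color_counts.most_common(1))
```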

10

Data Cleaning

The process of identifying and correcting errors or inconsistencies in data to improve its quality.
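
A minimal sketch of a few common cleaning steps on a list of names: trimming whitespace, normalizing case, and dropping blanks and duplicates. The `clean_names` helper and the sample values are hypothetical; real cleaning rules depend on the data set.

```python
def clean_names(values):
    """Trim whitespace, normalize case, drop blanks and duplicates (order kept)."""
    seen = set()
    cleaned = []
    for value in values:
        name = value.strip().title()
        if not name or name in seen:
            continue  # skip empty entries and repeats
        seen.add(name)
        cleaned.append(name)
    return cleaned

raw_names = [" alice ", "Bob", "ALICE", "", "bob "]
cleaned_names = clean_names(raw_names)
print(cleaned_names)
```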

11

Outlier

A data point that deviates significantly from other observations, which may indicate a measurement error or a genuinely unusual case worth investigating.
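
One common way to flag such points is the interquartile range (IQR) rule, sketched below. This is only one of several detection methods; the multiplier `k=1.5` is a conventional default rather than something stated in this set, and the `iqr_outliers` helper and readings are made up.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

readings = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(readings)
print(outliers)
```

Whether a flagged value is an error or a real finding still has to be decided by inspecting it.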

12

Data Engineering

The field focused on the development and maintenance of systems that gather and process data for analysis.

13

Dimensionality Reduction

Techniques used to reduce the number of variables in a data set while maintaining its essential properties.
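
As a simple sketch of one such technique, the code below performs feature selection by dropping columns whose variance does not exceed a threshold, since a constant column carries no information. This is a deliberately basic stand-in; methods such as principal component analysis combine variables rather than discard them. The `drop_low_variance` helper and the sample table are hypothetical.

```python
from statistics import pvariance

def drop_low_variance(columns, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }

table = {
    "height_cm": [160, 172, 181, 168],
    "sensor_id": [7, 7, 7, 7],  # constant, so it adds nothing
    "weight_kg": [55, 70, 82, 64],
}

reduced = drop_low_variance(table)
print(list(reduced))
```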

14

Data Transformation

The process of converting data into a different format or structure to meet specified requirements.
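
Two typical transformations are sketched below under made-up data: a change of structure (row records to columns) and a change of scale (min-max scaling to the range 0 to 1). The helper names `to_columns` and `min_max_scale` are assumptions for this example.

```python
def to_columns(rows):
    """Convert a list of row dicts into a dict of column lists."""
    return {key: [row[key] for row in rows] for key in rows[0]}

def min_max_scale(values):
    """Rescale numbers linearly so the smallest is 0.0 and the largest is 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # avoid dividing by zero on constant input
    return [(v - lo) / (hi - lo) for v in values]

rows = [
    {"item": "a", "price": 2},
    {"item": "b", "price": 4},
    {"item": "c", "price": 10},
]

columns = to_columns(rows)
scaled = min_max_scale(columns["price"])
print(columns)
print(scaled)
```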

15

Machine Learning

A set of techniques within data science that enables computers to learn patterns from data and improve their performance on a task as they see more examples, without being explicitly programmed for it.
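
A minimal sketch of learning from data: fitting a straight line to example points by ordinary least squares, then using it to predict an unseen value. Linear regression is only one of many learning methods, and the `fit_line` helper and training points are invented for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training examples: inputs and the outputs observed for them.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept  # predict for an input not seen in training
print(slope, intercept, prediction)
```

The parameters are not written by hand; they are derived from the examples, which is the sense in which the program learns.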

16

Data Visualization Techniques

Methods for presenting data in graphical formats to help convey insights and findings effectively.