Chapter Two: Introduction to Data Science

These flashcards cover key concepts from Chapter Two of the lecture on Data Science, addressing definitions, characteristics, and distinctions related to data and its processing.

Last updated 11:28 AM on 4/11/26

10 Terms

What is Data Science?

A multi-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured, semi-structured, and unstructured data.

What types of data representations are there?

Data can be represented as structured, semi-structured, or unstructured.
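
As a rough illustration of the three representations, the sketch below holds the same kind of information in each form. The sample values and variable names are made up for this example and are not taken from the lecture.

```python
import csv
import io
import json

# Structured: rows and columns that follow a fixed schema (hypothetical CSV data).
csv_text = "id,name,age\n1,Abebe,34\n2,Sara,29\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys, but records need not share the same fields.
json_text = '[{"id": 1, "name": "Abebe"}, {"id": 2, "name": "Sara", "tags": ["vip"]}]'
records = json.loads(json_text)

# Unstructured: free text with no data model; any structure must be derived by processing.
note = "Sara called on Monday to ask about her order."
words = note.lower().rstrip(".").split()

print(rows[1]["age"])          # field access by column name
print(records[1].get("tags"))  # optional field present only on some records
print(len(words))              # a simple feature derived from raw text
```

Note that the CSV reader returns every field as a string, so even structured data may need type conversion before analysis.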

What is the Data Processing Cycle?

The set of operations used to transform data into useful information, including data collection, input, processing, output, and storage.
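
The stages named above can be sketched as a small pipeline. This is only a minimal illustration: the function names, the temperature readings, and the in-memory dictionary standing in for storage are all assumptions made for the example.

```python
import json
from statistics import mean

def collect():
    # Collection: gather raw observations (hypothetical temperature readings as text).
    return ["21.5", "22.0", "bad", "23.5"]

def to_input(raw):
    # Input: convert raw values into a machine-usable form, skipping unparseable ones.
    values = []
    for item in raw:
        try:
            values.append(float(item))
        except ValueError:
            continue
    return values

def process(values):
    # Processing: turn data into information (here, summary statistics).
    return {"count": len(values), "mean": round(mean(values), 2), "max": max(values)}

def output(info):
    # Output: present the result in a readable form.
    return f"{info['count']} readings, mean {info['mean']}, max {info['max']}"

def store(info, storage):
    # Storage: keep the result for later use (a dict stands in for a file or database).
    storage["summary"] = json.dumps(info)
    return storage

storage = {}
info = process(to_input(collect()))
report = output(info)
store(info, storage)
print(report)
```

In practice each stage would usually be a separate system (sensors, forms, databases, dashboards); the point here is only the order of the operations.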

What defines structured data?

Data that adheres to a predefined data model and is therefore straightforward to analyze; it is typically stored in tabular form, as in Excel spreadsheets or SQL databases.
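
To show why a predefined model makes analysis straightforward, here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its columns, and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# An in-memory SQL database; the schema is the predefined data model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
)

# Every row must fit the declared columns.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 50.0)],
)

# Because the structure is known in advance, aggregation is a single query.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
conn.close()
```

The same question asked of free text or audio would first require extracting the region and amount, which is the extra processing that unstructured data demands.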

What characterizes unstructured data?

Data that has no predefined data model. It is typically text-heavy but may also include audio and video files, and it requires more complex processing methods.

What is Big Data?

Big data refers to large and complex datasets that are difficult to process using traditional data management tools and applications.

What are the four key characteristics of Hadoop?

Hadoop is economical, reliable, scalable, and flexible.

What is the importance of data trustworthiness in Big Data?

Data trustworthiness (often called veracity) is the degree to which Big Data can be trusted; it determines how reliable the data is as a basis for decision-making.

What are some application domains of Data Science?

Healthcare, marketing, finance, manufacturing, and social media are examples of application domains for data science.

What is the goal of the Big Data lifecycle?

To surface insights and connections in large volumes of heterogeneous data that conventional methods could not uncover.