APCQ: 5.2 Exploring Big Data

Last updated 4:46 AM on 4/16/26

21 Terms

1

Big Data

Datasets that are too large or complex for traditional data processing applications to handle.

2

The 3 V's of Big Data

Volume (amount), Velocity (speed of generation), and Variety (types of data like text, video, and audio).

3

Structured Data

Data that fits into fixed fields or tables (e.g., spreadsheets, CSV files).
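
A minimal sketch of why fixed fields matter, assuming a made-up CSV with the hypothetical columns `name`, `year`, and `score`. Because every row follows the same layout, any field can be looked up by its column name.

```python
import csv
import io

# Hypothetical CSV text: every row has the same three fields.
CSV_TEXT = """name,year,score
Ana,2024,91
Ben,2023,78
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(CSV_TEXT)

# The fixed structure lets us pull one field out of every record by name.
scores = [int(row["score"]) for row in rows]
print(scores)
```

Unstructured data such as a video or a free-form post has no column names to look up, so extracting a value like this usually needs extra processing first.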

4

Unstructured Data

Data that does not have a predefined model or fixed format (e.g., social media posts, videos, audio recordings).

5

Information vs. Data

Data is the raw facts; information is the meaning or patterns extracted by processing that data.

6

Scalability

The ability of a system to maintain performance and handle growth as the amount of data increases.

7

Sequential Processing

A method where tasks are completed one after another in order; the total time is the sum of every task's time, so it becomes slow as datasets grow.

8

Parallel Processing

Splitting a large task into smaller parts that are processed simultaneously by multiple processors to save time.
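
The split-and-combine idea can be sketched as follows. This is an illustration only: the function names are hypothetical, and it uses threads because they are simple to run. In standard CPython, threads generally do not speed up pure-Python arithmetic, so treat this as showing the pattern (split, process parts at the same time, combine) rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into at most n_chunks contiguous parts of similar size."""
    if not values:
        return []
    size = -(-len(values) // n_chunks)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def sequential_sum(values):
    """Add the values one after another in a single loop."""
    total = 0
    for v in values:
        total += v
    return total


def parallel_sum(values, n_workers=4):
    """Sum each chunk in its own worker, then combine the partial sums."""
    chunks = split_into_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


data = list(range(1, 1001))
print(sequential_sum(data) == parallel_sum(data))
```

Both approaches give the same answer; the parallel version only changes how the work is divided.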

9

Distributed Systems

A network of independent computers that work together as a single system to process massive datasets.

10

Cloud Computing

Using remote servers hosted on the internet to store and process data rather than a local server or PC.

11

Data Cleaning

The process of fixing or removing incomplete, duplicate, or incorrectly formatted records to ensure accuracy.
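
One possible cleaning pass, sketched on made-up contact records. The field names `name` and `email` and the specific rules (trim whitespace, normalize case, drop incomplete rows, drop duplicates) are assumptions for illustration; real cleaning rules depend on the dataset.

```python
def clean_records(records):
    """Trim and normalize fields, drop incomplete rows, drop duplicates."""
    cleaned = []
    seen = set()
    for rec in records:
        # Fix formatting: strip stray spaces and normalize letter case.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        if not name or not email:
            continue  # incomplete record: a required field is missing
        key = (name, email)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical raw data with formatting problems, a duplicate, and gaps.
raw = [
    {"name": "  ana lopez ", "email": "ANA@example.com"},
    {"name": "Ana Lopez", "email": "ana@example.com "},
    {"name": "Ben Wu", "email": None},
    {"name": "", "email": "x@example.com"},
    {"name": "Cy Diaz", "email": "cy@example.com"},
]
result = clean_records(raw)
print(result)
```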

12

Data Filtering

Narrowing down a dataset to a specific subset based on certain criteria (e.g., only looking at data from 2024).
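
A small sketch of the 2024 example, using invented rainfall records. The field names and values are hypothetical.

```python
def filter_by_year(records, year):
    """Keep only the records whose 'year' field matches the given year."""
    return [r for r in records if r["year"] == year]


# Hypothetical dataset.
records = [
    {"city": "Austin", "year": 2023, "rainfall_mm": 870},
    {"city": "Austin", "year": 2024, "rainfall_mm": 910},
    {"city": "Boise", "year": 2024, "rainfall_mm": 300},
]

only_2024 = filter_by_year(records, 2024)
print([r["city"] for r in only_2024])
```

Filtering does not change any record; it only selects the subset that meets the criterion.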

13

Classification

A data mining technique that assigns data into predefined categories (e.g., sorting emails into 'Spam' or 'Inbox').
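
To make "predefined categories" concrete, here is a sketch using a 1-nearest-neighbor rule, which is only one of many classification methods. The two features (number of links, number of all-caps words) and the labeled examples are invented for illustration.

```python
def classify(features, labeled_examples):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, nearest_label = min(
        labeled_examples,
        key=lambda example: squared_distance(example[0], features),
    )
    return nearest_label


# Hypothetical training data: (links, all-caps words) -> known label.
LABELED = [
    ((0, 0), "Inbox"),
    ((1, 0), "Inbox"),
    ((5, 7), "Spam"),
    ((8, 4), "Spam"),
]

print(classify((6, 5), LABELED))
print(classify((0, 1), LABELED))
```

The key point is that the labels 'Spam' and 'Inbox' exist before any new email arrives; the classifier only decides which one applies.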

14

Clustering

Grouping similar data points together without pre-existing labels to find natural patterns.
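
A minimal sketch of one clustering method, k-means, on one-dimensional points. The data, the starting centers, and the function name are assumptions; note that no point carries a label going in.

```python
def k_means_1d(points, centers, iterations=10):
    """Group 1-D points around centers by repeated assign-then-average steps."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        centers = [
            sum(g) / len(g) if g else c
            for g, c in zip(groups, centers)
        ]
    return groups, centers


points = [1, 2, 3, 10, 11, 12]  # hypothetical unlabeled data
groups, centers = k_means_1d(points, centers=[0, 5])
print(groups, centers)
```

Unlike classification, the groups here are discovered from the data rather than chosen in advance.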

15

Data Visualization

Using charts or graphs to help humans identify trends or patterns in processed data.

16

Correlation

A statistical relationship where two variables tend to change together (in the same or opposite directions), but one does not necessarily cause the other.
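
One common way to measure how strongly two variables move together is the Pearson correlation coefficient, sketched here from its standard formula. The numbers are made up: ice cream sales and sunburn cases could both rise with hot weather, so a value near 1 would not show that one causes the other.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical weekly figures.
ice_cream_sales = [20, 35, 50, 65, 80]
sunburn_cases = [2, 4, 5, 7, 9]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(round(r, 3))
```

Values close to 1 or -1 indicate a strong relationship and values near 0 a weak one; the sign gives the direction.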

17

Causation

A relationship where one event or variable is the direct result of the other.

18

Digital Divide

The gap between those who have access to modern technology and the internet and those who do not; it can lead to bias in data because people without access are underrepresented in what gets collected.

19

Bias in Data

When the data collection method excludes certain groups, leading to results that don't accurately represent the whole population.
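
A toy illustration of how excluding a group skews a result. The population and the `hours_outdoors` values are invented; the assumption is an online-only survey that cannot reach people without internet access.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical population of six people.
population = [
    {"has_internet": True, "hours_outdoors": 1},
    {"has_internet": True, "hours_outdoors": 2},
    {"has_internet": True, "hours_outdoors": 1},
    {"has_internet": True, "hours_outdoors": 2},
    {"has_internet": False, "hours_outdoors": 5},
    {"has_internet": False, "hours_outdoors": 6},
]

# An online-only survey reaches just the people with internet access.
surveyed = [p for p in population if p["has_internet"]]

true_average = average([p["hours_outdoors"] for p in population])
survey_average = average([p["hours_outdoors"] for p in surveyed])
print(true_average, survey_average)
```

The survey average comes out lower than the true average because the excluded group differs from the group that was reached.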

20

Re-identification

The process of matching anonymous data with other available information to discover an individual's identity (a major privacy risk).
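
A sketch of how such matching could work, assuming two invented datasets that share the fields `zip`, `birth_year`, and `gender`. All names and records are hypothetical; the point is that fields which look harmless alone can single out one person when combined.

```python
def reidentify(anonymous_rows, public_rows, keys=("zip", "birth_year", "gender")):
    """Link each anonymous row to a public row when exactly one person matches."""
    matches = []
    for anon in anonymous_rows:
        candidates = [
            person for person in public_rows
            if all(person[k] == anon[k] for k in keys)
        ]
        if len(candidates) == 1:  # a unique match reveals the identity
            matches.append((candidates[0]["name"], anon["diagnosis"]))
    return matches


# Hypothetical "anonymous" medical data (names removed).
anonymous = [
    {"zip": "60601", "birth_year": 1990, "gender": "F", "diagnosis": "asthma"},
    {"zip": "60602", "birth_year": 1985, "gender": "M", "diagnosis": "flu"},
]

# Hypothetical public list, e.g. a directory that includes names.
public = [
    {"name": "Dana Reed", "zip": "60601", "birth_year": 1990, "gender": "F"},
    {"name": "Eli Park", "zip": "60602", "birth_year": 1985, "gender": "M"},
    {"name": "Sam Ortiz", "zip": "60602", "birth_year": 1985, "gender": "M"},
]

print(reidentify(anonymous, public))
```

The second anonymous row stays unlinked because two people share its combination of fields, which is why coarser or fewer shared fields reduce the risk.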

21

Open Data

Publicly available datasets that anyone can access, use, and share (often used by 'Citizen Scientists').