APCompSci - Big Data - Unit 5

23 Terms

1
Big Data
A data set so large that traditional processing methods are inadequate to handle it
2
Metadata
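Data that describes other data, such as its author, creation date, or file size; this set also counts summary statistics (mean, median, mode, standard deviation, etc.) as metadata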
3
Usability
Degree to which data can possibly or reasonably be used
4
Usefulness
Degree to which someone would want to use the data; that is, the data has value for some purpose.
5
Privacy
Protecting useful digital data from being given to or used by others
6
Utility
Sharing personal digital data to receive something of value in return
7
Confidence
Degree to which the integrity of the data is complete
8
Data Persistence
The tendency of digital information to persist rather than be deleted, because it is so easily copied
9
Structured Data
Data that is identifiable because it is organized in a defined structure; typically requires less storage
10
Unstructured Data
Data collected in "raw" form; connections and relationships among parts of the data are harder to trace and slower to process
11
Data Mining
Extracting information from a data set, using various techniques to discover patterns in it
12
Association Rule Mining
Finding links between one set of items and another: cases in which the appearance of one set of items implies that another set of items will also appear
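Example: a minimal sketch of how such a rule is usually scored, using the common "support" and "confidence" measures. The baskets and item names are made up for illustration.

```python
# Hypothetical shopping baskets; the items are invented for this example.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

# Score the rule "bread implies butter".
rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule is normally kept only when both numbers clear thresholds chosen by the analyst.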
13
Anomaly Detection
The identification of unusual data records (outliers) that may be interesting or may simply be data errors, and that require further investigation
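Example: one simple way to flag outliers is a z-score test, sketched below. The readings and the threshold of 2 standard deviations are assumptions chosen for illustration; real tools often use more robust methods.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]

# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95, 12]
outliers = find_outliers(readings)
print(outliers)
```

Whether a flagged value is an error or a genuinely interesting record still has to be decided by investigating it.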
14
Clustering
Task of discovering groups and structures in the data that are in some way similar, without using known structures in the data
15
Classification
Task of generalizing known structure to apply to new data
16
Summarization
Providing a more compact representation of a data set, including visualization and report generation. Ex: Wordle, WordItOut
17
Regression
Attempt to find a function that models the data with the least error
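Example: a minimal sketch of the simplest case, fitting a straight line by ordinary least squares (minimizing the sum of squared errors). The data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements that roughly follow y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```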
18
ReCAPTCHA
Revision of CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart); a digital tool used to deter automated form-filling
19
Crowdsourcing
Asking a sufficiently large sample of individuals to estimate an unknown result, relying on the "wisdom of the crowd" phenomenon
20
Screen Scraping
The conversion of data formatted for human use to a format more easily used by automated computer processes
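Example: a rough sketch that turns an HTML table (formatted for people) into a list of rows (easy for a program to use). It relies only on Python's standard `html.parser` module; the `TableScraper` class and the page content are hypothetical.

```python
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Collects the text of each table cell into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row[-1] += data.strip()

# Hypothetical page fragment.
page = "<table><tr><th>City</th><th>Temp</th></tr><tr><td>Oslo</td><td>4</td></tr></table>"
scraper = TableScraper()
scraper.feed(page)
print(scraper.rows)
```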
21
Descriptive Analytics
Analytics that provide information about collected data via statistics like mean, median, and mode
22
Predictive Analytics
Analytics that provide information about future events based on previously collected and analyzed data
23
Prescriptive Analytics
Analytics that provide information to maximize the chances of a future event occurring based on comparing the predictive analyses of multiple options
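
Example: the three kinds of analytics can be sketched side by side. The sales figures and option names are hypothetical, and the "prediction" here is deliberately naive (it just assumes the next day looks like the historical mean); real predictive models are more involved.

```python
import statistics

# Hypothetical daily sales recorded under two store layouts.
history = {
    "layout_a": [20, 22, 22, 25, 21],
    "layout_b": [18, 30, 24, 24, 29],
}

def describe(values):
    """Descriptive: summarize what was already collected."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }

def predict_next(values):
    """Predictive (naive assumption): expect the next value to equal the past mean."""
    return statistics.mean(values)

def prescribe(options):
    """Prescriptive: compare the predictions for each option and pick the best one."""
    return max(options, key=lambda name: predict_next(options[name]))

summary_a = describe(history["layout_a"])
forecast_b = predict_next(history["layout_b"])
best = prescribe(history)
print(summary_a, forecast_b, best)
```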