AP CSP Unit 5

1

Big Data

Data sets so large and complex that they can be hard to process using standard techniques → this is why we use algorithms, data analysis, and data mining

Data is constantly being collected, but it is not always utilized (common misconception: if data is being collected, it’s being utilized)

2

Usable Data

Can use data regardless of whether it’s informative or useful

You can get info out of it

Do the means to process or analyze the data exist? (you can collect data and just not have the proper technology to process it yet, making it unusable)

Not always accurate, falls in a range of possibility

Just because something is usable doesn’t make it useful

3

Useful data

Does someone want to use it (adds value)? 

Allows user to accomplish a task or directive

Dependent on who wants the data

Directly related to what one can do with the data

Oftentimes, things that are useful are also usable

Usefulness is open to interpretation and context

4

Data Processing: Data sets can be difficult to process if...

  • Data is incomplete

  • Data is invalid

  • Data sources need to be combined

  • Data needs to be cleaned (finding inconsistencies such as abbreviations, spelling, and capitalization, and replacing them with the same word)

  • Bias in the source or program

  • Size - very big sets are hard to process and take a lot of time (here a parallel system may be a solution)
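
The cleaning step above can be sketched in a few lines of Python. This is only an illustration: the sample answers, the `STANDARD` lookup table, and the `clean` helper are all made up for this example, and a real data set would need a much larger table.

```python
# Hypothetical survey answers: the same place written several different ways.
raw_states = ["CA", "california", " Calif. ", "NY", "New York", "new york", ""]

# Assumed lookup table that maps each known variant to one standard word.
STANDARD = {
    "ca": "California",
    "calif.": "California",
    "california": "California",
    "ny": "New York",
    "new york": "New York",
}

def clean(value):
    """Return the standard spelling, or None if the entry is missing or unknown."""
    key = value.strip().lower()  # ignore extra spaces and capitalization
    return STANDARD.get(key)

cleaned = [clean(v) for v in raw_states]

# Incomplete or invalid entries come back as None and are left out of the count.
valid = [v for v in cleaned if v is not None]
counts = {name: valid.count(name) for name in sorted(set(valid))}
print(counts)
```

Before cleaning, a program would treat "CA" and "california" as two different answers; after cleaning, they are counted together.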

5

Spiders

A kind of program (also called a web crawler) used to process data and acquire information
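
One small piece of what a spider does, pulling the links out of a page so it knows where to go next, might look like the sketch below. The `LinkCollector` class and the sample page are invented for this example, and a real spider would also download pages over the network, which is left out here.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Made-up page content standing in for something downloaded from the web.
page = '<html><body><a href="/about">About</a> <a href="/data">Data</a><p>No link</p></body></html>'

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```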

6

Descriptive Analysis

  • Summarizes and describes data

  • Doesn't provide any conclusions yet

  • Data visualization is often used to present information in the form of graphs, tables, and charts

  • Data may be used to help an organization with future plans

  • Answers the question “What happened?”
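
As a rough illustration of descriptive analysis, the sketch below summarizes a small data set and draws a simple text chart. The `daily_sales` numbers are made up; the point is that the code only describes what happened and draws no conclusions.

```python
import statistics

# Hypothetical sales counts for one week.
daily_sales = [12, 15, 11, 30, 14, 13, 15]

# Summarize and describe the data.
summary = {
    "count": len(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "min": min(daily_sales),
    "max": max(daily_sales),
}
print(summary)

# A very simple visualization: one bar of # characters per day.
for day, sales in enumerate(daily_sales, start=1):
    print(f"Day {day}: {'#' * sales} ({sales})")
```

The chart makes day 4 stand out, but explaining why it is so high is outside what descriptive analysis does.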

7

Descriptive Analysis Advantages

  • Reveals previously hidden patterns

  • Helps businesses communicate information among departments and people outside the company

8

Descriptive Analysis Disadvantages

  • Cannot tell you anything about relationships or cause and effect

  • Can do the math but doesn’t draw conclusions from the results

9

Predictive Analysis

  • Looks at current and historical data patterns to see if they’re going to show up again

  • This data analysis makes predictions

  • Data may be used to allow an organization, business, or investor to adjust and rework their future plans

  • Answers the question “What might happen in the future?”
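
One simple way to turn a historical pattern into a prediction is to fit a straight line to past data and extend it. The sketch below does this with ordinary least squares; the `months` and `signups` values and the `fit_line` helper are invented for illustration, and real predictive models are usually more involved.

```python
def fit_line(xs, ys):
    """Least-squares line of best fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical data: new signups in months 1 through 5.
months = [1, 2, 3, 4, 5]
signups = [100, 120, 140, 160, 180]

slope, intercept = fit_line(months, signups)

# Extend the pattern one month into the future.
forecast = slope * 6 + intercept
print(f"Predicted signups for month 6: {forecast:.0f}")
```

The forecast is only as good as the assumption that the past pattern continues, which is why such a model has to be refit as behavior changes.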

10

Predictive Analysis Advantages

  • Key player in search advertising and recommendation engines

  • Can provide managers with tools to influence upselling, manufacturing optimization and even new product development

11

Predictive Analysis Disadvantages

  • Critics argue computers fail to consider all variables even when they have sufficient data

  • Customer behavior is bound to change with time, so the model would need to be repeatedly updated

12

Prescriptive Analysis

  • Using data to determine an optimal path or actions

  • Recommendations are statistically supported by the relevant factors in the data

  • Gives recommendations for the future

  • Answers the question “What should we do next?”
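
Prescriptive analysis can be pictured as comparing possible actions across several scenarios and recommending the best one. The sketch below uses expected value as the decision rule, which is one common choice and not the only one. The `payoffs` table, the scenario `probabilities`, and the action names are all hypothetical.

```python
# Hypothetical profit for each action under each demand scenario.
payoffs = {
    "order_small": {"low": 50, "medium": 60, "high": 60},
    "order_medium": {"low": 20, "medium": 90, "high": 100},
    "order_large": {"low": -30, "medium": 70, "high": 150},
}

# Assumed chance of each scenario (would normally come from predictive analysis).
probabilities = {"low": 0.3, "medium": 0.5, "high": 0.2}

def expected_value(outcomes):
    """Average profit of one action, weighted by how likely each scenario is."""
    return sum(probabilities[scenario] * profit for scenario, profit in outcomes.items())

# Recommend the action with the highest expected profit.
best = max(payoffs, key=lambda action: expected_value(payoffs[action]))

for action, outcomes in payoffs.items():
    print(f"{action}: expected profit {expected_value(outcomes):.1f}")
print("Recommended action:", best)
```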

13

Prescriptive Analysis Advantages

  • By simulating and running different scenarios of sudden shifts, you can find the best way to respond to the shifts quickly

  • Reduces risk and minimizes fraud

14

Prescriptive Analysis Disadvantages

  • Requires large amounts of data

  • Results aren’t always accurate

  • High computing power is required

  • Not completely reliable for long term solutions

15

Data Mining

  • Process of sorting through large data sets to identify patterns 

  • These patterns can help solve business problems

  • It’s a crucial part of business as it helps with strategizing and managing

  • It can even detect fraud and reduce risk

16

Data Mining Strategies Used

  • Association rules: searches for a relationship between variables → provides additional value within the data set as it links data

  • Classification: assigns objects to predefined classes. The classes describe the similarities between data points → allows for better summarization and categories

  • Clustering: similar to classification, but the groups are not predefined; it groups data points by their similarities and separates them by their differences. It can provide more general topics, whereas classification is more specific

  • Predictive Analysis: uses historical information to build graphical or mathematical models to forecast future outcomes. This overlaps with regression analysis. 

  • Anomaly/outlier detection: it identifies rare or unusual events or an item that differs significantly from standard patterns

  • Regression: used to predict a range of numeric values in a dataset. It is often shown as a line of best fit so that actual data can be compared to it

  • Summarization: used to find a compact description of a dataset → provides a general categorization
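
Two of the strategies above, association rules and anomaly/outlier detection, can be sketched in a few lines. The shopping `transactions`, the `purchases` amounts, the `confidence` helper, and the cutoff of 2 standard deviations are all assumptions made for this example; real data mining tools use more refined versions of both ideas.

```python
import statistics

# --- Association rule: how often does buying bread go with buying butter? ---
# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "butter"},
]

def confidence(antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    return len(with_both) / len(with_antecedent)

bread_to_butter = confidence("bread", "butter")
print(f"bread -> butter confidence: {bread_to_butter:.2f}")

# --- Anomaly detection: flag values far from the standard pattern ---
# Hypothetical purchase amounts; one is much larger than the rest.
purchases = [50, 52, 49, 51, 48, 50, 120]

mean = statistics.mean(purchases)
spread = statistics.pstdev(purchases)  # population standard deviation

# Assumed rule: anything more than 2 standard deviations from the mean is unusual.
outliers = [p for p in purchases if abs(p - mean) / spread > 2]
print("Outliers:", outliers)
```

A flagged purchase is not proof of fraud; it only marks an item that differs significantly from the usual pattern and may be worth a closer look.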
