Ch 1 fundamentals of data science

23 Terms

1

querying and reporting 

you know exactly what you are looking for

(no modeling or pattern finding) 
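
A minimal sketch of querying and reporting, assuming a hypothetical `sales` table with `region` and `amount` columns (neither is from the course material). The question is fixed in advance, so no model is learned:

```python
import sqlite3

def total_sales_by_region(rows):
    """Return {region: total amount} for hypothetical (region, amount) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    # We know exactly what we are looking for: the total amount per region.
    result = dict(conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall())
    conn.close()
    return result

report = total_sales_by_region([("north", 10.0), ("south", 5.0), ("north", 2.5)])
```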

2

OLAP : online analytical processing

a scaled-up form of querying

GUI to query large data collections in real-time

pre-programmed dimensions of analysis

summary level 

→ no modeling or pattern finding 

(still predetermined what you are looking for) 
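
A rough sketch of the idea behind OLAP, assuming hypothetical fact records with `region`, `year` and `amount` fields. Real OLAP tools do this behind a GUI on far larger collections; this only illustrates summarizing along pre-programmed dimensions:

```python
from collections import defaultdict

def roll_up(facts, dimensions, measure="amount"):
    """Sum a measure over the chosen, predefined dimensions."""
    totals = defaultdict(float)
    for fact in facts:
        # The key is the combination of dimension values for this record.
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)

# Hypothetical records, not taken from the course material.
facts = [
    {"region": "north", "year": 2023, "amount": 10.0},
    {"region": "north", "year": 2024, "amount": 5.0},
    {"region": "south", "year": 2024, "amount": 5.0},
]
by_region = roll_up(facts, ["region"])
by_region_year = roll_up(facts, ["region", "year"])
```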

3

data science

a set of fundamental principles that guide the extraction of knowledge from data

4

data mining

the extraction of knowledge from data, via technologies that incorporate the principles of data science

→ you don’t know what you are looking for; you want to find new, intricate patterns in the (big) data

5

big data

data that is so large that traditional data storage and processing systems are unable to deal with it

velocity, volume and variety

6

artificial intelligence

techniques that allow machines to display intelligent behavior

7

machine learning

subset of AI techniques that improve with data

= learning by doing 

8

deep learning

subset of machine learning which uses neural network technology

9

data

= raw stream of facts

structured or unstructured 

can sometimes be ‘big’= information / knowledge 

10

a model

an (abstract) representation of (a part of) reality

11

how is a model learned / trained

by a machine learning algorithm, based on data

12

prediction

the estimation of an unknown value

13

types of machine learning

supervised learning

unsupervised learning 

reinforcement learning 

14

supervised learning

learning a mapping X → Y, or f(x) = y

  • y is the outcome / target / label

  • the task depends on the type of y: classification (discrete) or regression (continuous)

prediction := estimation of an unknown value

15

supervised learning : regression 

continuous target variable 

e.g. linear regression
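
A small sketch of linear regression with one input, fitted by ordinary least squares. The data points and the names `fit_line` and `predict` are made up for illustration:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

def predict(x):
    """Estimate the unknown continuous target for a new x."""
    return slope * x + intercept
```

Here the learned mapping f(x) is the fitted line, and `predict(5)` estimates a target value that was not in the training data.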

16

supervised learning : classification

binary categorical target variable

  • binary classification 

  • binary outcome 

categorical target variable

  • multiclass classification 

output can also be a probability of class membership 
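
A sketch of binary classification where the output is a probability of class membership. It uses k-nearest neighbours as one possible technique (the card does not name one); the training data and the 0.5 cut-off are assumptions:

```python
def knn_probability(train, x, k=3):
    """Estimate P(label = 1) as the share of the k nearest neighbours with label 1.

    train is a list of (feature, label) pairs with label in {0, 1}.
    """
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(label for _, label in nearest) / k

# Hypothetical labelled observations with a single numeric feature.
train = [(1.0, 0), (1.5, 0), (2.0, 0), (6.0, 1), (6.5, 1), (7.0, 1)]

p_high = knn_probability(train, 6.2)   # neighbours all have label 1
p_low = knn_probability(train, 1.2)    # neighbours all have label 0
predicted_class = int(p_high >= 0.5)   # turn the probability into a class
```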

17

unsupervised learning

no “Y” target

  • clustering 

  • anomaly detection 

  • generative models 

18

unsupervised learning : clustering

assigning similar observations to clusters, i.e. groups of similar observations
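
A sketch of clustering with k-means on one-dimensional data. k-means is one common choice, not necessarily the one used in the course; the points and starting centers are made up:

```python
def kmeans_1d(points, centers, iterations=10):
    """Very small k-means for 1-D data; returns (centers, clusters)."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assignment step: each point joins the cluster of its nearest center.
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical observations forming two obvious groups; there is no target Y.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centers, clusters = kmeans_1d(points, centers=[0.0, 10.0])
```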

19

unsupervised learning : anomaly detection

detecting anomalous observations

e.g.

  • identification of fraudulent transactions 
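
A sketch of anomaly detection with a simple z-score rule, applied to hypothetical transaction amounts. The threshold of 2 standard deviations is an assumption chosen for this tiny sample; real fraud detection is far more involved:

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]

# Hypothetical transaction amounts; one is far larger than the rest.
amounts = [20, 22, 19, 21, 20, 23, 18, 500]
suspicious = find_anomalies(amounts)
```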

20

unsupervised learning : generative models

generation of realistic new observations
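
A toy sketch of a generative model: fit a normal distribution to observed values, then sample new values from it. The heights are made up, and modern generative models are far richer than a single Gaussian; this only illustrates the "learn the data, then generate" idea:

```python
import random
import statistics

def fit_and_sample(observations, n_samples, seed=0):
    """Fit a Gaussian to the observations and draw new samples from it."""
    mu = statistics.mean(observations)
    sigma = statistics.stdev(observations)  # sample standard deviation
    rng = random.Random(seed)               # seeded for reproducibility
    samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
    return mu, sigma, samples

# Hypothetical body heights in cm.
mu, sigma, new_heights = fit_and_sample(
    [170.0, 165.0, 180.0, 175.0, 160.0], n_samples=5
)
```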

21

reinforcement learning

no data set 

learning through interaction with the environment (exploration, exploitation) 

learning a policy that optimizes a reward
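
A sketch of exploration versus exploitation using an epsilon-greedy strategy on a multi-armed bandit, one of the simplest reinforcement learning settings. The fixed, noise-free rewards per arm and the parameter values are assumptions made to keep the example small:

```python
import random

def epsilon_greedy(rewards, steps=1000, epsilon=0.1, seed=0):
    """Learn which arm pays best purely by interacting with the environment."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # current value estimate per arm
    counts = [0] * len(rewards)       # how often each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a random arm.
            arm = rng.randrange(len(rewards))
        else:
            # Exploitation: pick the arm that currently looks best.
            arm = max(range(len(rewards)), key=lambda i: estimates[i])
        reward = rewards[arm]  # the environment's response (deterministic here)
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical environment: three arms with fixed rewards.
estimates, counts = epsilon_greedy([0.2, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
```

There is no data set up front: the estimates come only from the rewards observed while acting, and the resulting policy is "pull the arm with the highest estimate".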

22

why is there so little reinforcement learning in industry

  • long tails (e.g. self-driving cars) 

  • expensive exploration 

  • immense dimensionality of problems, need many trials, no simulators

  • instability : catastrophic forgetting

  • trust 

23

CRISP-DM

cross-industry standard process for data mining

structured process from problem to solution (iterative) 

  • business understanding 

  • data understanding 

  • data preparation 

  • modeling 

  • evaluation 

  • deployment 
