Machine Learning


21 Terms

1

is machine learning AI?

Machine learning is a subset of AI

2

what is a hyperparameter

a parameter whose value must be set by the researcher before learning begins

3

what is labeled data

data in which each input (x) is paired with a known target (y); used as the training data in supervised learning

4

unsupervised learning

1) does not use labeled data 2) a set of inputs (x) is used for analysis with no corresponding target (y) 3) the algorithm discovers underlying structure in the data

5

dimension reduction

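reducing the number of inputs (features), and so the complexity of the data, while keeping most of the information. Kind of like combining errands: if I carpool and eat on the way, I have fewer separate things to do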

6

clustering

organizing data by grouping similar observations into categories (clusters). Example: putting the kids going to soccer in one group and the kids going to piano in another, then driving each group together. * observations within a cluster are similar; observations across clusters are different.
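
A minimal sketch of the idea, assuming one-dimensional data and the k-means algorithm (the card does not name a specific algorithm; k-means is just one common choice). The ages and starting centroids are made up for illustration.

```python
def k_means_1d(values, centroids, n_iter=10):
    """Group 1-D values around the nearest of the given starting centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(n_iter):
        # Assignment step: each observation joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data: ages of kids in two after-school activities.
ages = [5, 6, 7, 14, 15, 16]
centers, groups = k_means_1d(ages, centroids=[0.0, 20.0])
print(centers, groups)
```

No labels are supplied: the two groups come only from how close the observations are to each other.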

7

deep learning

neural networks with many hidden layers; can be used for supervised, unsupervised or reinforcement learning. Reinforcement learning is the self-teaching case: the computer learns by interacting with itself or its environment (trial and error)

8

overfitting

the ML model fits the training data too well and is unable to generalize to new data
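
A rough illustration, assuming made-up data from a noisy straight line. The "memorizer" below (a 1-nearest-neighbor lookup) stands in for an overfitted model and a least-squares line for a simpler one; neither comes from the cards.

```python
# Hypothetical data: y is roughly 2x, with noise in the training sample.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.5, 3.6, 6.4, 7.7, 10.3, 11.8]
test_x = [1.5, 2.5, 3.5, 4.5, 5.5]
test_y = [3.0, 5.0, 7.0, 9.0, 11.0]


def memorizer(x):
    """Overfitted model: return the target of the closest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(x - train_x[i]))
    return train_y[nearest]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(model, xs, ys):
    """Mean squared prediction error of a model on a sample."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


intercept, slope = fit_line(train_x, train_y)


def line(x):
    return intercept + slope * x


memorizer_in = mse(memorizer, train_x, train_y)   # in-sample error
memorizer_out = mse(memorizer, test_x, test_y)    # out-of-sample error
line_in = mse(line, train_x, train_y)
line_out = mse(line, test_x, test_y)
print(memorizer_in, memorizer_out, line_in, line_out)
```

The memorizer reproduces the training sample perfectly but does worse than the simple line on the new points, which is the pattern the card describes.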

9

describe in-sample error

1) prediction errors on the training sample 2) bias error arises from underfitted models

10

out of sample errors

1) prediction errors in the validation and test samples 2) variance error arises from overfitted models

11

residual errors

results from randomness in the data

12

complexity reduction and cross-validation help address

overfitting

13

a type of cross-validation: K-fold

1) split the data set into K sections (folds) 2) one fold is held out to validate the model and the remaining K-1 folds are used to train it; this is repeated K times so that each fold is used once for validation 3) this reduces the holdout-sample problem (holdout data is not used to train the model, which shrinks the training set). * usually K = 5 or 10 (folds)
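
The splitting step can be sketched as below. This is a minimal version that assumes the observations are already in random order (in practice the data is usually shuffled first); the function name `k_fold_splits` is made up for illustration.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra observation.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop


splits = list(k_fold_splits(n_samples=10, k=5))
for train, validation in splits:
    print("train:", train, "validate:", validation)
```

Every observation lands in a validation fold exactly once and in the training set the other K-1 times, so no data is permanently held out.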

14

random forests are a special case of

bagging (an ensemble method)

15

Random forest

a large number of relatively uncorrelated decision trees operating as a group outperforms any of the individual constituent trees (wisdom of crowds)
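
The wisdom-of-crowds effect can be checked with a little arithmetic. The sketch below assumes an idealized case that real forests only approximate: every tree votes independently and is right with the same probability. The 60% accuracy and the count of 101 trees are made-up numbers.

```python
from math import comb


def majority_vote_accuracy(n_trees, p_correct):
    """Probability that more than half of n independent voters are correct."""
    needed = n_trees // 2 + 1
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


single = majority_vote_accuracy(1, 0.6)     # one tree on its own
forest = majority_vote_accuracy(101, 0.6)   # majority vote of 101 trees
print(single, forest)
```

Under these assumptions the majority vote is right far more often than any single tree. The more correlated the trees are, the smaller this gain becomes, which is why the card stresses uncorrelated trees.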

16

what is good for factor based investment strategies

random forest

17

Are dimension reduction and clustering examples of supervised or unsupervised machine learning?

unsupervised

18

benefits of a lower-dimensional dataset

reduces overfitting; easier to train and interpret

19

divisive clustering is a

top-down approach; agglomerative clustering is the bottom-up approach (both are types of hierarchical clustering)

20

which model to use “In some cases, their fear of loss seems to increase at an increasing rate when some scenarios are presented.”

the relationship is not linear, so a neural network

21

if the output (target) of the data is not specified

unsupervised learning