Machine Learning

21 Terms

is machine learning AI?

Machine learning is a subset of AI

what is a hyperparameter?

a parameter whose value must be set by the researcher before learning begins

what is labeled data?

data in which each input (x) is paired with a known target (y); it serves as the training data in supervised learning

unsupervised learning

1) does not use labeled data 2) a set of inputs (x) is used for analysis with no corresponding target (y) 3) the algorithm discovers underlying structure in the data

dimension reduction

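reducing the number of input features, and with it the complexity of the data, while keeping most of the information. Rough analogy: combining tasks, such as eating while carpooling, leaves fewer separate things to do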
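A minimal sketch of the idea, assuming principal component analysis (PCA) as the technique; the card does not name one. The function name `reduce_2d_to_1d` and the sample points are made up for illustration. Two correlated inputs are replaced by a single score per observation:

```python
import math

def reduce_2d_to_1d(points):
    """Project 2-D points onto their first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix (population form, divided by n).
    var_x = sum(dx * dx for dx, _ in centered) / n
    var_y = sum(dy * dy for _, dy in centered) / n
    cov_xy = sum(dx * dy for dx, dy in centered) / n
    # Closed-form direction of greatest variance for a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    scores = [dx * math.cos(angle) + dy * math.sin(angle) for dx, dy in centered]
    # Share of the total variance kept by the single new feature.
    retained = (sum(s * s for s in scores) / n) / (var_x + var_y)
    return scores, retained

# Hypothetical data: the two inputs move almost together.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
scores, retained = reduce_2d_to_1d(points)
print(len(scores), "values instead of", 2 * len(points))
print("share of variance retained:", round(retained, 3))
```

Because the two inputs are nearly redundant, one score per observation keeps almost all of the variation.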

clustering

organizing data by grouping similar observations into categories (clusters). Example: grouping the kids going to soccer and the kids going to piano, then driving each group together. * observations within a cluster are similar to each other and different from observations in other clusters.
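A small sketch of clustering, assuming k-means as the algorithm (the card does not name one). The function `k_means_1d`, the data, and the starting centroids are all hypothetical:

```python
def k_means_1d(points, centroids, n_iterations=10):
    """Group 1-D points around the nearest centroid, then update the centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(n_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to the closest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical observations forming two obvious groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centroids, clusters = k_means_1d(points, centroids=[0.0, 6.0])
print(clusters)
```

Points inside each resulting cluster sit close together, while the two clusters sit far apart, which is the property the card describes.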

deep learning

learning with neural networks that have many hidden layers. can be used for supervised, unsupervised or reinforcement learning; in reinforcement learning, a self-teaching system learns by trial and error, including by interacting with itself

overfitting

the ML model fits the training data too well and is unable to generalize to new data
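A toy illustration of overfitting, under made-up data where the true relationship is roughly y = 2x plus noise. Both models (`fit_memorizer`, `fit_line`) are hypothetical helpers written for this sketch: one memorizes the training sample, the other fits a straight line.

```python
def mean_squared_error(model, xs, ys):
    """Average squared prediction error of a model on a sample."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_memorizer(xs, ys):
    """Overly complex model: returns the stored y of the nearest training x."""
    def predict(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return predict

def fit_line(xs, ys):
    """Simple model: ordinary least-squares straight line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Made-up noisy samples around y = 2x.
train_x, train_y = [1, 2, 3, 4, 5], [3, 3, 7, 7, 11]
test_x, test_y = [1.4, 2.4, 3.4, 4.4], [1.8, 5.8, 5.8, 9.8]

memorizer = fit_memorizer(train_x, train_y)
line = fit_line(train_x, train_y)

memorizer_train = mean_squared_error(memorizer, train_x, train_y)
memorizer_test = mean_squared_error(memorizer, test_x, test_y)
line_train = mean_squared_error(line, train_x, train_y)
line_test = mean_squared_error(line, test_x, test_y)
print("memorizer train/test error:", memorizer_train, memorizer_test)
print("line train/test error:", line_train, line_test)
```

The memorizer has zero error on the training sample but a much larger error on the new sample than the straight line, because it has fitted the noise rather than the pattern.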

describe in-sample error

1) error on the training sample 2) bias error arises from underfitted models

out-of-sample errors

1) prediction errors in the validation and test samples 2) variance error arises from overfitted models

residual errors

results from randomness in the data

complexity reduction and cross-validation are used to address

overfitting

K-fold, a type of cross-validation

1) split the data set into K sections (folds) 2) one fold is held out to validate the model and the other K-1 folds are used to train it; repeat K times so that each fold is held out once 3) this reduces the holdout-sample problem (data held out for validation cannot be used to train the model). * usually K = 5 or 10
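The splitting step can be sketched as below. This only produces the index splits; the name `k_fold_splits` is made up for this example, and fitting and scoring a model on each split is left out.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

# 10 observations, K = 5: every observation is held out exactly once.
splits = list(k_fold_splits(10, 5))
for train, validation in splits:
    print("train:", train, "validate:", validation)
```

In practice the data are usually shuffled before splitting, and the K validation errors are averaged to estimate out-of-sample error.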

random forests are a special case of

bagging (an ensemble method)

Random forest

a large number of uncorrelated trees operating as a group outperforms any of the individual constituent trees (wisdom of crowds)
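The "wisdom of crowds" effect can be checked with a short calculation. This is not a random forest implementation; it is a sketch that assumes each tree is an independent voter with the same chance of being right, which is an idealization of "uncorrelated". The 60% accuracy and the count of 101 trees are made-up numbers.

```python
from math import comb

def majority_vote_accuracy(n_voters, p_correct):
    """Probability that a majority of independent voters is correct.

    Use an odd number of voters so that ties cannot occur.
    """
    needed = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )

single = majority_vote_accuracy(1, 0.6)    # one tree on its own
crowd = majority_vote_accuracy(101, 0.6)   # majority vote of 101 such trees
print("single tree:", round(single, 3))
print("majority of 101 trees:", round(crowd, 3))
```

The gain shrinks as the trees' errors become correlated, which is why the card stresses uncorrelated trees.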

which ML method is good for factor-based investment strategies?

random forest

Are dimension reduction and clustering examples of supervised or unsupervised machine learning?

unsupervised

benefits of a lower-dimensional dataset

reduces overfitting; easier to train and interpret

divisive clustering is a

top-down approach; agglomerative clustering is the bottom-up approach (both are types of hierarchical clustering)

which model to use: “In some cases, their fear of loss seems to increase at an increasing rate when some scenarios are presented.”

the relationship is not linear, so a neural network

if the output (target) of the data is not specified

unsupervised learning