Intro to AI

34 Terms

1

attribute

quality describing an observation - ex: color

2

feature

an attribute paired with a value, ex: color is blue

3

observation/instance

a data point or sample in a dataset

4

training set

a set of instances used to train an ML model

if supervised, it's X and Y; if unsupervised, just X

5

test set

a set of instances used after model training and validation to assess the predictive power of the model

6

random variable

an unknown value that follows a certain probability distribution, ex: X ~ N(0, 1) with mean 0 and variance 1; divided into discrete (countable, uses summation) and continuous (uses integration)

7

sum squared error

SSE = Σᵢ (yᵢ − w·xᵢ)², where ŷᵢ = w·xᵢ is the model's prediction
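A minimal NumPy sketch of computing SSE for a no-intercept model ŷ = w·x (toy values, not from the course):

```python
import numpy as np

# Toy data for a no-intercept model y ≈ w * x (illustrative values)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

w = 2.0  # candidate slope

# SSE = Σ_i (y_i − w·x_i)²
sse = np.sum((y - w * x) ** 2)
print(sse)
```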

8

Four things required for ML

  1. data

  2. model

  3. optimization

  4. goal (an objective/loss function to optimize, ex: SSE)

9

Bias term regression - how do you determine the function

same as with SSE: use least squares, taking the derivative with respect to each parameter to determine w0 (y-intercept) and w1 (slope)
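A minimal sketch of the resulting closed form (setting ∂SSE/∂w0 = 0 and ∂SSE/∂w1 = 0 and solving), with made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.1, 6.9, 9.2])

# ∂SSE/∂w1 = 0 gives the slope; ∂SSE/∂w0 = 0 gives the intercept
w1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
w0 = y.mean() - w1 * x.mean()
print(w0, w1)  # y-intercept and slope
```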

10

Linearity Assumption

forces the predictor to be a linear combination of features (the function can be approximated by linear/constant shape)

11

homoscedasticity

variance of the residual error is assumed to be constant over the entire feature space

12

independence

assumed that each instance is independent of every other instance

  • independence between instance samples - similar to, but distinct from, multicollinearity (which concerns features)

13

fixed features

input features are considered “fixed” - they are treated as “given constants,” not random variables (no pdf or pmf); this implies they are free of measurement error

14

absence of multicollinearity

features of an instance can’t be strongly correlated, because if 2 features are perfectly correlated there are infinitely many solutions and the parameters can’t be solved uniquely

  • independence of features within a vector

15

General Linear Regression

used when a linear function can’t estimate the target and a polynomial (basis expansion) is needed

16

(image-only flashcard)

17

Phi(x)

basis function - common ones are polynomial
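A sketch of building a polynomial design matrix Φ for univariate x; `poly_design_matrix` is a hypothetical helper name, not from the course:

```python
import numpy as np

def poly_design_matrix(x, p):
    """Columns [1, x, x², ..., x^p]: polynomial basis functions applied to x."""
    return np.vander(x, N=p + 1, increasing=True)

x = np.array([0.5, 1.0, 1.5, 2.0])
Phi = poly_design_matrix(x, p=2)  # columns: 1, x, x²
print(Phi)
```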

18

polynomial basis function with p = 1 and φ(x) = x (scalar feature)

GLR reduces to univariate linear regression, d = 1

19

polynomial basis function with p = 1 and φ(x) = x (feature vector)

GLR reduces to multivariate linear regression

20

polynomial basis function with p = 2 and φ(x) = x²

GLR reduces to univariate polynomial regression of order 2, so d = 1

21

Number of parameters of regression GLR

grows exponentially with respect to polynomial order (p) and feature dimension (d): number of parameters = (d + p)! / (d! · p!) = C(d + p, p)
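The count (d + p)! / (d! · p!) is the binomial coefficient C(d + p, p); a quick check in Python:

```python
from math import comb

def num_glr_params(d, p):
    """Number of monomials of degree <= p in d features: C(d + p, p)."""
    return comb(d + p, p)

print(num_glr_params(d=1, p=2))   # 3 parameters: 1, x, x²
print(num_glr_params(d=10, p=3))  # 286 parameters already
```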

22

multivariate

means the feature dimension d > 1, i.e., there is more than one feature in the feature vector x

23

univariate

one feature in feature vector x

24

why is the correlation coefficient matrix important

when designing your feature vector, you know which features may contribute to multicollinearity and weaken your analysis

25

how do you determine if you can use left / right inverse

rank! A left inverse exists when rank(ΦᵀΦ) = k + 1 (full column rank).

A right inverse exists when rank(ΦΦᵀ) = n (full row rank).

don’t use the correlation coefficient matrix for this - that’s for heuristic and further analysis
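A minimal NumPy check of both conditions on a small design matrix (toy values):

```python
import numpy as np

# n = 4 samples, k + 1 = 3 columns (bias, x, x²)
Phi = np.array([[1.0, 0.5, 0.25],
                [1.0, 1.0, 1.00],
                [1.0, 1.5, 2.25],
                [1.0, 2.0, 4.00]])
n, k_plus_1 = Phi.shape

# Left inverse exists iff rank(ΦᵀΦ) = k + 1 (full column rank)
print(np.linalg.matrix_rank(Phi.T @ Phi) == k_plus_1)  # True
# Right inverse exists iff rank(ΦΦᵀ) = n (full row rank)
print(np.linalg.matrix_rank(Phi @ Phi.T) == n)         # False here: n > k + 1
```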

26

what does a singular matrix imply

there are infinitely many solutions; the GLR model can still be solved, but the parameters are not unique

27

causes of no left inverse

multicollinearity, or perfectly correlated features (problem between features)

28

causes of no right inverse

dependent samples, or perfect correlations among SAMPLES (problem between samples)

29

rank deficient

the rank is less than it would be if the matrix’s rows/columns were independent; a rank-deficient matrix is not invertible

30

pros and cons of pseudo inverse solution

+: fast, analytic (closed-form) solution

-: expensive if both k and n are large; still subject to multicollinearity issues
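A sketch of the closed-form solution via NumPy's pseudo-inverse (toy data):

```python
import numpy as np

# Design matrix with a bias column and one feature (illustrative values)
Phi = np.array([[1.0, 0.5],
                [1.0, 1.0],
                [1.0, 1.5],
                [1.0, 2.0]])
y = np.array([1.9, 3.1, 3.9, 5.2])

# Analytic solution: w = Φ⁺ y (pinv covers both left- and right-inverse cases)
w = np.linalg.pinv(Phi) @ y
print(w)  # estimated [w0, w1]
```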

31

row rank issue

lack of independence between observations (samples)

32

col rank issue

multicollinearity between features

33

calculating the correlation coefficient matrix

don’t include the bias column (it’s constant, so correlations with it are undefined)
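A sketch assuming np.corrcoef as the calculation, dropping the constant bias column first:

```python
import numpy as np

Phi = np.array([[1.0, 0.5, 0.25],
                [1.0, 1.0, 1.00],
                [1.0, 1.5, 2.25],
                [1.0, 2.0, 4.00]])  # first column is the bias

features = Phi[:, 1:]  # drop the constant bias column first
R = np.corrcoef(features, rowvar=False)  # correlations between feature columns
print(R)
```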

34

calculating the rank
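A minimal NumPy sketch of computing the rank of a design matrix; the toy matrix below is deliberately rank deficient:

```python
import numpy as np

Phi = np.array([[1.0, 2.0],
                [2.0, 4.0],
                [3.0, 6.0]])  # second column = 2 × first: perfectly correlated

print(np.linalg.matrix_rank(Phi))  # 1, not 2: rank deficient
```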