Gradient Descent

Last updated 1:29 PM on 3/4/26

11 Terms

1. Gradient Descent Idea

  1. sample an input and its target

  2. measure the error

  3. adapt the model parameters, stepping in the direction of lower error

  4. repeat until the error is low
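The four steps above can be sketched as a minimal training loop. This is illustrative only: a single-parameter model y ≈ w·x, squared error, and a finite-difference gradient estimate stand in for whatever model and loss are actually used.

```python
import random

random.seed(0)

# Toy data generated from y = 3x, so the loop should learn w close to 3.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]

w = 0.0    # single adaptable parameter
lr = 0.1   # learning rate

def loss(w, x, y):
    return (w * x - y) ** 2   # squared error on one sample

for step in range(200):
    x, y = random.choice(data)                     # 1. sample input and target
    err = loss(w, x, y)                            # 2. measure the error
    eps = 1e-6                                     # 3. estimate the gradient
    grad = (loss(w + eps, x, y) - err) / eps       #    (finite difference)
    w -= lr * grad                                 #    step toward lower error
                                                   # 4. repeat until error is low
print(round(w, 2))
```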

2. Parameter Update

(flashcard image)
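The card's image is not recoverable as text; the standard gradient-descent parameter update it presumably showed is:

```latex
\theta \leftarrow \theta - \eta \, \nabla_{\theta} \mathcal{L}(\theta)
```

where 𝜂 is the learning rate and ∇𝜃𝓛 is the gradient of the loss with respect to the parameters.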
3. Gradient

(flashcard image)
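The card's image is not recoverable as text; the standard definition it presumably showed: for a loss 𝓛 with parameters 𝜃 ∈ ℝⁿ, the gradient is the vector of partial derivatives,

```latex
\nabla_{\theta} \mathcal{L} =
  \left( \frac{\partial \mathcal{L}}{\partial \theta_1},
         \dots,
         \frac{\partial \mathcal{L}}{\partial \theta_n} \right)
```

It points in the direction of steepest ascent, which is why gradient descent steps against it.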
4. Dataset

  • each element is an n-dimensional vector x

  • the prediction target y is a tensor, which we call the ground truth

5. Model

  • The model has a set of adaptable parameters, 𝜽 ∈ 𝚯, generally real numbers: 𝜽 ∈ ℝ.

  • We write: a model with parameters 𝜃 is 𝑓𝜃: 𝑋 → 𝑌

  • the parameters control the behaviour of the model
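A tiny sketch of the notation 𝑓𝜃: 𝑋 → 𝑌, using an illustrative linear model (the names `f` and `theta` are assumptions, not from the card):

```python
# A model f_theta: X -> Y whose behaviour is controlled by its parameters.
def f(theta, x):
    w, b = theta              # adaptable parameters: real numbers
    return w * x + b          # a simple linear model as an example

theta = (2.0, 1.0)            # changing theta changes the model's behaviour
print(f(theta, 3.0))
```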

6. Learning Algorithm

  • the parameters are adapted to reduce a loss function

  • goal: minimize the loss function

  • low loss = low error = high accuracy

7. Linear Regression

Goal: minimize the difference between y (the actual value) and ŷ (the prediction)

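A sketch of linear regression trained by gradient descent on the mean squared error between y and ŷ. The data and hyperparameters here are illustrative assumptions:

```python
# Fit y_hat = w*x + b by full-batch gradient descent on the MSE.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated from y = 2x + 1

w, b, lr = 0.0, 0.0, 0.05
n = len(xs)

for _ in range(2000):
    # Analytic gradients of MSE = (1/n) * sum((w*x + b - y)^2)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))    # close to the true 2 and 1
```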
8. Convex Loss Function

  • a single global minimum

  • can often be optimized much faster than with gradient descent (e.g., with a closed-form solution)

9. Learning Rate

  • determines how fast we adapt the parameters

  • high value = faster learning = risk of overshooting the minimum

  • low value = slower learning = approaches the minimum more precisely
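The trade-off can be seen on the simplest convex loss, f(w) = w² (gradient 2w); the learning-rate values here are chosen just to demonstrate the two regimes:

```python
# Gradient descent on f(w) = w^2, whose gradient is 2w.
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(descend(0.1))   # small step: shrinks steadily toward the minimum at 0
print(descend(1.1))   # too large: each step overshoots and |w| grows
```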

10. Logistic Regression

  • using regression for classification

  • idea: encode the probability of belonging to a class as a numeric value

  • fit an (inverse) logistic function (a.k.a. sigmoid) to the data

  • based on the predicted value, we assign a class
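A minimal sketch of the last two bullets: squash a linear score through the sigmoid to get a probability, then threshold it to assign a class. The parameter values are hypothetical stand-ins for a trained model:

```python
import math

def sigmoid(z):
    # logistic function: maps any real score into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    w, b = theta
    p = sigmoid(w * x + b)         # probability of the positive class
    return 1 if p >= 0.5 else 0    # assign a class from the predicted value

theta = (2.0, -1.0)                # illustrative, "already trained" parameters
print(predict(theta, 2.0), predict(theta, -1.0))
```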

11. Gradient Descent Use

  • can be applied to any differentiable model

  • used to train linear/logistic regression models

  • SVMs

  • neural networks

  • large language models