My CSE 40 Machine Learning Study Guide


49 Terms

1
New cards

Deductive Reasoning

Reasoning in which a conclusion is guaranteed. It begins by introducing a general guideline or principle, and then, based on that, leads to a clear and specific conclusion. If the original assertions are true, then the conclusion must be true. With this reasoning we can make observations and expand on their implications, but we cannot make predictions about the future or otherwise unobserved phenomena.

2
New cards

Inductive Reasoning

This reasoning begins with observations that are specific and limited in scope and proceeds to a generalized conclusion that is likely, but not certain, in light of accumulated evidence. Much scientific research is carried out by this reasoning: gathering evidence, seeking patterns, and forming a hypothesis or theory to explain what is seen. While this reasoning cannot yield an absolutely certain conclusion, it can actually increase human knowledge (it is ampliative). It can make predictions about future events or as-yet unobserved phenomena.

3
New cards

Abductive Reasoning

This reasoning typically begins with an incomplete set of observations and proceeds to the likeliest possible explanation for the set. "Taking your best shot"

4
New cards

What type of reasoning is this example:

All dogs have ears, golden retrievers are dogs therefore golden retrievers have ears.

Deductive Reasoning

5
New cards

What type of reasoning is this example:

All swans we've seen are white, therefore all swans are white.

Inductive Reasoning

6
New cards

What type of reasoning is this example:

Sarah goes to a local park every day during her lunch break. Over the course of several weeks, she notices the following:

-On Monday, she sees that all the birds she observes near the pond are ducks.

-On Tuesday, she again notices that every bird she sees near the pond is a duck.

-On Wednesday and Thursday, the pattern continues—only ducks are present near the pond.

She concludes that all birds that hang out near the pond in this park are ducks.

Inductive Reasoning

7
New cards

What type of reasoning is this example:

All humans are mortal and Socrates is a human. Therefore Socrates is mortal.

Deductive Reasoning

8
New cards

What type of reasoning is this example:

There is heavy traffic so there is probably an accident ahead.

Abductive Reasoning

9
New cards

What type of reasoning is this example:

eStudent has had coughing symptoms, breathing problems, and is generally ill. Therefore, eStudent is likely COVID-19 positive.

Abductive Reasoning

10
New cards

What type of reasoning is Machine Learning?

Inductive Reasoning

11
New cards

What is Machine Learning?

When we say machines "learn," what we really mean is that they find a math formula that works well with a certain set of data (the "training data"). This formula helps the machine produce the right results. And, if we give the machine new data that's similar to the training data, the formula should still give us the right results. But remember, machines aren't really learning like humans do. They're just using math to make predictions.

12
New cards

Supervised Learning

In this learning style, the algorithm is trained on a labeled dataset. This means that each example in the training dataset is paired with the correct output. Training consists of the algorithm making predictions and then being corrected by the labeled data whenever it's wrong. It is used for problems such as classification, where the output variable is a category, such as "spam" or "not spam", "fraudulent" or "valid", etc.

13
New cards

Unsupervised Learning

In this learning style, the algorithm is given input data without any explicit output or labels. The goal is to discover patterns, structures, or relationships within the data. It's used when one wants to derive structure from data without having predefined labels. If you had data about customer shopping habits, this learning algorithm could group customers with similar habits without knowing any predefined categories.

14
New cards

Reinforcement Learning

In this learning style, an algorithm (often called an "agent") interacts with an environment and learns to make decisions by receiving feedback in the form of rewards or penalties. It's not provided with the correct answer but must discover it by trying out different actions and observing the outcomes. It's used in situations where an agent needs to learn how to behave in an environment by performing certain actions and receiving rewards as feedback. Example: training a computer program to play a game like chess or Go. The program makes a move (action), the game progresses (new state), and the program receives a reward or penalty depending on the outcome of the move.

15
New cards

What type of learning is this example:

eProf has information about students in her class from previous

quarters - their attendance, their grades on HO & WA, and scores

on quizzes, and whether they got an A on the final.

For students this quarter, given attendance, grades on HO & WA,

and scores on quizzes, she would like to predict whether a student

will get an A on the final

Supervised Learning

16
New cards

What type of learning is this example:

eProf has examples of research papers she has written with

students from her research group, including features such as how

novel the theoretical results are, how exciting the experimental

results are, how good the title is, how well-written the paper is,

and labels, whether or not the paper was accepted or rejected.

She would like to predict whether a new paper will be accepted or

rejected.

Supervised Learning

17
New cards

What type of learning is this example:

eProf did a survey to ask about the interests and background of

students taking her class.

She is trying to find patterns in their background and interests (so

she can make more compelling examples!)

She would like to find groups (or clusters) of interests

Unsupervised Learning

18
New cards

What type of learning is this example:

eProf gets evaluated based on the SETS at the end of the quarter.

Throughout the quarter, she can take actions (give lectures, make

assignments and quizzes), and gets indirect feedback such as how

many students fall asleep, student comments on EdDiscussion,

etc.

She would like to optimize her actions so that her SETS are good

(and her students learn, :>).

Reinforcement Learning

19
New cards

The focus of this class will be:

Supervised Learning

20
New cards

Python Pandas

In 2008, developer Wes McKinney started developing pandas when he needed a high-performance, flexible tool for data analysis.

With this tool, we can accomplish typical steps in the processing and analysis of data, regardless of the origin of data — load, prepare, manipulate, model, and analyze

21
New cards

Supervised Machine Learning: Classification

Output one of a set of discrete labels. E.g., yes/no, low/medium/high, etc.

22
New cards

Supervised Machine Learning: Regression

Output a real number. E.g., number between (-inf,+inf), (0,+inf), (0.0,1.0), etc.

23
New cards

Supervised Machine Learning: Ranking

Output a ranking, either ordinal (0.0,1.0), or pairwise.

24
New cards

ML Input: Feature Vectors

A feature vector is simply a vector (represented as an array) of features

describing each example. E.g., salary & education, debt, etc.

Notation:

1) x is the input vector, and x_i is the ith feature

2) x_i is the ith input vector, and x_ij is the jth feature of the ith input vector

25
New cards

ML Input: Label

A correct label (desired output) associated with input feature vector.

Notation:

1) if x is the input vector, y is the output label

2) if x_i is the ith input vector, y_i is its output label

26
New cards

ML Output: Hypothesis

A hypothesis is a function that takes an input feature vector and outputs a

(predicted) label.

Notation:

1) h(x) is a hypothesis function applied to the input vector x

2) H is the set of all hypothesis functions being considered, called the

hypothesis space
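
As a rough illustration of this card, a hypothesis can be written as an ordinary function from a feature vector to a label. The features (salary, debt), the rule, and the label names below are made-up assumptions for the sketch, not something from the course.

```python
from typing import Sequence


def h(x: Sequence[float]) -> str:
    """A hypothetical hypothesis: map a feature vector to a predicted label.

    Assumption for this sketch: x[0] is salary and x[1] is debt.
    """
    salary, debt = x[0], x[1]
    # Illustrative rule only: approve when salary is more than twice the debt.
    return "approve" if salary > 2 * debt else "deny"


x = [50_000.0, 10_000.0]  # one input feature vector
prediction = h(x)         # the predicted label h(x)
print(prediction)
```

A different rule (for example, a different multiplier) would be a different member of the hypothesis space H.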

27
New cards

Empirical Risk Minimization

A principle in machine learning where the idea is to minimize the error or "risk" on the training data.

Here's how it works:

Training Data: You have a dataset that you use to "train" or teach your machine learning model.

Risk or Error: You need a way to measure how well your model is doing. This is where the "risk" or error comes in. It's a way of quantifying how far off your model's predictions are from the true values.

Minimization: The goal of ERM is to find the best model that minimizes this error or risk on the training data.
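
The three steps above can be sketched in a few lines. This is a minimal sketch, assuming a tiny made-up training set, a hypothesis space of simple threshold rules, and the fraction of mistakes as the risk; none of these specifics come from the course.

```python
# Toy training data: (feature, label) pairs. The values are invented.
training_data = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]


def make_threshold_hypothesis(t):
    """Return a hypothesis that predicts 1 when x >= t, else 0."""
    return lambda x: 1 if x >= t else 0


def empirical_risk(h, data):
    """Fraction of training examples that h gets wrong."""
    mistakes = sum(1 for x, y in data if h(x) != y)
    return mistakes / len(data)


# Hypothesis space: one threshold rule per candidate threshold (assumed values).
candidate_thresholds = [0.5, 1.5, 2.5, 3.5, 4.5]

# ERM: pick the hypothesis with the smallest risk on the training data.
best_threshold = min(
    candidate_thresholds,
    key=lambda t: empirical_risk(make_threshold_hypothesis(t), training_data),
)
best_risk = empirical_risk(make_threshold_hypothesis(best_threshold), training_data)
print(best_threshold, best_risk)
```

Here the threshold 2.5 separates the two labels perfectly, so it has zero mistakes on the training data. Low training risk does not by itself guarantee good results on new data.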

28
New cards

DataFrames

While data can take many forms, tabular data (aka a table) is the most common.

In this setting, data is structured into rows representing a single entry

Properties or attributes of these entries are values in the row, and titled columns describe the properties.

Index — a unique identifier for each data entry
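
A small sketch of these ideas in pandas, assuming pandas is installed. The column names, values, and student IDs are invented for illustration.

```python
import pandas as pd

# Each row is one entry (a student); each titled column is a property.
students = pd.DataFrame(
    {
        "attendance": [0.95, 0.60, 0.80],
        "quiz_score": [88, 52, 75],
        "got_a": [True, False, False],
    },
    index=["s001", "s002", "s003"],  # the index uniquely identifies each entry
)

row = students.loc["s002"]       # look up one entry by its index label
column = students["quiz_score"]  # select one property for all entries
n_rows, n_cols = students.shape  # table dimensions
print(students)
```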

29
New cards

Probability

Compares the number of successes to the total number of attempts made.

30
New cards

Odds

Compares the number of successes to the number of failures.
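
To see how odds differ from probability, here is a minimal sketch with made-up counts (3 successes and 7 failures). Exact fractions are used so the two quantities are easy to compare.

```python
from fractions import Fraction

successes, failures = 3, 7  # invented counts for illustration

# Probability: successes out of all attempts.
probability = Fraction(successes, successes + failures)  # 3/10

# Odds: successes compared to failures.
odds = Fraction(successes, failures)  # 3/7

# The two are related by odds = p / (1 - p).
odds_from_probability = probability / (1 - probability)
print(probability, odds, odds_from_probability)
```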

31
New cards

Loss Function

The loss function measures the difference between the output of the hypothesis, h(x), and the desired output y. You can think of it as measuring the error in the hypothesis.

For now, we'll just consider the simplest loss function, the number of mistakes.

If we're correct, then the loss is 0.

If we're incorrect, then the loss is 1.

This is also often called 0/1 loss (for obvious reasons, :>!)
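
The 0/1 loss described here can be sketched directly. The predictions and labels below are made-up examples.

```python
def zero_one_loss(prediction, label):
    """Return 0 when the prediction matches the label, 1 otherwise."""
    return 0 if prediction == label else 1


# Invented outputs of some hypothesis h(x) and the desired outputs y.
predictions = ["spam", "not spam", "spam", "spam"]
labels = ["spam", "spam", "spam", "not spam"]

losses = [zero_one_loss(p, y) for p, y in zip(predictions, labels)]
total_mistakes = sum(losses)  # total 0/1 loss = number of mistakes
print(losses, total_mistakes)
```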

32
New cards

Sample Space

A sample space in machine learning is the set of all possible inputs to a machine learning model. It is analogous to the set of all possible outcomes of a random experiment. For example, if we are building a machine learning model to classify images of cats and dogs, our sample space would be the set of all possible images of cats and dogs.

33
New cards

Accuracy

The proportion of correct predictions.

Accuracy is the most common metric for evaluating machine learning models. It is calculated as the percentage of all predictions that are correct:

Accuracy = (TP + TN) / (TP + FP + TN + FN)

34
New cards

Precision

The proportion of positive predictions that are actually correct.

Measures how precise the model's positive predictions are. It is calculated as the percentage of positive predictions that are actually correct:

Precision = TP / (TP + FP)

35
New cards

Recall

The proportion of actual positive examples that are correctly predicted as positive.

Recall measures how well the model identifies all of the instances of a particular class. It is calculated as the percentage of actual positive instances that are correctly predicted:

Recall = TP / (TP + FN)
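
As a quick worked example, accuracy, precision, and recall can all be computed from the same four counts. The counts below are invented for illustration.

```python
# Made-up confusion-matrix counts.
tp, fp, tn, fn = 40, 10, 45, 5

accuracy = (tp + tn) / (tp + fp + tn + fn)  # 85 correct out of 100 predictions
precision = tp / (tp + fp)                  # 40 of the 50 positive predictions
recall = tp / (tp + fn)                     # 40 of the 45 actual positives

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Note that precision and recall share the numerator TP and differ only in the denominator: FP penalizes precision, FN penalizes recall.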

36
New cards

True Positive (TP)

The model correctly predicts that an instance belongs to a particular class.

37
New cards

True Negative (TN)

The model correctly predicts that an instance does not belong to a particular class.

38
New cards

False Positive (FP)

The model incorrectly predicts that an instance belongs to a particular class when it does not.

39
New cards

False Negative (FN)

The model incorrectly predicts that an instance does not belong to a particular class when it does.
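
The four outcomes (TP, TN, FP, FN) can be counted by comparing predictions with true labels. This is a minimal sketch; the helper name `confusion_counts` and the sample labels are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (TP, FP, TN, FN), treating `positive` as the class of interest."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


y_true = [1, 0, 1, 1, 0, 0]  # invented true labels
y_pred = [1, 1, 0, 1, 0, 0]  # invented model predictions
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)
```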

40
New cards

Joint Distribution

Describes the behavior of two or more random variables simultaneously.

41
New cards

Event

Is a subset of outcomes that can occur in an experiment.

Ex: Getting an even number when rolling a die, i.e. the outcomes {2, 4, 6}.

42
New cards

Conditional Probability

The probability of an event occurring given that another event has already occurred. It quantifies the likelihood of event X happening under the condition that event Y has taken place, written P(X|Y).

43
New cards

The Product Rule

Given conditional probability, we can get joint distribution by simply multiplying:

P(X,Y) = P(X|Y)*P(Y)
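
A small numeric check of the product rule, using made-up probabilities: Y is "it rains" and X is "someone carries an umbrella". Both events and their numbers are assumptions for this sketch.

```python
from fractions import Fraction

p_y = Fraction(1, 4)          # P(Y): assumed probability of rain
p_x_given_y = Fraction(4, 5)  # P(X|Y): assumed probability of an umbrella given rain

# Product rule: P(X,Y) = P(X|Y) * P(Y)
p_x_and_y = p_x_given_y * p_y
print(p_x_and_y)
```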

44
New cards

Bayes Rule

Bayes' theorem is a mathematical formula that describes the probability of an event occurring, given the knowledge of whether or not another event has occurred. It is based on the idea that the probability of an event can be updated based on new information.

P(A|B) = P(B|A) * P(A) / P(B)
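
Here is a hedged worked example of Bayes' rule in the form P(A|B) = P(B|A) * P(A) / P(B), where A is "has the disease" and B is "tests positive". All of the numbers are invented for illustration.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


p_disease = 0.01             # P(A): assumed prior
p_pos_given_disease = 0.9    # P(B|A): assumed true positive rate
p_pos_given_healthy = 0.05   # P(B|not A): assumed false positive rate

# P(B) by summing over both cases (disease and no disease).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

p_disease_given_pos = bayes(p_pos_given_disease, p_disease, p_pos)
print(round(p_disease_given_pos, 4))
```

Even with a fairly accurate test, the updated probability stays modest (about 15% under these assumed numbers) because the prior P(A) is small.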

45
New cards

Permutation

An arrangement, or listing, of objects in which order is important.

n!/(n-k)!
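
The formula can be checked three ways for a small case. This sketch assumes n = 5 objects arranged k = 3 at a time and uses `math.perm` (available in Python 3.8 and later).

```python
import math
from itertools import permutations

n, k = 5, 3  # assumed example sizes

by_formula = math.factorial(n) // math.factorial(n - k)  # n!/(n-k)!
by_library = math.perm(n, k)                             # built-in count
by_enumeration = len(list(permutations(range(n), k)))    # list every arrangement

print(by_formula, by_library, by_enumeration)
```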

46
New cards

Combinations

Order doesn't matter

n!/((n-k)! * k!)
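
A quick check of the combinations count n!/((n-k)! * k!), assuming n = 5 and k = 3 and using `math.comb` (available in Python 3.8 and later).

```python
import math
from itertools import combinations

n, k = 5, 3  # assumed example sizes

# k! sits in the denominator together with (n-k)!.
by_formula = math.factorial(n) // (math.factorial(n - k) * math.factorial(k))
by_library = math.comb(n, k)                           # built-in count
by_enumeration = len(list(combinations(range(n), k)))  # list every selection

print(by_formula, by_library, by_enumeration)
```

Dividing the permutation count by k! removes the orderings of each selection, which is why the result is smaller than the number of permutations for the same n and k.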

47
New cards

Independence

Two events are said to be independent if the occurrence of one event does not affect the probability of the occurrence of the other event.

Ex:

-Tossing a coin and rolling a die

-Tossing a coin today and tossing a coin tomorrow.

If two events are independent they satisfy P(A|B) = P(A)

and

P(B|A) = P(B)

48
New cards

Joint Probability for Two Independent Events

P(A,B) = P(A) * P(B)
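
This rule can be verified by enumerating a coin toss together with a die roll, assuming both are fair. The helper `probability` is a made-up name for this sketch.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (coin, die) outcomes.
outcomes = list(product(["H", "T"], range(1, 7)))


def probability(event):
    """Probability of an event, given as a predicate over outcomes."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


p_a = probability(lambda o: o[0] == "H")                # A: coin shows heads
p_b = probability(lambda o: o[1] == 6)                  # B: die shows six
p_ab = probability(lambda o: o[0] == "H" and o[1] == 6)  # A and B together

print(p_a, p_b, p_ab, p_ab == p_a * p_b)
```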

49
New cards

Data Cleaning

Data cleaning is the process of identifying, deleting, and/or replacing

inconsistent or incorrect information from the data.