Neural Networks and Classification
Introduction to Neural Networks
- Chico will be covering neural networks for the week.
- The focus is on understanding how neural networks work, starting with simple building blocks.
- Later lectures will cover putting them together for applications like computer vision.
Classification vs. Regression
- Classification is a task that predicts a discrete category, unlike regression, which predicts a continuous value.
- Logistic regression is introduced as a classification technique.
- The goal is to understand the perceptron, which is the basic building block of neural networks.
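To make the perceptron concrete, here is a minimal sketch in plain Python (illustrative only, not the lecture's code): it learns a linear decision boundary with the classic perceptron update rule, trained here on the logical AND function, which is linearly separable.

```python
# Minimal perceptron sketch: one unit with two inputs, a bias, and a
# hard threshold. The update rule nudges the weights only on mistakes.

def perceptron_train(data, epochs=20, lr=0.1):
    """data: list of ((x1, x2), label) pairs with label in {0, 1}."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            pred = 1 if (w1 * x1 + w2 * x2 + b) > 0 else 0
            err = y - pred              # 0 if correct, +1/-1 if wrong
            w1 += lr * err * x1
            w2 += lr * err * x2
            b += lr * err
    return (w1, w2, b)

def perceptron_predict(weights, x):
    w1, w2, b = weights
    return 1 if (w1 * x[0] + w2 * x[1] + b) > 0 else 0

# Logical AND is linearly separable, so the perceptron converges on it.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights = perceptron_train(and_data)
```

Stacking many such units in layers, with smooth activations instead of the hard threshold, is what later lectures build toward.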
Train/Test Splits
- Train/test splits are essential when training algorithms.
- The idea is to study a portion of the data (e.g., 80%) and test on the remaining data (e.g., 20%).
- This approach gives an honest measure of what the algorithm has actually learned, since it is evaluated on data it has never seen.
- Example: An algorithm receives coordinates (x, y) of data points and predicts whether they are red or blue, which is a binary classification task.
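The 80/20 split above can be sketched in a few lines of plain Python (in practice a library helper such as scikit-learn's `train_test_split` does the same job); the red/blue labels here are toy data made up for illustration.

```python
# Sketch of an 80/20 train/test split: shuffle the indices so the split
# is unbiased, then hold out the first 20% for testing.
import random

def train_test_split(points, labels, test_frac=0.2, seed=0):
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(points) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([points[i] for i in train_idx], [labels[i] for i in train_idx],
            [points[i] for i in test_idx], [labels[i] for i in test_idx])

# Toy binary-classification data: (x, y) coordinates with a color label.
points = [(i, i % 3) for i in range(10)]
labels = ["red" if x < 5 else "blue" for x, _ in points]
X_train, y_train, X_test, y_test = train_test_split(points, labels)
```

The key property is that the test points never appear in the training set, so test accuracy reflects generalization rather than memorization.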
Regression vs. Classification: Predicting Values & Categories
- Regression: Predicts continuous values, such as house prices. Ordinal values (first, second, third) sit in between: they are categories, but with a natural integer ordering.
- Classification: Predicts categories or classes. For example:
- Is a face in a picture (yes/no)?
- Multi-class: Given a photo, is it of me, my dad, or my mom?
- Predicting temperature as hot or cold.
- Both classification and regression are supervised learning tasks where there is a right answer.
Practical Applications of Classification
- Spam classifiers in email inboxes are trained on input data with labels (spam/not spam).
- Surprisingly, a simple algorithm that looks at words in an email (ignoring order and grammar) can be quite effective.
- Classification can be used with other data; for example, predicting house prices (regression) versus predicting the city where a house is located (classification).
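The "bag of words" idea above can be sketched as a toy spam scorer (a hedged sketch with made-up training messages; real filters typically use naive Bayes or logistic regression over word counts):

```python
# Toy bag-of-words spam classifier: count how often each word appears in
# spam vs. non-spam training messages, then score a new message by which
# class's vocabulary it overlaps with more. Word order and grammar are
# ignored entirely, as described in the notes.
from collections import Counter

spam_messages = ["win money now", "free money offer", "claim your free prize"]
ham_messages = ["meeting at noon", "project notes attached", "lunch tomorrow"]

spam_counts = Counter(w for msg in spam_messages for w in msg.split())
ham_counts = Counter(w for msg in ham_messages for w in msg.split())

def classify(message):
    words = message.split()
    spam_score = sum(spam_counts[w] for w in words)
    ham_score = sum(ham_counts[w] for w in words)
    return "spam" if spam_score > ham_score else "not spam"
```

Even this crude word-overlap rule captures the surprising effectiveness of ignoring order: the presence of words like "free" and "money" carries most of the signal.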
Classifier Output and Probabilities
- Classifiers can output probabilities indicating the likelihood of each class.
- Example: A dog breed classifier might output probabilities for different breeds.
- For decision-making tasks (spam/not spam), only the class with the highest probability matters.
- Probabilities can be useful to measure the certainty of the model.
- Some classifiers output probabilities, while others do not.
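A common way a classifier produces probabilities is to push its raw scores through a softmax, then pick the top class for a decision. The breed names and score values below are hypothetical, purely for illustration:

```python
# Turning raw classifier scores into probabilities with a softmax, then
# taking the argmax for a decision. Scores and class names are made up.
import math

def softmax(scores):
    # Exponentiate each score and normalize so the outputs sum to 1.
    exps = {cls: math.exp(s) for cls, s in scores.items()}
    total = sum(exps.values())
    return {cls: v / total for cls, v in exps.items()}

raw_scores = {"beagle": 2.0, "poodle": 0.5, "terrier": 0.1}
probs = softmax(raw_scores)
decision = max(probs, key=probs.get)  # only the top class matters for a decision
```

The full probability distribution is still useful when you care about the model's certainty: a 0.9 top probability is a very different situation from a 0.4 one.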
Limitations of Classifiers
- Classifiers work only with the data they are given.
- Example: Image-based classifiers may struggle with visually confusing inputs (e.g., telling a chihuahua from a muffin), and can also be fooled by deliberately crafted adversarial examples.
- These examples highlight the limitations of machine learning models.
- The classifiers lack world knowledge and common sense that humans possess.
- Classifiers trained on limited data or specific contexts may not generalize well.
Generalization and Context
- Generalization is the ability to make accurate predictions outside the training data.
- Large pre-trained language models (like GPT) are trained on billions of documents, giving them more general knowledge.
- Models trained on narrow datasets may fail when applied out of context.
- Example: A tank-detection model trained only on daytime images failed to detect tanks at night.
- Generalization has limits; models specialized for one task may not perform well on others.
Supervised Learning and Ground Truth
- Supervised learning involves giving the model a task and a correct answer to learn from.
- The term "ground truth" refers to the known correct label used for training and for evaluating predictions.