Notes on Machine Learning
Supervised Learning
Supervised learning is the most widely used type of machine learning. It learns a mapping from an input A to an output B, i.e., input-to-output mappings.
Examples:
Email (input) -> Spam/Not Spam (output)
Audio clip (input) -> Text transcript (output)
English (input) -> Another language (output)
Ad information (input) -> Will user click? (output)
Image/Sensor data (input) -> Position of other cars (output)
Manufactured product image (input) -> Defects? (output)
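The A-to-B idea above can be sketched in code. Below is a minimal toy perceptron that learns the spam example from labeled (input, output) pairs; the vocabulary, example emails, and labels are invented purely for illustration, and real spam filters use far richer features and models.

```python
# Toy supervised learning: learn an input -> output (A -> B) mapping.
# Inputs are bag-of-words counts; outputs are labels (1 = spam, 0 = not spam).

def featurize(text, vocab):
    """Turn a text into a count vector over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def train_perceptron(examples, vocab, epochs=10):
    """Fit weights so the sign of w.x + b matches the labels."""
    w = [0.0] * len(vocab)
    b = 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = featurize(text, vocab)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = label - pred  # 0 when correct; +/-1 when wrong
            w = [wi + err * xi for wi, xi in zip(w, x)]
            b += err
    return w, b

def predict(text, vocab, w, b):
    x = featurize(text, vocab)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Hypothetical labeled training data (the supervision).
vocab = ["free", "winner", "money", "meeting", "report"]
examples = [
    ("free money winner", 1),
    ("claim free money", 1),
    ("quarterly report meeting", 0),
    ("meeting notes report", 0),
]
w, b = train_perceptron(examples, vocab)
print(predict("free money now", vocab, w, b))          # -> 1 (spam)
print(predict("report for the meeting", vocab, w, b))  # -> 0 (not spam)
```

Every example in the list above (audio -> transcript, image -> defects, etc.) has this same shape: labeled pairs in, a learned input-to-output function out.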
Large Language Models (LLMs)
LLMs use supervised learning to predict the next word in a sequence. For instance, the sentence "My favorite drink is lychee bubble tea" becomes multiple data points: "My favorite drink" -> "is", "My favorite drink is" -> "lychee", and so on.
LLMs are trained on massive datasets (hundreds of billions or trillions of words) to generate text in response to a prompt.
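The sentence-to-data-points expansion described above can be sketched directly. One caveat: real LLMs split text into subword tokens rather than whitespace words, so this is a simplification of the same idea.

```python
# Sketch: expand one sentence into multiple supervised
# (context, next-word) training pairs, as in LLM pretraining.
# Simplification: splits on whitespace instead of subword tokens.

def next_word_pairs(sentence):
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

pairs = next_word_pairs("My favorite drink is lychee bubble tea")
for context, target in pairs:
    print(f"{context!r} -> {target!r}")
# Includes the pairs from the notes:
#   'My favorite drink' -> 'is'
#   'My favorite drink is' -> 'lychee'
```

One seven-word sentence yields six training pairs, which is how trillions of words of raw text become an enormous supervised dataset without manual labeling.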
Rise of Supervised Learning
Traditional AI algorithms plateau: beyond a point, feeding them more data yields little additional performance. Neural networks and deep learning, however, scale much better with data.
Small neural net: performance improves with data, then levels off.
Medium neural net: improves further and reaches a higher plateau.
Large neural net: performance keeps improving as data grows.
To achieve high performance, you need a lot of data and the ability to train a large neural network.
Fast computers and specialized processors (GPUs) have enabled more companies to train large neural nets and derive business value.
Scaling data and model size has driven breakthroughs in generative AI systems, including LLMs.