Advanced Topics in Computer Science: Advances in AI

Advanced Topics in Computer Science/Business Computing (CS3001/3606)

Advances in AI: Large Language Models (LLMs) for Question Answering (QA) & Text Generation

David Bell

News

Task 1

Quizzes & Discussion

Introduction

Overview of Topics

Language Models
Large Language Models
Fine-Tuning Large Language Models
Prompt Engineering
Retrieval Augmented Generation
LLMs in Fintech & Healthcare
Evaluation

Concepts

Language Models: Statistical models predicting the next word in a sentence based on prior words.
Large Language Models (LLMs): Advanced language models with millions/billions of parameters capable of understanding and generating human-like text.
Embeddings: Continuous vector representations of words or phrases that capture semantic meaning.
Fine-tuning: Adaptation of pre-trained models to specific tasks by continued training on smaller, task-specific datasets.
Prompt Engineering: Designing inputs to effectively guide the response generation of LLMs.
Retrieval Augmented Generation (RAG): Technique that enhances LLM outputs by integrating external factual sources.
Zero-/Single-/n-shot Prompting: Prompting without examples (zero-shot), with one example (single-shot, also called one-shot), or with multiple examples (n-shot, also called few-shot).
Chaining: A technique that connects multiple prompts/results for complex tasks.
Transformers: Architecture for LLMs leveraging self-attention to process input data.
Attention: Mechanism within Transformers that weights each part of the input sequence by its relevance to the token currently being processed.
Quantization: Reducing a model's precision to lower resource usage while maintaining acceptable performance.
LLM Evaluation: Methods for assessing the effectiveness of LLM outputs, e.g., ROUGE, BLEU.
Vector Databases: Specialized databases optimized for storing and querying vector representations.
Distillation: A process of creating smaller models that retain the knowledge of a larger model.
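
As a minimal sketch of the zero-/single-/n-shot distinction above, the following Python assembles a prompt from an instruction and a list of worked examples. The function name build_prompt, the Input/Output layout, and the sentiment examples are illustrative assumptions, not a format required by any particular LLM.

```python
def build_prompt(instruction, examples, query):
    """Assemble a prompt with zero, one, or n worked examples (hypothetical layout)."""
    parts = [instruction]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The final block leaves the output empty for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


instruction = "Classify the sentiment of the text as positive or negative."
examples = [
    ("The service was excellent.", "positive"),
    ("The product broke after a day.", "negative"),
]
query = "I would buy this again."

zero_shot = build_prompt(instruction, [], query)              # no examples
single_shot = build_prompt(instruction, examples[:1], query)  # one example
n_shot = build_prompt(instruction, examples, query)           # several examples
print(n_shot)
```

The only difference between the three prompts is how many worked examples precede the final query.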

Learning Outcomes

Explain the terms Large Language Model (LLM), Prompt Engineering, and Fine-Tuning
Design LLM systems using advanced methods and RAG
Understand LLM concepts and make design choices
Assess LLM system outputs

GenAI and LLMs

Example Prompt

Task: Create a report on fine-tuning LLMs
Response from ChatGPT:
1. Introduction: LLMs revolutionize natural language processing. Fine-tuning enhances their performance on specific tasks.
2. Overview of Large Language Models:
- Definition and Characteristics: Capable of performing multiple NLU tasks such as translation and summarization.
- Pre-training Phase: Uses transfer learning with transformer architectures and self-attention mechanisms.
- Notable LLMs: GPT series, BERT, XLNet, T5.
3. Fine-Tuning LLMs: Overview of methodologies and techniques from literature.

What is a Language Model?

Definition: Probability model denoted as P(Text | Preceding Text)
Concept: Models the likelihood of a word following one or more preceding words based on statistical patterns in training data.

Conditional Probability Formula

Formula: $P(W_1, W_2, W_3, W_4, \ldots, W_n) = P(W_1) \times P(W_2 \mid W_1) \times P(W_3 \mid W_1, W_2) \times \cdots \times P(W_n \mid W_1, W_2, \ldots, W_{n-1})$
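Explanation: The chain rule factorizes the probability of a whole sequence into a product of conditional probabilities, each conditioned on all preceding words, so word order is captured implicitly.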
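
The product of conditional probabilities can be sketched in a few lines of Python. This is an illustration under a simplifying assumption: each word is conditioned only on the single previous word (a bigram approximation) rather than on the full history, and the probabilities are relative frequencies from a made-up three-sentence corpus. The names conditional_probability and sequence_probability are hypothetical.

```python
from collections import Counter

# Toy training corpus (made up for illustration).
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

context_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split()  # <s> marks the start of a sentence
    context_counts.update(tokens[:-1])   # every token that has a successor
    bigram_counts.update(zip(tokens[:-1], tokens[1:]))


def conditional_probability(word, previous):
    """Estimate P(word | previous) by relative frequency."""
    if context_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, word)] / context_counts[previous]


def sequence_probability(text):
    """Multiply the conditional probability of each word given the word before it."""
    tokens = ["<s>"] + text.split()
    probability = 1.0
    for previous, word in zip(tokens[:-1], tokens[1:]):
        probability *= conditional_probability(word, previous)
    return probability


seen_order = sequence_probability("the cat sat on the mat")
shuffled_order = sequence_probability("the mat sat on the cat")
print(seen_order, shuffled_order)
```

The same words in a different order receive a different probability: the shuffled sentence scores zero here because the pair "mat sat" never occurs in the toy corpus.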

Large Language Models

Examples of LLMs:
- OpenAI's GPT
- DeepMind's Chinchilla
- BloombergGPT
- Google's Med-Gemini
- Meta's Llama
- T5/Gemma
- Mistral

Characteristics of LLMs

Utilize transformer-based architectures.
Manage long-range dependencies by letting self-attention relate every token in a sequence to every other token.
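
A minimal sketch of scaled dot-product attention, the core operation of transformer architectures, is given here in plain Python. It assumes the query, key, and value vectors are supplied directly; real transformers first compute them with learned linear projections, which are omitted. The toy token vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dimension = len(keys[0])
    outputs = []
    for query in queries:
        # Score the query against every key, scaled by sqrt(d_k).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dimension)
            for key in keys
        ]
        weights = softmax(scores)
        # Each output is the weighted average of all value vectors.
        output = [
            sum(w * value[i] for w, value in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(output)
    return outputs


# Three toy token vectors; in self-attention they act as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = attention(tokens, tokens, tokens)
for row in attended:
    print([round(x, 3) for x in row])
```

Because every token attends to every other token in one step, distance within the sequence does not limit which tokens can influence each other.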

Typical LLM Projects in Computer Science

Text-oriented tasks:
- Email writing
- Code generation
- Document analysis
Training and Fine-tuning:
- Utilization of datasets for specialized tasks
Replacing humans with chatbots: Performing repetitive text generation tasks.

Size of LLMs

Illustrations of LLM capabilities with performance metrics including human expert levels, measured via various benchmarks such as MMLU.
Key models and their sizes include:
- Gopher
- U-PALM
- GPT-4 Classic
- ChatGPT (GPT-3.5 turbo)
- Llama (65B parameters)
- Chinchilla and others.

Architectural Designs

Fine-tuning and Prompt Engineering:
- Techniques and frameworks for optimizing LLMs for specific applications.

Advanced Prompt Engineering Techniques

Retrieval Augmented Generation (RAG):
- Combines prompt engineering with external knowledge sources for more accurate LLM outputs.
- Components:
  - Prompt embedding
  - Query Data
  - Improved Prompt for the LLM
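
The three components above can be sketched end to end in Python. This is a toy illustration under stated assumptions: the embedding is a bag-of-words count vector standing in for a learned embedding, the document store is a short in-memory list standing in for a vector database, and the names embed, retrieve, and build_augmented_prompt are hypothetical. The final prompt would be sent to an LLM, which is not called here.

```python
import math
from collections import Counter

# Hypothetical document store; a real system would keep these in a vector database.
documents = [
    "Credit risk is the risk that a borrower fails to repay a loan.",
    "Market risk arises from changes in prices, interest rates, or exchange rates.",
    "Liquidity risk is the risk of being unable to meet short-term obligations.",
]


def embed(text):
    """Stand-in embedding: a bag-of-words count vector (not a learned embedding)."""
    cleaned = text.lower().replace(",", " ").replace(".", " ").replace("?", " ")
    return Counter(cleaned.split())


def cosine_similarity(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, top_k=1):
    """Embed the prompt, then query the store for the most similar documents."""
    question_vector = embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(question_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_augmented_prompt(question, top_k=1):
    """Build the improved prompt by placing retrieved text before the question."""
    context = "\n".join(retrieve(question, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


print(build_augmented_prompt("What is liquidity risk?"))
```

For this question the liquidity-risk sentence is retrieved, so the LLM receives the relevant fact alongside the question instead of relying only on what it memorized during training.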

Example Data (from GPT-4)

Generate question-answer pairs related to financial risks.

Fine-Tuning vs Quantization

Fine-Tuning: Modifying a pre-trained LLM for specific tasks using domain-specific data. This is far cheaper than training from scratch; the cost of training a model like GPT-4 is approximately $100 million.
Quantization: Reduces resource requirements by lowering precision levels (e.g., from float32 to float16 or int8).
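
A minimal sketch of int8 quantization in plain Python follows. It assumes the simplest scheme, symmetric quantization with one scale for the whole list of weights; production toolkits typically use finer-grained scales and calibration data. The weights and the names quantize_int8 and dequantize are made up for illustration.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.05, 0.0, 0.41]  # made-up example weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

print(quantized)
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f} (error {abs(original - approx):.4f})")
```

Each weight is stored in 8 bits instead of 32, and the rounding error per weight is at most half the scale, which is the trade-off between resource usage and precision described above.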

Distillation

Process involving training smaller models to emulate larger models' behavior, thereby retaining knowledge while improving efficiency.
Key roles of Knowledge Distillation (KD) in LLMs:
1. Enhances capabilities.
2. Provides model compression.
3. Encourages self-improvement.
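
One common way to make a smaller model emulate a larger one is to train the student to match the teacher's softened output distribution. The sketch below computes such a loss in plain Python as the KL divergence from the teacher's distribution to the student's, with a temperature applied to both. The logits, the temperature of 2.0, and the name distillation_loss are illustrative assumptions; the training loop that would minimize this loss is omitted.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperatures give softer distributions."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between the temperature-softened distributions."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(teacher, student))


teacher_logits = [4.0, 1.0, 0.5]  # made-up outputs of a large model
close_student = [3.5, 1.2, 0.4]   # student that roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]     # student that prefers a different class

close_loss = distillation_loss(teacher_logits, close_student)
far_loss = distillation_loss(teacher_logits, far_student)
print(close_loss, far_loss)
```

The loss is near zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it pushes the student toward the teacher's behavior.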

LLMs in Business

Applications in Fintech

BloombergGPT: A large language model tailored for finance, reported to outperform open models on financial tasks.
LLM-based Trade Document Analysis: Leveraging LLMs to ensure accuracy and reliability in trade documents.

Applications in Healthcare

Utilization of LLMs to manage and summarize scientific knowledge and translate complex medical content.

Evaluation Metrics

ROUGE

An evaluation metric that measures summary quality by n-gram overlap between the model output and reference texts, reporting precision, recall, and their F1 score.
Recall and Precision:
- Recall = overlapping n-grams / total n-grams in the reference
- Precision = overlapping n-grams / total n-grams in the output

Application Example

Example: Model Output = 'the house on the hill', Reference = 'walk up the hill to the house'.
- 1-gram overlap evaluation.
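
The 1-gram evaluation of this example can be worked through in Python. This is a minimal sketch assuming whitespace tokenization, lowercasing, and clipped counts (a repeated word is matched at most as often as it appears in the reference); the function name rouge_n is hypothetical and this is not the API of any ROUGE library.

```python
from collections import Counter


def rouge_n(output, reference, n=1):
    """Return (precision, recall, f1) from clipped n-gram overlap."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    output_counts = ngrams(output)
    reference_counts = ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((output_counts & reference_counts).values())
    output_total = sum(output_counts.values())
    reference_total = sum(reference_counts.values())
    precision = overlap / output_total if output_total else 0.0
    recall = overlap / reference_total if reference_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


precision, recall, f1 = rouge_n("the house on the hill", "walk up the hill to the house")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: precision=0.800 recall=0.571 f1=0.667
```

Four of the output's five words ("the", "house", "the", "hill") appear in the seven-word reference, giving precision 4/5, recall 4/7, and F1 of 2/3.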

Other Evaluation Metrics

BLEU, Perplexity, Human evaluation, etc.

Future Lectures

Expectations

Preparation for exam structure focusing on LLMs, their applications, and integration methods.
Discussion questions regarding RAG, output comparisons, and potential LLM applications in specific domains such as healthcare and finance.
