vid_L2: Deep Learning Tools and Frameworks Flashcards
Tools of Deep Learning
Coding from scratch is possible but using frameworks is faster.
Keras:
A high-level framework.
Used for defining neural networks and setting up their architecture.
Does not handle the actual computation.
TensorFlow (or PyTorch, etc.):
Handles the computation.
Provides multidimensional array operations (matrix multiplication).
NumPy:
Basic number manipulation.
Framework Flexibility
You can choose any framework you like.
Lectures and exercises will primarily use Keras and TensorFlow.
Keras History
Initially built by a researcher at Google.
100% open source and community-driven.
Interconnected with TensorFlow; installing TensorFlow includes Keras.
TensorFlow is now somewhat separated from Google; Google supports JAX.
PyTorch is supported by Meta.
Practical Application
For projects or exercises, you can explore other frameworks.
They generally have a similar structure.
Python as the Primary Language
Python is the language for deep learning.
Basic Python knowledge is sufficient for most tasks.
Advanced tasks like custom network layers require more in-depth Python knowledge (e.g., subclassing).
Computation Backends
Python sets up the architecture and configuration of the computation graph.
Actual computation is done by the backend (C++ and CUDA for GPU).
Focus is on setting up models in Python.
Model Deployment
Models can be run in various languages.
Frameworks have community-driven repositories for different languages.
Training the model primarily happens in Python.
NumPy Importance
NumPy knowledge is helpful but not strictly required.
If you know NumPy, TensorFlow operations will be more obvious.
Recommended Reading
Introductions/tutorials to TensorFlow and TensorFlow Hub are recommended.
Setup instructions for running code locally are available on GitHub.
Setting Up a Virtual Environment
Use a virtual environment to avoid version conflicts.
Clone the code and install dependencies using `pip`.
Dependencies include TensorFlow and possibly others.
Towards the end of the course, an audio framework might be used for web pages.
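The setup steps above can be sketched as shell commands (the environment name and exact dependency list are illustrative, not from the lecture):

```shell
# Create an isolated environment so package versions don't conflict
python3 -m venv dl-env
source dl-env/bin/activate

# Install the course dependencies with pip (TensorFlow, plus whatever
# else the repository's requirements list)
pip install tensorflow
```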
Computing Resources
Hardware accelerators (GPUs) significantly benefit deep learning.
GPUs have many cores for parallelizing computations (matrix multiplication).
Gaming computers or recent Macs may have suitable GPUs.
Cloud Resources
Google Colaboratory (Colab) provides free cloud resources with a Google account.
Colab provides a cloud-hosted Jupyter Notebook environment.
Other cloud services are available, some for free or with academic resources.
Jupyter Notebooks
Content and exercises are provided as Jupyter Notebooks (.ipynb files).
Notebooks contain text and code.
First exercise involves classifying handwritten digits using deep learning.
Google Colab Demonstration
Colab allows running notebooks in the cloud.
You can select a runtime type, including GPUs.
A virtual machine is allocated for running the code.
TensorFlow Basics
TensorFlow provides extra features for speeding up computation (GPU support, just-in-time compilation).
Tensors
Tensors are multidimensional arrays, much like NumPy arrays. They are the fundamental data structure in TensorFlow, used to represent all data. Think of them as containers for numerical data, which can be scalars (single numbers), vectors (1D arrays), matrices (2D arrays), or higher-dimensional arrays. When you define an array in TensorFlow, you're essentially creating a tensor object. Each tensor includes information about its shape (the size of each dimension) and its data type (e.g., float32, int32).
Constants and variables are key tensor types.
Constants: immutable tensors.
Variables: mutable tensors that can be updated (used for training).
When defining an array, you get a tensor object by default.
Includes information like shape and data type.
Data Types
TensorFlow requires considering data types.
Typically uses 32-bit floating-point numbers (float32).
Lower precision (16-bit, 8-bit) can be used to reduce memory usage, with ongoing research in quantization (even 4-bit).
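A minimal sketch of checking and converting data types with `tf.cast` (the values are illustrative):

```python
import tensorflow as tf

# float32 is TensorFlow's default floating-point type.
x = tf.constant([1.0, 2.0, 3.0])
print(x.dtype)  # float32

# Lower precision halves memory use; tf.cast converts between types.
x16 = tf.cast(x, tf.float16)
print(x16.dtype)  # float16

# Mixing dtypes in one operation raises an error,
# so conversions must be explicit.
```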
Tensor Definition
Use `tf.constant` to define a constant tensor.
Use `tf.Variable` to define a variable tensor.
Variable Assignment
Use `.assign()` to change the value of a variable tensor at a specific position.
Operators
Basic math operators are available.
`+` sums tensors element-wise.
`*` multiplies element-wise (not matrix multiplication).
`tf.matmul()` performs matrix multiplication.
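The three operators can be compared side by side (values are illustrative):

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])

print(a + b)            # element-wise sum
print(a * b)            # element-wise product, NOT matrix multiplication
print(tf.matmul(a, b))  # matrix multiplication: [[19, 22], [43, 50]]
```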
Reduce Functions
Functions starting with `reduce_` (e.g., `reduce_sum`, `reduce_max`) reduce dimensions.
Specify the `axis` to reduce along: 0 collapses the rows, 1 collapses the columns, `None` reduces over all elements.
Example with `matrix = [[1, 2, 3], [4, 5, 6]]`:
`tf.reduce_sum(matrix, axis=0)` sums down each column, giving `[5, 7, 9]`.
`tf.reduce_sum(matrix, axis=1)` sums across each row, giving `[6, 15]`.
`tf.reduce_sum(matrix, axis=None)` sums all elements, giving `21`.
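The example above can be run directly:

```python
import tensorflow as tf

matrix = tf.constant([[1, 2, 3], [4, 5, 6]])

print(tf.reduce_sum(matrix, axis=0))  # [5 7 9]  (columns summed)
print(tf.reduce_sum(matrix, axis=1))  # [6 15]   (rows summed)
print(tf.reduce_sum(matrix))          # 21       (all elements)
print(tf.reduce_max(matrix, axis=0))  # [4 5 6]
```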
Shapes
Shapes define the dimensions of a tensor.
Shapes must match for operations.
TensorFlow can sometimes automatically expand dimensions to match (broadcasting).
A scalar (single number) has an empty shape, ().
Vector (list of elements) has a shape equal to the number of elements.
Table (matrix) has a shape of (rows, columns).
Higher dimensions are more complex to visualize.
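The shapes listed above, and broadcasting between them, can be sketched as follows (values are illustrative):

```python
import tensorflow as tf

scalar = tf.constant(5.0)                # shape ()
vector = tf.constant([1.0, 2.0, 3.0])    # shape (3,)
matrix = tf.constant([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])  # shape (2, 3)

print(scalar.shape, vector.shape, matrix.shape)

# Broadcasting: smaller shapes are expanded automatically when compatible.
print(matrix + scalar)  # scalar added to every element
print(matrix + vector)  # vector added to every row
```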
Visualizing Higher Dimensions
Three-dimensional tensors can be visualized as stacked tables or cubes.
Deep learning often uses four-dimensional tensors.
Four-Dimensional Tensors
Typical dimensions for images:
Batch Size
Height
Width
Channels (e.g., 3 for RGB images)
The batch dimension is used for mini-batch gradient descent.
Mini-batch gradient descent helps with generalization and fitting data into GPU memory.
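As an illustration of the four dimensions above (the batch size and image resolution here are assumptions, not from the lecture):

```python
import tensorflow as tf

# A typical image batch: (batch_size, height, width, channels).
batch = tf.zeros([32, 224, 224, 3])
print(batch.shape)  # (32, 224, 224, 3)
```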
Shape Mismatches
Shape mismatches can occur when processing single images.
`tf.expand_dims(tensor, axis=0)` adds a batch dimension of size 1.
`tf.squeeze(tensor, axis=None)` removes dimensions of size 1.
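A minimal sketch of both functions, using a single grayscale image as an example (the 28x28 size is illustrative):

```python
import tensorflow as tf

image = tf.zeros([28, 28, 1])          # a single image, no batch dimension

batch = tf.expand_dims(image, axis=0)  # shape (1, 28, 28, 1): ready for a model
print(batch.shape)

squeezed = tf.squeeze(batch)           # removes all size-1 dims: shape (28, 28)
print(squeezed.shape)
```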
GPU Usage
TensorFlow 2 automatically connects to available GPUs.
Variables are initialized on the GPU when one is available.
TensorFlow automatically handles moving data to the GPU for computation.
Automatic Differentiation
TensorFlow provides automatic differentiation for computing gradients.
This simplifies the process of gradient descent.
Example: Differentiating x^2 + 2x - 5
We want to differentiate the function
f(x) = x^2 + 2x - 5
and evaluate the derivative at the point x = 1:
f'(x) = 2x + 2
f'(1) = 2(1) + 2 = 4
Automatic Differentiation Steps
Define the function, then compute the gradient with `tf.GradientTape`:

```python
import tensorflow as tf

def my_function(x):
    return x**2 + 2*x - 5

x = tf.constant(1.0)
with tf.GradientTape() as tape:
    tape.watch(x)          # track the constant tensor
    y = my_function(x)
gradients = tape.gradient(y, x)
print(gradients)           # tf.Tensor(4.0, ...)
```
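In practice, trainable parameters are `tf.Variable`s, which the tape tracks automatically, so `tape.watch` can be dropped. A minimal variant of the same computation:

```python
import tensorflow as tf

x = tf.Variable(1.0)       # variables are watched automatically
with tf.GradientTape() as tape:
    y = x**2 + 2*x - 5
grad = tape.gradient(y, x)
print(grad.numpy())        # 4.0
```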
Keras Overview
Layers: Building blocks of neural networks.
Callbacks: Functions to execute during training (e.g., saving checkpoints).
Optimizers: Algorithms for gradient descent (tuning parameters).
Metrics: Functions to evaluate model performance.
Loss function: Quantifies the error during training.
Datasets: Small datasets for testing and benchmarking.
Applications: Pre-trained networks (primarily for computer vision).
Keras Applications
Pre-trained models can be easily loaded and used.
Example: Specifying a model name downloads and runs the model automatically.
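As an illustration (the model choice is an assumption, not specified in the lecture), a Keras Applications architecture can be instantiated in one line; here `weights=None` builds the network without downloading pre-trained ImageNet weights:

```python
import tensorflow as tf

# Build the MobileNetV2 architecture; pass weights="imagenet" instead
# to download and use the pre-trained weights.
model = tf.keras.applications.MobileNetV2(weights=None)
print(model.name)
print(model.count_params())
```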
Students are encouraged to provide feedback on the course.
Action Items
Review TensorFlow basics.
Copy and paste the provided code examples.
Prepare for the computer vision focus on Monday.