Computational Drug Design Lecture Notes

Computational Drug Design

Introduction

This lecture continues the discussion on computational drug design.
Previous lectures covered peptide-based drugs and drugs from natural sources.
A newer approach involves designing drugs from scratch, rather than finding them in nature.
The lecture will explore the progress in computational drug design, including AI-generated drugs.

Rational Design of Therapeutics

The key question is whether we can rationally design new therapeutics for any molecular target.
The goal is to develop computational methods applicable to various protein targets and disease indications.
Designing drugs often involves creating a molecule that binds to a specific site on a protein target, preventing its interaction with other proteins.
This is similar to the "Lock and Key" mechanism.

Binding Affinity

High affinity is achieved through shape complementarity and chemical complementarity.
Shape complementarity involves designing a molecule (key) that fits perfectly into the protein target (lock).
Chemical complementarity involves matching the chemical properties (e.g., charges) of the molecule and the target.
Example: If the protein target has positive charges, the designed peptide should have negative charges.
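
The charge example above can be sketched in a few lines of Python. This is a toy illustration, not a real affinity calculation: it assumes only side-chain charges near neutral pH matter (K and R as +1, D and E as -1), ignores histidine and the chain termini, and the two sequences are hypothetical.

```python
# Toy illustration of chemical (charge) complementarity.
# Assumption: only side-chain charges at roughly neutral pH are counted
# (K, R = +1; D, E = -1); histidine and the termini are ignored.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def net_charge(sequence: str) -> int:
    """Return the approximate net side-chain charge of a one-letter sequence."""
    seq = sequence.upper()
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)


def is_charge_complementary(target_site: str, peptide: str) -> bool:
    """True when the two net charges have opposite signs."""
    return net_charge(target_site) * net_charge(peptide) < 0


if __name__ == "__main__":
    site = "KRKAG"     # hypothetical positively charged binding site
    peptide = "DEEDL"  # hypothetical negatively charged peptide
    print(net_charge(site), net_charge(peptide))
    print(is_charge_complementary(site, peptide))
```

Real design methods score charge interactions in three dimensions, pair by pair; counting net charge is only the simplest possible proxy.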

Other Challenges

Target location: The molecule must reach the target site, even if it's inside the cell or in the brain.
Drug delivery: Considerations include the molecule's ability to cross lipid membranes and the blood-brain barrier.
Oral bioavailability: If administered orally, the molecule must survive the GI tract without being broken down by enzymes.

Key Requirements for Computational Drug Design Methods

Diverse shape sampling: Methods must sample peptides with many shapes to match different target shapes.
Diverse chemistries: Methods must design with various chemistries to complement different targets.
Adaptability: Methods should adapt to the target and design molecules specifically for it.

Traditional Physics-Based Methods

Computational design includes AI-based methods and traditional physics-based methods.
Traditional methods form the foundation for understanding computational peptide design.

Key Concepts

Things tend to fold into their lowest energy state.
The amino acid sequence determines the protein's structure, which dictates its function.
$\text{Sequence} \rightarrow \text{Structure} \rightarrow \text{Function}$

Amino Acids

Proteins are made from 20 amino acids, each with a unique side chain.
These side chains have different charges and chemical properties.
The sequence of amino acids dictates the protein's folding due to the preferences of each amino acid.
Charged amino acids, whether positive or negative, typically sit on the outside of the protein where they can interact with water, while hydrophobic amino acids tend to be buried in the interior.

Nobel Prize

The 2024 Nobel Prize in Chemistry recognized computational protein design and protein structure prediction.
One half went to David Baker for computational protein design; the other half went jointly to Demis Hassabis and John Jumper of Google DeepMind for AI-based protein structure prediction.

Protein Structure Prediction

Given a sequence of amino acids, predict its 3D structure.

Protein Design

Design an amino acid sequence that folds into a desired structure.
The target structure must be the lowest-energy conformation for the sequence.

Process

Sample all possible shapes a peptide can fold into.
Calculate the energy of each state.
Find the lowest-energy conformation.
Design is the inverse of prediction: given a structure, find the amino acid sequence whose lowest-energy state is that structure.
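
The sample, score, and select loop can be sketched as follows. This is a minimal illustration under stated assumptions: a conformation is reduced to a list of torsion angles, `toy_energy` is a made-up function with its minimum at -60 degrees, and random sampling stands in for "all possible shapes", which cannot be enumerated in practice.

```python
import math
import random


def toy_energy(angles, preferred=-60.0):
    """Hypothetical energy: zero when every torsion equals `preferred` degrees."""
    return sum(1.0 - math.cos(math.radians(a - preferred)) for a in angles)


def sample_conformation(n_angles, rng):
    """Draw one random conformation as a list of torsion angles in degrees."""
    return [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]


def lowest_energy_conformation(n_angles, n_samples, seed=0):
    """Sample conformations, score each one, and keep the lowest-energy one."""
    rng = random.Random(seed)
    best, best_e = None, float("inf")
    for _ in range(n_samples):
        conf = sample_conformation(n_angles, rng)
        e = toy_energy(conf)
        if e < best_e:
            best, best_e = conf, e
    return best, best_e


if __name__ == "__main__":
    conf, energy = lowest_energy_conformation(n_angles=4, n_samples=2000)
    print([round(a) for a in conf], round(energy, 3))
```

Even in this toy setting, more samples can only lower the best energy found, which is why sampling speed matters so much for physics-based methods.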

Methods

Software programs: Rapidly sample protein shapes and calculate energy.
Foldit video game: Players manipulate protein chains to achieve the highest score (lowest energy).
Distributed computing: Volunteers donate compute time to run simulations.

Energy Calculation

Energy functions or score functions estimate energy using equations.
Total energy is divided into smaller energy terms.
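
As a sketch, such a score function is often written as a weighted sum of individual terms. The symbols $w_i$ and $E_i$ and the named terms below are illustrative notation assumed here, not taken from the lecture:

$E_{\text{total}} = \sum_i w_i E_i = w_{\text{vdw}} E_{\text{vdw}} + w_{\text{elec}} E_{\text{elec}} + w_{\text{hbond}} E_{\text{hbond}} + w_{\text{solv}} E_{\text{solv}} + \dots$

Each $E_i$ captures one physical contribution (for example van der Waals packing, electrostatics, hydrogen bonding, or solvation), and each weight $w_i$ sets how much that term counts toward the total.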

Rosetta Fragment Assembly

This is a method for designing proteins with a known desired structure.
Instead of sampling one degree of freedom at a time, pre-assembled fragments 3, 6, or 9 amino acids long are combined.
The core idea is that sampling one degree of freedom at a time is very hard, so the method assembles bigger pieces instead.
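
A minimal sketch of the fragment idea, not Rosetta's actual implementation: each residue is reduced to a single torsion angle, the fragment library is a hypothetical list of 3-residue angle sets, `toy_energy` is a made-up score, and the Metropolis acceptance rule is an assumption chosen for illustration.

```python
import math
import random


def toy_energy(angles, preferred=-60.0):
    """Hypothetical energy: zero when every torsion equals `preferred` degrees."""
    return sum(1.0 - math.cos(math.radians(a - preferred)) for a in angles)


def insert_fragment(angles, fragment, start):
    """Return a copy of `angles` with `fragment` written over one window."""
    new = list(angles)
    new[start:start + len(fragment)] = fragment
    return new


def fragment_assembly(n_residues, fragments, n_steps=500, temperature=1.0, seed=0):
    """Monte Carlo search that replaces whole windows of residues at once."""
    rng = random.Random(seed)
    current = [rng.uniform(-180.0, 180.0) for _ in range(n_residues)]
    current_e = toy_energy(current)
    best, best_e = current, current_e
    for _ in range(n_steps):
        frag = rng.choice(fragments)
        start = rng.randrange(n_residues - len(frag) + 1)
        trial = insert_fragment(current, frag, start)
        delta = toy_energy(trial) - current_e
        # Metropolis rule: always accept downhill moves, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, current_e + delta
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


if __name__ == "__main__":
    library = [[-60.0] * 3, [-120.0] * 3, [60.0] * 3]  # hypothetical 3-mers
    model, energy = fragment_assembly(n_residues=9, fragments=library)
    print(model, round(energy, 3))
```

Each move changes three residues together, so the search jumps between locally sensible shapes instead of drifting one angle at a time.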

Success

In the figure, the magenta trace is the design model produced on the computer with Rosetta.
The gray trace is the experimental structure, solved after the design was made in the lab.
The comparison shows that the design is extremely accurate: each atom is placed to within about one angstrom, which is one tenth of a nanometer.

Noncanonical Amino Acids

Canonical amino acids form half of the design.
The other half are noncanonical amino acids, specifically D-amino acids.
Gut enzymes have evolved to act on canonical L-amino acids, not D-amino acids.
In principle, drugs made with D-amino acids should therefore resist enzymatic breakdown and have longer half-lives inside the body.

Oral Bioavailability

Design proteins that can be orally delivered, get inside cells, or cross the blood-brain barrier.

Challenges

Lipid membranes: Proteins and peptides have polar, water-loving amino acids that struggle to cross lipid membranes.
Peptides hold on to water molecules, resisting movement through lipids.

Solutions

Fold proteins into shapes that don't interact with water.
Design peptides with multiple energy minima: one outside the membrane and another inside the lipids.

AI-Based Methods

AI methods have transformed the field, often outperforming physics-based methods.
The introduction of AlphaFold by Google DeepMind has been a significant advancement.

Difference between AI and Physics-Based Methods

In physics-based methods, researchers try to understand the biophysics and biology behind how proteins fold.
In AI-based methods, the software is shown a large number of proteins so that it learns what folds correctly and what doesn't. This sidesteps the explicit thermodynamic calculations.

Examples and Learning

The AI does not tell you why a protein is good; it only tells you that it is good.

Structure Prediction with AI

Tools like AlphaFold and RoseTTAFold can quickly predict protein structures from sequences.
The process is much faster and more accurate than traditional methods.

Inversion of Prediction

AI networks can be inverted to generate new molecules with desired properties.
By tweaking sequences, one can optimize properties like folding confidence.
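
The "tweak and re-score" loop can be sketched as a simple hill climb. Everything here is an assumption for illustration: `stand_in_confidence` is a hypothetical placeholder for a structure predictor's confidence output (it just counts an arbitrary set of residues), and real methods search through the network itself instead of making blind point mutations.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def stand_in_confidence(sequence):
    """Hypothetical placeholder for a predictor's folding-confidence score.

    Returns the fraction of residues from an arbitrary favored set.
    """
    return sum(aa in "AELMK" for aa in sequence) / len(sequence)


def optimize_sequence(start, score_fn, n_steps=200, seed=0):
    """Mutate one position at a time and keep changes that do not lower the score."""
    rng = random.Random(seed)
    best, best_score = start, score_fn(start)
    for _ in range(n_steps):
        pos = rng.randrange(len(best))
        mutant = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        score = score_fn(mutant)
        if score >= best_score:
            best, best_score = mutant, score
    return best, best_score


if __name__ == "__main__":
    seq, score = optimize_sequence("GGGGGGGG", stand_in_confidence)
    print(seq, score)
```

Swapping `stand_in_confidence` for a call to a real predictor is what turns a prediction tool into a design tool in this scheme.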

Denoising Diffusion

This method starts with random atoms and iteratively denoises them to generate a protein that interacts with a target.

Process

Noise is added to a picture, and the software learns how to denoise it.
After training on millions of examples, the software can take random noise and turn it into a brand new three-dimensional design.
For proteins, the noise is added to the three-dimensional atom locations instead of to the pixels of a picture.
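
The noising half of this process can be sketched as below. This assumes the common Gaussian formulation, in which a noisy sample is sqrt(alpha_bar) times the clean coordinates plus sqrt(1 - alpha_bar) times random noise; the lecture does not specify a formulation, and the coordinates and the name `alpha_bar` are illustrative. The denoising network itself is not shown.

```python
import math
import random


def add_noise(coords, alpha_bar, rng):
    """Blend 3D coordinates with Gaussian noise.

    alpha_bar = 1 keeps the structure unchanged; alpha_bar = 0 gives pure noise.
    """
    keep = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(keep * c + noise * rng.gauss(0.0, 1.0) for c in xyz)
        for xyz in coords
    ]


def training_pair(coords, alpha_bar, seed=0):
    """Return (noisy input, clean target) as one training example for a denoiser."""
    rng = random.Random(seed)
    return add_noise(coords, alpha_bar, rng), coords


if __name__ == "__main__":
    # Hypothetical positions of three atoms along a line, in angstroms.
    clean = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    for alpha_bar in (1.0, 0.5, 0.0):
        noisy, _ = training_pair(clean, alpha_bar)
        print(alpha_bar, [tuple(round(c, 2) for c in xyz) for xyz in noisy])
```

A denoiser trained on many such pairs at many noise levels can then be run in reverse, starting from pure noise and removing a little noise per step until a structure emerges.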

Adaptations

The first generation of models worked at the amino acid level, not the atom level.
The newer generation works at the atom level, where more control is possible.

Small Molecule Discovery

Traditional approaches

With libraries of billions of small molecules, each molecule had to be docked against the target individually to find the best binders.

Improved through Small AI Networks

Docking is sped up with smaller networks that are trained to recognize the kinds of molecules the protein target likes, so that far fewer candidates need to be docked in full.
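
One way to picture this speed-up is a two-stage screen: a cheap score filters the library, and only the shortlist gets the expensive docking step. The sketch below is illustrative only; `cheap_score` and `expensive_dock` are hypothetical stand-ins with made-up formulas, not a trained network or a real docking program.

```python
def cheap_score(molecule):
    """Hypothetical fast surrogate, standing in for a small trained network."""
    return molecule["polar_groups"] - molecule["weight"] / 1000.0


def expensive_dock(molecule):
    """Hypothetical stand-in for a full docking run (lower score is better)."""
    return -molecule["polar_groups"] + abs(molecule["weight"] - 350.0) / 100.0


def screen(library, keep_fraction=0.1, top_n=3):
    """Rank everything cheaply, dock only the shortlist, return the best hits."""
    ranked = sorted(library, key=cheap_score, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    shortlist = ranked[:n_keep]
    docked = sorted(shortlist, key=expensive_dock)
    return docked[:top_n], n_keep


if __name__ == "__main__":
    # Hypothetical library of 100 molecules with two made-up features.
    library = [
        {"name": f"mol{i}", "polar_groups": i % 7, "weight": 200 + 3 * i}
        for i in range(100)
    ]
    hits, n_docked = screen(library)
    print(n_docked, [m["name"] for m in hits])
```

With `keep_fraction=0.1`, only a tenth of the library reaches the expensive step, which is the whole point of putting a small network in front of docking.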

Co-Folding method

The structure of the protein target is predicted together with the molecule bound to it, rather than docking the molecule into a fixed structure.
AlphaFold 3 is given as an example.

Other Applications of AI

Discovering new targets for diseases.
Predicting molecule toxicity and permeability.
Identifying markers for specific disease indications.
Co-scientists: AI tools that assist in literature review and hypothesis building.