Computational Drug Design Lecture Notes

Computational Drug Design

Introduction

  • This lecture continues the discussion on computational drug design.
  • Previous lectures covered peptide-based drugs and drugs from natural sources.
  • A newer approach involves designing drugs from scratch, rather than finding them in nature.
  • The lecture will explore the progress in computational drug design, including AI-generated drugs.

Rational Design of Therapeutics

  • The key question is whether we can rationally design new therapeutics for any molecular target.
  • The goal is to develop computational methods applicable to various protein targets and disease indications.
  • Designing drugs often involves creating a molecule that binds to a specific site on a protein target, preventing its interaction with other proteins.
  • This is similar to the "Lock and Key" mechanism.

Binding Affinity

  • High affinity is achieved through shape complementarity and chemical complementarity.
  • Shape complementarity involves designing a molecule (key) that fits perfectly into the protein target (lock).
  • Chemical complementarity involves matching the chemical properties (e.g., charges) of the molecule and the target.
  • Example: If the protein target has positive charges, the designed peptide should have negative charges.
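
The charge-matching example above can be sketched in a few lines. This is a toy model, not a real affinity calculation: it assumes neutral pH, counts only lysine (K) and arginine (R) as +1 and aspartate (D) and glutamate (E) as -1, and the sequences are hypothetical.

```python
# Toy net-charge model at neutral pH (an assumption for illustration):
# K and R count as +1, D and E as -1, everything else as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def net_charge(sequence: str) -> int:
    """Sum the per-residue charges of a one-letter amino acid sequence."""
    return sum(CHARGE.get(res, 0) for res in sequence)

def charges_complement(site: str, peptide: str) -> bool:
    """True when the two net charges have opposite signs."""
    return net_charge(site) * net_charge(peptide) < 0

site = "KRGK"  # hypothetical positively charged binding site
print(net_charge(site))                   # 3
print(charges_complement(site, "DEGD"))   # True
print(charges_complement(site, "KGGR"))   # False
```

Real methods score charges by position and distance rather than by net total, so this only illustrates the direction of the idea.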

Other Challenges

  • Target location: The molecule must reach the target site, even if it's inside the cell or in the brain.
  • Drug delivery: Considerations include the molecule's ability to cross lipid membranes and the blood-brain barrier.
  • Oral bioavailability: If administered orally, the molecule must survive the GI tract without being broken down by enzymes.

Key Requirements for Computational Drug Design Methods

  • Diverse shape sampling: Methods must sample peptides with many shapes to match different target shapes.
  • Diverse chemistries: Methods must design with various chemistries to complement different targets.
  • Adaptability: Methods should adapt to the target and design molecules specifically for it.

Traditional Physics-Based Methods

  • Computational design includes AI-based methods and traditional physics-based methods.
  • Traditional methods form the foundation for understanding computational peptide design.

Key Concepts

  • Molecules tend to fold into their lowest-energy state.
  • The amino acid sequence determines the protein's structure, which dictates its function.
  • Sequence → Structure → Function

Amino Acids

  • Proteins are made from 20 amino acids, each with a unique side chain.
  • These side chains have different charges and chemical properties.
  • The sequence of amino acids dictates the protein's folding due to the preferences of each amino acid.
  • Positively and negatively charged amino acids typically sit on the outside of the protein, where they interact with water, while hydrophobic amino acids tend to be buried in the interior.
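
One common way to put numbers on this preference is a hydropathy scale. The sketch below uses the Kyte-Doolittle scale (positive values are hydrophobic, negative values are hydrophilic); the scale is not mentioned in the lecture and is brought in here only as an illustration, and the two test sequences are made up.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a one-letter amino acid sequence."""
    return sum(KYTE_DOOLITTLE[res] for res in sequence) / len(sequence)

# A hydrophobic stretch (likely buried) versus a charged stretch (likely exposed)
print(mean_hydropathy("LIVF") > 0)   # True
print(mean_hydropathy("DEKR") < 0)   # True
```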

Nobel Prize

  • The Nobel Prize in Chemistry recognized computational protein design and protein structure prediction.
  • The 2024 prize was shared by Demis Hassabis and John Jumper of Google DeepMind, for AI-based protein structure prediction, and David Baker, for computational protein design.

Protein Structure Prediction

  • Given a sequence of amino acids, predict its 3D structure.

Protein Design

  • Design an amino acid sequence that folds into a desired structure.
  • The target structure must be the lowest-energy conformation for the sequence.

Process

  • Sample all possible shapes a peptide can fold into.
  • Calculate the energy of each state.
  • Find the lowest-energy conformation.
  • Design is the inverse problem: given a desired structure, find the amino acid sequence whose lowest-energy state is that structure.
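
The sample, score, and select steps above can be sketched with a toy model. Everything here is an assumption for illustration: a conformation is reduced to two backbone angles (phi, psi), and the energy function is a made-up surface with its minimum placed at a helix-like pair of angles. Real peptides have far too many conformations to enumerate on a grid.

```python
import math

# Hypothetical energy surface with its minimum at a helix-like (phi, psi)
PHI0, PSI0 = -60, -45

def energy(phi: int, psi: int) -> float:
    """Toy energy in arbitrary units; zero at (PHI0, PSI0), positive elsewhere."""
    return ((1 - math.cos(math.radians(phi - PHI0)))
            + (1 - math.cos(math.radians(psi - PSI0))))

# Step 1: sample conformations on a 15-degree grid
conformations = [(phi, psi)
                 for phi in range(-180, 180, 15)
                 for psi in range(-180, 180, 15)]

# Steps 2 and 3: score every conformation and keep the lowest-energy one
best = min(conformations, key=lambda angles: energy(*angles))
print(best)  # (-60, -45)
```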

Methods

  • Software programs: Rapidly sample protein shapes and calculate energy.
  • Foldit video game: Players manipulate protein chains to achieve the highest score (lowest energy).
  • Distributed computing: Volunteers donate compute time to run simulations.

Energy Calculation

  • Energy functions or score functions estimate energy using equations.
  • Total energy is divided into smaller energy terms.
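
A minimal sketch of this decomposition, assuming the total is a weighted sum of terms. The term names, values, and weights below are illustrative placeholders and are not taken from any real score function.

```python
def total_energy(terms: dict, weights: dict) -> float:
    """Weighted sum of individual energy terms."""
    return sum(weights[name] * value for name, value in terms.items())

# Term names, values, and weights are illustrative, not from a real score function
terms = {"van_der_waals": -12.0, "electrostatics": -4.0,
         "hydrogen_bonds": -6.0, "solvation": 3.0}
weights = {"van_der_waals": 1.0, "electrostatics": 0.5,
           "hydrogen_bonds": 1.0, "solvation": 1.0}

print(total_energy(terms, weights))  # -17.0
```

Splitting the total this way makes it possible to see which interaction dominates a given score and to tune the weights separately.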

Rosetta Fragment Assembly

  • This is a method for designing proteins with a known desired structure.
  • Instead of sampling one degree of freedom at a time, pre-assembled fragments 3, 6, or 9 amino acids long are combined.
  • The core idea is that sampling one degree of freedom at a time is very hard, so bigger pieces are assembled instead.
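
The fragment idea can be sketched as follows. This is a toy illustration and not Rosetta's implementation: the backbone is reduced to one angle per residue, the fragment library and target are hypothetical, the score is a simple squared deviation, and a greedy accept rule stands in for the Monte Carlo acceptance used in practice.

```python
import random

# Hypothetical fragment library: each fragment is 3 consecutive backbone
# angles (degrees). Real libraries are built from known protein structures.
FRAGMENTS = [(-60, -60, -60), (-120, -120, -120), (-60, -120, -60), (60, 60, 60)]

TARGET = [-60] * 9  # toy desired structure: 9 residues with helix-like angles

def energy(angles):
    """Toy score: squared deviation from the target angles (lower is better)."""
    return sum((a - t) ** 2 for a, t in zip(angles, TARGET))

def fragment_assembly(n_steps=200, seed=0):
    """Repeatedly splice a random fragment in at a random position."""
    rng = random.Random(seed)
    angles = [180] * len(TARGET)  # start from a fully extended chain
    start_energy = energy(angles)
    for _ in range(n_steps):
        frag = rng.choice(FRAGMENTS)
        pos = rng.randrange(len(angles) - len(frag) + 1)
        trial = angles[:pos] + list(frag) + angles[pos + len(frag):]
        if energy(trial) <= energy(angles):  # greedy: keep moves that do not hurt
            angles = trial
    return start_energy, energy(angles), angles

start, final, angles = fragment_assembly()
print(start, final)
```

Each move changes three angles at once, which is the point of the method: the search takes larger, structurally plausible steps instead of adjusting one angle at a time.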

Success

  • In the figure, the magenta lines show the design model produced on the computer with Rosetta.
  • The gray lines show the structure solved experimentally after the design was made in the lab.
  • The comparison shows that the design is extremely accurate: each atom is placed to within about one angstrom (one tenth of a nanometer) of its experimental position.
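
Agreement between a design model and a solved structure is commonly summarized as a root-mean-square deviation (RMSD) over matched atoms. The sketch below assumes the two structures are already superimposed and listed in the same atom order; the coordinates (in angstroms) are made up so that every atom is off by exactly 1 Å.

```python
import math

def rmsd(model, experiment):
    """Root-mean-square deviation between matched atoms, in the input units."""
    if len(model) != len(experiment):
        raise ValueError("structures must have the same number of atoms")
    total = sum((a - b) ** 2
                for p, q in zip(model, experiment)
                for a, b in zip(p, q))
    return math.sqrt(total / len(model))

design = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]  # made-up model
solved = [(0.0, 0.0, 1.0), (1.5, 1.0, 0.0), (4.0, 0.0, 0.0)]  # made-up experiment
print(rmsd(design, solved))  # 1.0
```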

Noncanonical Amino Acids

  • Canonical amino acids make up half of the designed molecule.
  • The other half are noncanonical amino acids, specifically D-amino acids.
  • Gut enzymes have evolved to act on canonical L-amino acids and do not process D-amino acids.
  • In principle, drugs made with D-amino acids should therefore have longer half-lives inside the body.
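
D-amino acids are the mirror images of the usual L-amino acids, which is why enzymes shaped for one form do not fit the other. The sketch below shows the geometric point only: reflecting the substituents around a central carbon flips the sign of their signed volume (handedness). The coordinates are idealized and made up, and no claim is made about which sign corresponds to L or D.

```python
def signed_volume(center, a, b, c):
    """Triple product (a-center) . ((b-center) x (c-center)); its sign encodes handedness."""
    ux, uy, uz = (a[i] - center[i] for i in range(3))
    vx, vy, vz = (b[i] - center[i] for i in range(3))
    wx, wy, wz = (c[i] - center[i] for i in range(3))
    return (ux * (vy * wz - vz * wy)
            - uy * (vx * wz - vz * wx)
            + uz * (vx * wy - vy * wx))

def mirror(point):
    """Reflect a point through the yz-plane (x -> -x)."""
    return (-point[0], point[1], point[2])

# Idealized, made-up positions for an alpha carbon and three of its substituents
ca = (0.0, 0.0, 0.0)
n, c, cb = (1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)

original = signed_volume(ca, n, c, cb)
reflected = signed_volume(mirror(ca), mirror(n), mirror(c), mirror(cb))
print(original, reflected)  # 4.0 -4.0
```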

Oral Bioavailability

  • Design proteins that can be orally delivered, get inside cells, or cross the blood-brain barrier.

Challenges

  • Lipid membranes: Proteins and peptides have polar, water-loving amino acids that struggle to cross lipid membranes.
  • Peptides hold on to water molecules, resisting movement through lipids.

Solutions

  • Fold proteins into shapes that don't interact with water.
  • Design peptides with multiple energy minima: one outside the membrane and another inside the lipids.

AI-Based Methods

  • AI methods have transformed the field, often outperforming physics-based methods.
  • The introduction of AlphaFold by Google DeepMind has been a significant advancement.

Difference between AI and Physics-Based Methods

  • In physics-based methods, researchers try to understand the biophysics and biology behind how proteins fold.
  • In AI-based methods, the software is shown a large number of proteins so that it learns what folds correctly and what does not. This bypasses the explicit thermodynamic calculations.

Examples and Learning

  • The AI does not tell you why a protein is good; it only tells you that it is good.

Structure Prediction with AI

  • Tools like AlphaFold and RoseTTAFold can quickly predict protein structures from sequences.
  • The process is much faster and more accurate than traditional methods.

Inversion of Prediction

  • AI networks can be inverted to generate new molecules with desired properties.
  • By tweaking sequences, one can optimize properties like folding confidence.
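
The "tweak the sequence, keep what scores better" loop can be sketched as a simple hill climb. The scoring function below is a stand-in and not a real structure predictor: `toy_confidence` is a hypothetical name, and it merely rewards alternating hydrophobic and polar residues so that the loop has something to optimize. In practice the score would come from a trained network.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")

def toy_confidence(seq: str) -> float:
    """Stand-in for a network's folding confidence (NOT a real predictor):
    fraction of positions that follow a hydrophobic/polar alternation."""
    hits = sum((res in HYDROPHOBIC) == (i % 2 == 0) for i, res in enumerate(seq))
    return hits / len(seq)

def optimize(seq: str, n_steps: int = 500, seed: int = 0) -> str:
    """Mutate one position at a time; keep mutations that do not lower the score."""
    rng = random.Random(seed)
    best, best_score = seq, toy_confidence(seq)
    for _ in range(n_steps):
        pos = rng.randrange(len(best))
        mutant = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        score = toy_confidence(mutant)
        if score >= best_score:
            best, best_score = mutant, score
    return best

start = "G" * 12
designed = optimize(start)
print(toy_confidence(start), toy_confidence(designed))
```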

Denoising Diffusion

  • This method starts with random atoms and iteratively denoises them to generate a protein that interacts with a target.

Process

  • During training, noise is added to a picture and the software learns how to remove it.
  • After training on millions of examples, the software can take random noise and turn it into a brand-new three-dimensional design.
  • For proteins, the noise is added to three-dimensional locations rather than to a picture.
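
The noising half of this process can be sketched directly on 3D coordinates. The sketch assumes the standard Gaussian forward step used in denoising diffusion models, where a single parameter (`alpha_bar` here) controls how much of the original structure survives. The atom positions are made up, and the learned denoising network, which is the hard part, is not shown.

```python
import math
import random

def add_noise(coords, alpha_bar, rng):
    """Forward diffusion step: blend each coordinate with Gaussian noise.

    alpha_bar = 1.0 keeps the structure intact; alpha_bar = 0.0 gives pure noise.
    """
    keep, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [tuple(keep * x + noise * rng.gauss(0.0, 1.0) for x in atom)
            for atom in coords]

rng = random.Random(0)
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]  # made-up positions

# Progressively noisier copies of the same structure
for alpha_bar in (1.0, 0.5, 0.0):
    print(alpha_bar, add_noise(atoms, alpha_bar, rng))
```

Generation runs this in reverse: starting from coordinates drawn as pure noise, a trained model removes a little noise at each step.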

Adaptations

  • The first generation of models worked at the amino acid level, not the atom level.
  • The newer generation works at the atom level, which allows finer control.

Small Molecule Discovery

Traditional approaches

  • With billions of small molecules on hand, each one had to be docked in an attempt to find the best ones.

Improved through Small AI Networks

  • Docking is sped up with smaller networks that are trained to recognize the kinds of molecules the protein target prefers, so fewer candidates need to be docked in full.
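
A minimal sketch of this two-stage screen, assuming a cheap learned score is used to shortlist molecules before the expensive docking step. Both scoring functions below are hypothetical stand-ins (the real ones would be a trained network and a docking program), and the four-molecule library of SMILES strings is only for illustration.

```python
def screen(library, cheap_score, expensive_dock, keep=2):
    """Rank everything with the cheap model, then dock only the top `keep`."""
    shortlist = sorted(library, key=cheap_score, reverse=True)[:keep]
    docked = {mol: expensive_dock(mol) for mol in shortlist}
    best = min(docked, key=docked.get)  # lowest docking energy wins
    return best, docked

library = ["CCO", "CCCCCC", "c1ccccc1O", "CC(=O)O"]  # SMILES strings

def cheap_score(smiles):
    """Stand-in for a small network: pretend the target likes oxygen atoms."""
    return smiles.count("O")

dock_calls = []

def expensive_dock(smiles):
    """Stand-in for a docking run; records how often it is called."""
    dock_calls.append(smiles)
    return -float(len(smiles))  # made-up energy, lower is better

best, docked = screen(library, cheap_score, expensive_dock)
print(best, len(dock_calls), "of", len(library), "molecules docked")
```

With billions of molecules, the saving comes from the ratio between the shortlist and the full library.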

Co-Folding method

  • The structure of the molecule and the fold of its protein target are predicted together.
  • AlphaFold 3 is given as an example.

Other Applications of AI

  • Discovering new targets for diseases.
  • Predicting molecule toxicity and permeability.
  • Identifying markers for specific disease indications.
  • Co-scientists: AI tools that assist in literature review and hypothesis building.