Computational Drug Design Lecture Notes
Computational Drug Design
Introduction
- This lecture continues the discussion on computational drug design.
- Previous lectures covered peptide-based drugs and drugs from natural sources.
- A newer approach involves designing drugs from scratch, rather than finding them in nature.
- The lecture will explore the progress in computational drug design, including AI-generated drugs.
Rational Design of Therapeutics
- The key question is whether we can rationally design new therapeutics for any molecular target.
- The goal is to develop computational methods applicable to various protein targets and disease indications.
- Designing drugs often involves creating a molecule that binds to a specific site on a protein target, preventing its interaction with other proteins.
- This is similar to the "Lock and Key" mechanism.
Binding Affinity
- High affinity is achieved through shape complementarity and chemical complementarity.
- Shape complementarity involves designing a molecule (key) that fits perfectly into the protein target (lock).
- Chemical complementarity involves matching the chemical properties (e.g., charges) of the molecule and the target.
- Example: If the protein target has positive charges, the designed peptide should have negative charges.
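As a rough illustration of chemical complementarity, the toy score below rewards opposite charges at contacting positions and penalizes like charges. The charge table is a simplification (histidine is treated as neutral), and the function name and the assumption that position i of the peptide touches position i of the target are hypothetical, chosen only for this sketch.

```python
# Formal side-chain charges at roughly neutral pH.
# Simplification: histidine and all unlisted residues count as 0.
CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}

def charge_complementarity(target_contacts, peptide_contacts):
    """Toy score: +1 per opposite-charge contact, -1 per like-charge contact.

    Both arguments are one-letter sequences of equal length; position i of
    the peptide is assumed to contact position i of the target.
    """
    if len(target_contacts) != len(peptide_contacts):
        raise ValueError("contact lists must have the same length")
    score = 0
    for t, p in zip(target_contacts, peptide_contacts):
        # Opposite charges multiply to -1, so subtracting raises the score.
        score -= CHARGE.get(t, 0) * CHARGE.get(p, 0)
    return score

# A positively charged target patch paired with two candidate peptides.
print(charge_complementarity("KRK", "DEE"))  # negative peptide: favorable
print(charge_complementarity("KRK", "KRR"))  # positive peptide: unfavorable
```

A real design method would also weigh distance, solvent exposure, and shape, so this score only captures the sign-matching idea from the example above.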
Other Challenges
- Target location: The molecule must reach the target site, even if it's inside the cell or in the brain.
- Drug delivery: Considerations include the molecule's ability to cross lipid membranes and the blood-brain barrier.
- Oral bioavailability: If administered orally, the molecule must survive the GI tract without being broken down by enzymes.
Key Requirements for Computational Drug Design Methods
- Diverse shape sampling: Methods must sample peptides with many shapes to match different target shapes.
- Diverse chemistries: Methods must design with various chemistries to complement different targets.
- Adaptability: Methods should adapt to the target and design molecules specifically for it.
Traditional Physics-Based Methods
- Computational design includes AI-based methods and traditional physics-based methods.
- Traditional methods form the foundation for understanding computational peptide design.
Key Concepts
- Proteins tend to fold into their lowest-energy state.
- The amino acid sequence determines the protein's structure, which dictates its function.
- Sequence → Structure → Function
Amino Acids
- Proteins are made from 20 amino acids, each with a unique side chain.
- These side chains have different charges and chemical properties.
- The sequence of amino acids dictates the protein's folding due to the preferences of each amino acid.
- Charged amino acids (positive or negative) typically sit on the outside, interacting with water, while hydrophobic amino acids tend to be buried in the interior.
Nobel Prize
- The Nobel Prize in Chemistry recognized computational protein design and protein structure prediction.
- The 2024 prize was shared by Demis Hassabis and John Jumper of Google DeepMind, for AI-based protein structure prediction, and David Baker, for computational protein design.
Protein Structure Prediction
- Given a sequence of amino acids, predict its 3D structure.
Protein Design
- Design an amino acid sequence that folds into a desired structure.
- The target structure must be the lowest-energy conformation for the sequence.
Process
- Sample all possible shapes a peptide can fold into.
- Calculate the energy of each state.
- Find the lowest-energy conformation.
- Design is the inverse problem: given a structure, find the amino acid sequence whose lowest-energy state is that structure.
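The sample, score, and select loop, and its inversion for design, can be sketched on a toy model. Everything here is an assumption made for illustration: each residue takes one of three named backbone states, the energy rewards a made-up per-residue preference plus agreement between neighbors, and the alphabet has only three amino acids. Real methods sample continuous geometry and use far richer energy functions.

```python
import itertools

STATES = ("helix", "strand", "turn")
ALPHABET = "AVG"
# Made-up preferences for this toy model only.
PREFERENCE = {"A": "helix", "V": "strand", "G": "turn"}

def energy(sequence, conformation):
    """Toy energy: -1 per residue in its preferred state, -0.5 per matching neighbor pair."""
    e = 0.0
    for aa, state in zip(sequence, conformation):
        if PREFERENCE.get(aa) == state:
            e -= 1.0
    for a, b in zip(conformation, conformation[1:]):
        if a == b:
            e -= 0.5
    return e

def lowest_energy_conformation(sequence):
    """Prediction: enumerate every conformation and keep the lowest-energy one."""
    all_conformations = itertools.product(STATES, repeat=len(sequence))
    best = min(all_conformations, key=lambda c: energy(sequence, c))
    return best, energy(sequence, best)

def is_unique_minimum(sequence, target):
    """True if the target is strictly lower in energy than every other conformation."""
    e_target = energy(sequence, target)
    return all(
        energy(sequence, c) > e_target
        for c in itertools.product(STATES, repeat=len(target))
        if c != target
    )

def design(target):
    """Design: keep the sequences for which the target is the unique energy minimum."""
    return [
        "".join(seq)
        for seq in itertools.product(ALPHABET, repeat=len(target))
        if is_unique_minimum(seq, target)
    ]

print(lowest_energy_conformation("AAVV"))
print(design(("helix", "helix", "strand", "strand")))
```

Exhaustive enumeration works here only because the toy has 3^4 = 81 conformations; the search space grows exponentially with chain length, which is why practical methods rely on sampling.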
Methods
- Software programs: Rapidly sample protein shapes and calculate energy.
- Foldit video game: Players manipulate protein chains to achieve the highest score (lowest energy).
- Distributed computing: Volunteers donate compute time to run simulations.
Energy Calculation
- Energy functions or score functions estimate energy using equations.
- Total energy is divided into smaller energy terms.
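A minimal sketch of an energy function built from separate terms is shown below. The two terms (a Lennard-Jones term and a Coulomb term), their parameter values, and the weights are illustrative assumptions; they are not the terms or weights of any particular software package.

```python
def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """Van der Waals-like term: repulsive at short range, attractive near contact.

    r and sigma are in angstroms; epsilon (well depth) is an illustrative value.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, dielectric=80.0):
    """Electrostatic term in kcal/mol for charges in units of e and r in angstroms.

    332.06 is the approximate Coulomb constant in these units.
    """
    return 332.06 * q1 * q2 / (dielectric * r)

def total_energy(atom_pairs, weights):
    """Weighted sum of the individual energy terms over all atom pairs.

    atom_pairs is a list of (distance, charge_1, charge_2) tuples.
    """
    terms = {"vdw": 0.0, "elec": 0.0}
    for r, q1, q2 in atom_pairs:
        terms["vdw"] += lennard_jones(r)
        terms["elec"] += coulomb(r, q1, q2)
    total = sum(weights[name] * value for name, value in terms.items())
    return total, terms

pairs = [(4.0, +1, -1), (3.8, 0, 0)]
total, terms = total_energy(pairs, weights={"vdw": 1.0, "elec": 0.5})
print(total, terms)
```

Keeping the terms separate makes it possible to see which physical effect dominates a given score and to tune the weights independently.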
Rosetta Fragment Assembly
- This is a method for designing proteins with a known desired structure.
- Instead of sampling one degree of freedom at a time, pre-assembled fragments 3, 6, or 9 amino acids long are combined.
- The core idea is that sampling one degree of freedom at a time is very hard, so the method assembles bigger pieces instead.
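The idea of moving several residues at once can be sketched as a Monte Carlo loop with fragment insertion. This is a simplified illustration, not Rosetta's actual implementation: each residue is reduced to a single angle, the fragment library and the toy energy (squared deviation from a hypothetical target) are made up, and moves are accepted with a standard Metropolis criterion.

```python
import math
import random

def fragment_assembly(n_residues, fragments, energy, n_steps=2000,
                      temperature=1.0, seed=0):
    """Monte Carlo search where each move overwrites a stretch of residues with a fragment."""
    rng = random.Random(seed)
    current = [180.0] * n_residues  # arbitrary extended starting conformation
    current_e = energy(current)
    best, best_e = list(current), current_e
    for _ in range(n_steps):
        frag = rng.choice(fragments)
        start = rng.randrange(n_residues - len(frag) + 1)
        trial = list(current)
        trial[start:start + len(frag)] = frag  # insert the whole fragment at once
        trial_e = energy(trial)
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if trial_e <= current_e or rng.random() < math.exp(-(trial_e - current_e) / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

# Hypothetical target: nine residues, all at a helix-like angle.
TARGET = [-60.0] * 9

def toy_energy(angles):
    """Made-up energy: scaled squared deviation from the target angles."""
    return sum((a - t) ** 2 for a, t in zip(angles, TARGET)) / 100.0

# Made-up library of three-residue fragments.
FRAGMENTS = [[-60.0] * 3, [-120.0] * 3, [60.0] * 3]

best, best_e = fragment_assembly(9, FRAGMENTS, toy_energy)
print(best, best_e)
```

Each accepted move changes three residues together, so the search reaches a good conformation in far fewer steps than adjusting one angle at a time.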
Success
- In the figure, the magenta trace is the design model produced on the computer with Rosetta.
- The gray trace is the structure solved experimentally after the molecule was made in the lab.
- The two agree closely: each atom in the design is within about one angstrom (one tenth of a nanometer) of its experimental position.
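One common way to quantify agreement between a design model and an experimental structure is the root-mean-square deviation (RMSD) over matched atoms. The notes do not say which metric was used, so the sketch below is only an assumption about how such a comparison might be computed; it also assumes the two structures are already superimposed and list their atoms in the same order.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the structures are already superimposed and atoms are in the same order.
    Units follow the input (angstroms here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Made-up coordinates: the "experimental" copy is shifted by 1 angstrom along x.
design_model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experimental = [(x + 1.0, y, z) for x, y, z in design_model]
print(rmsd(design_model, experimental))
```

In practice the structures are first aligned by an optimal rotation and translation; that step is omitted here to keep the sketch short.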
Noncanonical Amino Acids
- Canonical amino acids make up half of the designed molecule.
- The other half are noncanonical amino acids, specifically D-amino acids.
- Gut enzymes have evolved to act on canonical (L-) amino acids, not D-amino acids.
- In principle, drugs made with D-amino acids should have longer half-lives inside the body.
Oral Bioavailability
- Design proteins that can be orally delivered, get inside cells, or cross the blood-brain barrier.
Challenges
- Lipid membranes: Proteins and peptides have polar, water-loving amino acids that struggle to cross lipid membranes.
- Peptides hold on to water molecules, resisting movement through lipids.
Solutions
- Fold proteins into shapes that don't interact with water.
- Design peptides with multiple energy minima: one outside the membrane and another inside the lipids.
AI-Based Methods
- AI methods have transformed the field, often outperforming physics-based methods.
- The introduction of AlphaFold by Google DeepMind has been a significant advancement.
Difference between AI and Physics-Based Methods
- In physics-based methods, researchers try to understand the biophysics and biology behind how proteins fold.
- In AI-based methods, the software is shown a large number of proteins so that it learns what folds correctly and what doesn't. This bypasses the explicit thermodynamic calculations.
Examples and Learning
- The AI does not tell you why a protein is good; it only tells you that it is good.
Structure Prediction with AI
- Tools like AlphaFold and RoseTTAFold can quickly predict protein structures from sequences.
- The process is much faster and more accurate than traditional methods.
Inversion of Prediction
- AI networks can be inverted to generate new molecules with desired properties.
- By tweaking sequences, one can optimize properties like folding confidence.
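The idea of tweaking a sequence to raise a predicted property can be sketched as a simple hill-climbing loop. The scoring function here is a made-up stand-in, not a real predictor: in practice the score would come from a structure prediction network's confidence output. The function names and the single-point mutation scheme are assumptions for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def optimize_sequence(start, score, n_steps=500, seed=0):
    """Hill climbing: mutate one position at a time, keep the change if the score improves."""
    rng = random.Random(seed)
    best = start
    best_score = score(best)
    for _ in range(n_steps):
        pos = rng.randrange(len(best))
        candidate = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

def mock_confidence(sequence):
    """Placeholder for a predictor's confidence. NOT a real model:
    it simply counts the fraction of residues from an arbitrary set."""
    return sum(aa in "AELK" for aa in sequence) / len(sequence)

start = "GGGGGGGG"
best, best_score = optimize_sequence(start, mock_confidence)
print(best, best_score)
```

Swapping `mock_confidence` for a call to a real predictor would turn this into a slow but genuine optimization loop; gradient-based inversion of the network is a more efficient alternative when the model is differentiable.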
Denoising Diffusion
- This method starts with random atoms and iteratively denoises them to generate a protein that interacts with a target.
Process
- Noise is added to a picture, and the software learns how to remove it.
- After training on millions of examples, the software can start from random noise and create a brand-new design.
- For proteins, the noise is added to the three-dimensional coordinates rather than to the pixels of a picture.
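The noising half of this process can be sketched directly on 3D coordinates. The formula used here, x_t = sqrt(a) * x_0 + sqrt(1 - a) * noise, is the forward step from standard denoising diffusion models; the notes do not specify which formulation the protein models use, so treat it as an assumption. The reverse (denoising) step needs a trained network and is not shown.

```python
import math
import random

def add_noise(coords, alpha_bar, rng):
    """Forward diffusion step on a list of (x, y, z) points.

    alpha_bar = 1.0 keeps the structure intact; alpha_bar = 0.0 gives pure Gaussian noise.
    """
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(keep * c + spread * rng.gauss(0.0, 1.0) for c in atom)
        for atom in coords
    ]

# Made-up coordinates for three points along a chain.
structure = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
rng = random.Random(0)

# Progressively noisier copies of the same structure.
for alpha_bar in (1.0, 0.9, 0.5, 0.0):
    print(alpha_bar, add_noise(structure, alpha_bar, rng))
```

During training, a network sees such noised copies and learns to recover the original; at generation time it is applied repeatedly, starting from pure noise.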
Adaptations
- The first generation of models represented proteins at the amino acid level, not the atom level.
- The newer generation works at the atom level, which allows finer control.
Small Molecule Discovery
Traditional Approaches
- With billions of small molecules on hand, each one had to be docked in an attempt to find the best ones.
Improved through Small AI Networks
- Docking is sped up with smaller networks trained to recognize the kinds of chemical groups the protein target favors, so that fewer molecules need to be docked in full.
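The speed-up from a cheap filter can be sketched as a two-stage screen: a fast approximate score ranks the whole library, and only a shortlist goes through the expensive docking step. Both scoring functions below are made-up stand-ins (molecules are just integer IDs, and lower scores mean better binding); in a real pipeline the cheap score would come from a trained network and the expensive one from a docking program.

```python
def screen(library, cheap_score, dock, n_keep=100, top_k=3):
    """Two-stage virtual screen: cheap ranking first, expensive docking on the shortlist."""
    shortlist = sorted(library, key=cheap_score)[:n_keep]
    docked = sorted(shortlist, key=dock)
    return docked[:top_k], len(shortlist)

def mock_dock(molecule_id):
    """Stand-in for an expensive docking score (lower is better). Entirely made up."""
    return abs(molecule_id - 500)

def mock_cheap_score(molecule_id):
    """Stand-in for a fast learned predictor: a coarse version of the docking score."""
    return abs(molecule_id - 500) // 50

library = range(1000)
hits, n_docked = screen(library, mock_cheap_score, mock_dock)
print(hits, n_docked)
```

Here only 100 of the 1000 molecules are docked, yet the best binders are still found, because the coarse filter is accurate enough to keep them on the shortlist. The quality of the result depends entirely on how well the cheap score tracks the expensive one.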
Co-Folding Method
- The structure of the protein target is predicted together with the molecule bound to it.
- AlphaFold 3 is given as an example.
Other Applications of AI
- Discovering new targets for diseases.
- Predicting molecule toxicity and permeability.
- Identifying markers for specific disease indications.
- Co-scientists: AI tools that assist in literature review and hypothesis building.