BMM Module 12: Molecular Docking Simulations


19 Terms

1

Docking simulations

Simulation of the binding between a small molecule and a protein; the output is a set of protein-ligand complexes (poses), each with an assigned score

2

Docking simulations differ by

Scoring function

  • Force field based

  • Empirical

  • Knowledge based

Algorithm

  • Deterministic = predefined procedure with reproducible outcome

    • pros: fast and reproducible outcome

    • cons: may miss solutions

    • examples

      • Brute force

      • Shape fitting

      • incremental construction

  • Stochastic: includes randomness so different experiments can give different results

    • pros: better exploration of search space

    • examples

      • genetic docking

      • Monte Carlo

      • Tabu list search

3

Small molecule docking:
1 protein vs 1 ligand (class)

To understand how a ligand binds

Pose = different ways for a ligand to bind to a pocket

  1. The method works best when you dock several similar ligands whose binding is known

  2. Docking all of them creates many poses that can be clustered to identify common binding modes

  3. Analyze

    1. Is there a correlation between the score and the experimental affinity?

    2. Do differences in binding modes explain differences in affinity?
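The first analysis question can be checked with a rank correlation. This is a minimal sketch using only the standard library. The docking scores and pKd values are made-up illustration data, and Spearman correlation is an assumed choice (it only asks whether the ranking agrees, not whether the relation is linear).

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: docking scores (lower = better) and measured pKd (higher = tighter binding)
docking_scores = [-9.1, -8.4, -7.9, -7.2, -6.0]
experimental_pkd = [8.2, 7.9, 7.1, 7.3, 5.5]

rho = spearman(docking_scores, experimental_pkd)
print(f"Spearman rho = {rho:.2f}")
```

With this sign convention a strongly negative rho means the scores rank the ligands in roughly the same order as the experiment.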

4

Small molecule docking:

1 protein vs many different ligands

For drug discovery

virtual screen of 1k to 10M compounds to identify which will bind and select those with the highest predicted affinity

Challenges:

  1. As the number of ligands increases, the score distribution widens; the top-scoring compounds are not always active, and true hits can be missed
    Solution: consensus scoring = combine multiple scoring functions and select the compounds that score well in all of them

  2. Different ligands might prefer different poses and it might be hard to select the best ones

    1. Some scoring functions might give a high score to a pose that is not biologically relevant

    2. Solution: combine docking with other methods (MIFs, Pharmacophore modelling, similarity to other compounds)
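Consensus scoring, as described in challenge 1, can be sketched as a set intersection. The ligand names, the scores and the "top fraction" cutoff below are all hypothetical. Real workflows may use rank sums or votes instead of a strict intersection.

```python
def top_fraction(scores, fraction):
    """Return the names of the ligands in the best `fraction` (lower score = better)."""
    n_keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get)
    return set(ranked[:n_keep])

def consensus_hits(score_tables, fraction=0.5):
    """Keep only ligands that are in the top fraction for every scoring function."""
    selected = [top_fraction(table, fraction) for table in score_tables]
    return set.intersection(*selected)

# Hypothetical scores from three different scoring functions (lower = better)
force_field = {"lig_a": -9.0, "lig_b": -7.5, "lig_c": -8.8, "lig_d": -5.0}
empirical = {"lig_a": -8.1, "lig_b": -8.5, "lig_c": -7.9, "lig_d": -4.2}
knowledge = {"lig_a": -120.0, "lig_b": -60.0, "lig_c": -110.0, "lig_d": -130.0}

hits = consensus_hits([force_field, empirical, knowledge], fraction=0.5)
print(hits)
```

Only lig_a is in the top half of all three tables, so it is the only consensus hit in this toy example.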

5

Incremental construction algorithm

Deterministic: fast and reproducible

  1. fragment the ligand

  2. choose core fragment (largest/most rigid)

  3. Place core fragment in pocket and try all different orientations using shape complementarity

  4. Incrementally (one by one) add the remaining fragments back at their original positions, trying different torsions around the connecting bonds

  5. At each step only conformations that fit and avoid steric clashes are kept -> tree diagram

  6. when fully reassembled: you have a set of ligand poses that can be scored
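The growth-and-prune idea in steps 4 to 6 can be illustrated with a heavily simplified toy. It assumes a 2D chain in which each "fragment" is a single atom, a small set of allowed bend angles stands in for torsions, and the pocket is a list of blocking points. None of this comes from a real docking program.

```python
import math

def grow_poses(anchor, n_fragments, protein_atoms,
               angles_deg=(-60, 0, 60), bond_length=1.5, clash_distance=1.2):
    """Grow a 2D chain one fragment at a time, pruning branches that clash."""
    # Each partial pose is (list of atom positions, current heading in radians).
    partial = [([anchor], 0.0)]
    for _ in range(n_fragments):
        next_level = []
        for atoms, heading in partial:
            for angle in angles_deg:  # try every allowed "torsion"
                new_heading = heading + math.radians(angle)
                x, y = atoms[-1]
                new_atom = (x + bond_length * math.cos(new_heading),
                            y + bond_length * math.sin(new_heading))
                clash = any(math.dist(new_atom, p) < clash_distance
                            for p in protein_atoms)
                if not clash:  # only clash-free branches stay in the tree
                    next_level.append((atoms + [new_atom], new_heading))
        partial = next_level
    return [atoms for atoms, _ in partial]

# Hypothetical pocket: one protein atom blocking the straight-ahead direction
pocket_wall = [(1.5, 0.0)]
poses = grow_poses(anchor=(0.0, 0.0), n_fragments=2, protein_atoms=pocket_wall)
print(len(poses), "complete poses survive the pruning")
```

Without the blocking atom the tree would have 3 x 3 = 9 leaves. Pruning the clashing branch early removes all of its descendants, which is why the method is fast.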

6

Genetic docking algorithm

Stochastic: evolutionary principles with random variations

Treat poses as a population of organisms that evolve over time

Each pose (1 solution) is encoded as a chromosome containing genes that encode information like atom coordinates, orientation, torsion angles

  1. Start with a random population of chromosomes (diverse ligand poses)

  2. Evaluation: score each pose

    1. bad ones die

    2. good ones survive

    3. all poses where ligand binds outside of active site die, only the ones that bind in active site survive and reproduce

  3. Reproduction: surviving chromosomes can reproduce via:

    1. combining parts of 2 good solutions (crossover)

    2. randomly changing genes (mutation) = local search

  4. Repeat evaluation and reproduction for many generations

  5. Stop when:

    1. top solutions don’t change significantly (RMS difference)

    2. preset number of generations is reached
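The loop above can be sketched on a toy problem. Here a chromosome is a list of three numbers, and the scoring function is an invented stand-in (squared distance to a hypothetical best pose), not a real docking score. The active-site filter is left out, and the stop criterion is simply a preset number of generations.

```python
import random

def toy_score(chromosome, target=(1.0, -2.0, 0.5)):
    """Stand-in score: squared distance to a hypothetical best pose (lower = better)."""
    return sum((g - t) ** 2 for g, t in zip(chromosome, target))

def genetic_docking(pop_size=40, n_genes=3, generations=60, seed=0):
    rng = random.Random(seed)
    # 1. random starting population of chromosomes
    population = [[rng.uniform(-5, 5) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # 2. evaluation: the worse half dies, the better half survives
        population.sort(key=toy_score)
        survivors = population[: pop_size // 2]
        # 3. reproduction: crossover of two survivors plus one random mutation
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_genes)
            child = mother[:cut] + father[cut:]
            child[rng.randrange(n_genes)] += rng.gauss(0, 0.3)
            children.append(child)
        population = survivors + children
    return min(population, key=toy_score)

best_pose = genetic_docking()
print(best_pose, toy_score(best_pose))
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next. Changing `seed` gives a different run, which is the stochastic behaviour the card describes.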

7

Similarities and differences: genetic algorithm (GA) and Monte Carlo (MC)

Similarities

  • Both are stochastic methods (use random sampling to explore the search space)

  • Both are used to sample the potential energy surface (PES): to find low-energy protein folds, and in docking to find low-energy protein-ligand poses

  • Iterative processes: both generate and evaluate multiple candidate solutions over time

Differences

  • Goal

    • GA: find the best possible solution (optimization)

      • output = best found solution

    • MC: random sampling

      • output = distribution of potential outcomes

  • Approach:

    • GA works on populations of solutions

    • MC works on a single solution at a time

  • Mechanism

    • GA: solutions can share information via crossover

    • MC: No information exchange between samples

  • Evaluation

    • GA: uses score to compare and rank multiple solutions within a population (determines which survive and reproduce)

    • MC: uses the scoring function to probabilistically accept or reject a move relative to the current state (not to rank solutions)
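The MC accept/reject step can be sketched as follows. The card does not name the acceptance rule, so the Metropolis criterion is an assumption here, as are the temperature parameter and the one-dimensional toy score.

```python
import math
import random

def metropolis_accept(delta_score, temperature, rng):
    """Always accept improvements; accept worse moves with probability exp(-delta/T)."""
    if delta_score <= 0:
        return True
    return rng.random() < math.exp(-delta_score / temperature)

def monte_carlo_walk(score, start, steps=2000, step_size=0.5, temperature=1.0, seed=0):
    """Single-solution random walk; returns every visited state (the sample)."""
    rng = random.Random(seed)
    current = start
    visited = [current]
    for _ in range(steps):
        trial = current + rng.uniform(-step_size, step_size)  # small random move
        if metropolis_accept(score(trial) - score(current), temperature, rng):
            current = trial  # accepted: the trial becomes the current state
        visited.append(current)  # rejected moves repeat the current state
    return visited

# Hypothetical 1D "pose coordinate" with its energy minimum at x = 2
samples = monte_carlo_walk(lambda x: (x - 2.0) ** 2, start=0.0)
print(len(samples), "states sampled")
```

The output is the whole list of visited states, that is, a distribution, whereas the GA returns a single best solution.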

8

other algorithms

stochastic

  • monte carlo

  • tabu list search

deterministic

  • brute force

  • shape fitting

9

Tabu list search

  1. several initial ligand placements are generated randomly near the binding site

  2. The molecule is moved by small changes

  3. Each new position is scored

  4. moves or positions that appear in the Tabu List are not allowed

  5. move selection

    1. better-scoring moves are preferred

    2. worse moves may be chosen if no better move is available

  6. memory update

    1. the selected move is added to the tabu list

    2. moves on the list are temporarily forbidden, which prevents revisiting them

  7. the process repeats until convergence
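A toy version of these steps, assuming the ligand position is a point on an integer grid and the score is an invented bowl-shaped function. It starts from a single placement rather than several, and stops after a fixed number of steps instead of testing convergence.

```python
def tabu_search(score, start, n_steps=50, tabu_size=10):
    """Greedy neighbourhood search with a short-term memory of visited positions."""
    current = start
    best = start
    tabu = [start]
    for _ in range(n_steps):
        x, y = current
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]  # small moves
        allowed = [p for p in neighbours if p not in tabu]  # tabu positions are forbidden
        if not allowed:
            break
        # Take the best allowed move, even if it is worse than the current position
        current = min(allowed, key=score)
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)  # entries are only forbidden temporarily
        if score(current) < score(best):
            best = current
    return best

def toy_score(position):
    """Hypothetical score with its minimum at grid point (3, -2)."""
    return (position[0] - 3) ** 2 + (position[1] + 2) ** 2

best_position = tabu_search(toy_score, start=(0, 0))
print(best_position)
```

Accepting worse moves while forbidding recent positions is what lets the search leave a local minimum instead of oscillating around it.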

10

Brute force

Try every possible orientation/position of the ligand
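A minimal sketch of what "every possible" means in practice: the search space has to be discretised into a grid, and every grid point is scored. The grid values, the single rotation angle and the toy score are assumptions for illustration only.

```python
import itertools

def brute_force_dock(score, positions, angles_deg):
    """Score every (x, y, z, angle) combination on the grid and return the best."""
    candidates = itertools.product(positions, positions, positions, angles_deg)
    return min(candidates, key=score)

def toy_score(pose):
    """Hypothetical score with its minimum at x=1, y=0, z=-1, angle=90."""
    x, y, z, angle = pose
    return (x - 1.0) ** 2 + y ** 2 + (z + 1.0) ** 2 + abs(angle - 90)

positions = [-1.0, 0.0, 1.0]          # coarse translation grid
angles = list(range(0, 360, 90))      # coarse rotation grid
n_poses = len(positions) ** 3 * len(angles)

best_pose = brute_force_dock(toy_score, positions, angles)
print(n_poses, "poses scored, best:", best_pose)
```

The number of poses is the product of the grid sizes, so refining the grid or adding torsions makes the count grow very quickly.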

11

shape fitting

  1. generate a low energy conformation of the ligand

  2. fit the conformation in the pocket based on geometric constraints

  3. optimize and score the fit

often used for fast screening

12

Scoring function

Biophysical formulas that describe the quality of the fit between a docked molecule and its receptor

  • quality score during docking to guide the algorithm towards better poses

  • quantitative score to rank docked molecules according to binding strength after docking

13

Force Field based scoring function

Only takes non-bonded interactions between protein and ligand into account (gives energetic values)

cons:

  • no entropic contribution

  • no inclusion of water models

    • water can mediate key ligand-receptor interactions (H-bonds)

  • FF parameters are hard to tune for a specific target

    • FFs rely on parameters like partial charges, atom types, … which are general and not tailored to a specific protein, so the FF may not describe the interactions accurately for a particular target
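A minimal sketch of a non-bonded score, assuming a Lennard-Jones 12-6 term plus a Coulomb term summed over all protein-ligand atom pairs. Using one epsilon/sigma pair for every atom type is a simplification made here (real force fields assign them per atom type), and the coordinates and charges are invented.

```python
import math

# Approximate Coulomb constant in kcal*angstrom/(mol*e^2), as commonly used in MM force fields
COULOMB_CONSTANT = 332.06

def pair_energy(distance, epsilon, sigma, charge_i, charge_j, dielectric=1.0):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair (kcal/mol)."""
    ratio6 = (sigma / distance) ** 6
    lennard_jones = 4.0 * epsilon * (ratio6 ** 2 - ratio6)
    coulomb = COULOMB_CONSTANT * charge_i * charge_j / (dielectric * distance)
    return lennard_jones + coulomb

def force_field_score(ligand_atoms, protein_atoms, epsilon=0.1, sigma=3.4):
    """Sum the non-bonded energy over all ligand-protein pairs.

    Each atom is ((x, y, z), partial_charge).
    """
    total = 0.0
    for ligand_pos, ligand_charge in ligand_atoms:
        for protein_pos, protein_charge in protein_atoms:
            distance = math.dist(ligand_pos, protein_pos)
            total += pair_energy(distance, epsilon, sigma, ligand_charge, protein_charge)
    return total

# Hypothetical two-atom example: opposite partial charges 4 angstrom apart
ligand = [((0.0, 0.0, 0.0), -0.5)]
protein = [((4.0, 0.0, 0.0), 0.5)]
print(force_field_score(ligand, protein))
```

Nothing in this sum depends on water or on entropy, which is exactly the limitation listed under "cons".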

14

Empirical scoring function

= estimates protein-ligand binding by summing weighted interaction terms (H-bonds, hydrophobic,…) where the weights are obtained by fitting to experimental binding data of known protein-ligand complexes using regression methods (= adjusting the weights to match the experimental data)

Free energy terms

  • $$\Delta G_x$$ = contains all unknown contributions, learned through regression

  • polar interactions: H-bonds and ionic

  • apolar interactions: aromatic and lipophilic

  • entropic effects:

    • desolvation effects (removal of water upon binding)

    • loss of ligand flexibility: as the ligand binds, the number of freely rotatable bonds decreases (~N_rot)

cons

  • need a training set

  • scores are only good if problem resembles training set

$$Weights=\sum f\left(\Delta R,\Delta\alpha\right)$$

$$\Delta R,\Delta\alpha$$ -> the more a contact deviates from the ideal distance/angle, the worse the score; how much worse is determined by the weights
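How the weights are "learned through regression" can be sketched with ordinary least squares. The three descriptors (H-bond count, lipophilic contact area, rotatable bonds), the training complexes and the "true" weights used to generate the synthetic affinities are all hypothetical. Real functions are fitted to measured binding data.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_weights(features, affinities):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 = constant term
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, affinities)) for i in range(n)]
    return solve_linear(xtx, xty)

def empirical_score(weights, feature_row):
    """Weighted sum of interaction terms plus the constant term."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature_row))

# Hypothetical training set: (n_hbonds, lipophilic_area, n_rotatable_bonds) per complex
training_features = [(2, 100, 3), (4, 150, 5), (1, 200, 2), (3, 50, 6), (5, 120, 1)]
true_weights = [-1.0, -1.2, -0.02, 0.3]  # synthetic, used only to generate toy affinities
training_affinities = [empirical_score(true_weights, f) for f in training_features]

fitted = fit_weights(training_features, training_affinities)
print([round(w, 3) for w in fitted])
```

Because the toy affinities were generated from `true_weights` without noise, the fit recovers them. With real data the fit is only as good as the training set, which is the limitation listed under "cons".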

15

Knowledge based

uses statistics from known crystal structures to determine how favorable an interaction is

  1. Get statistics from the Protein Data Bank: how often does atom type i interact with atom type j at a certain distance

  2. build histograms of distance vs frequency of interaction

  3. convert frequency to energy (more common interactions = lower energy)

  4. apply a scaling factor: some atom types occur more often in crystal structures than others

The resulting score ranks the quality of the predicted poses (it is not a true energy)

Rosetta score is a scoring function for folding and design and is a knowledge based scoring function
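Steps 2 to 4 can be sketched with the inverse Boltzmann relation, E(r) = -kT ln(p_observed(r) / p_reference(r)). The card does not name the conversion, so this relation is an assumption, and the histogram counts are invented. The reference histogram plays the role of the scaling factor in step 4.

```python
import math

def inverse_boltzmann(observed_counts, reference_counts, kt=0.593):
    """Convert a distance histogram into pseudo-energies per bin.

    kt defaults to roughly kT in kcal/mol at room temperature.
    Bins with no observed or reference counts are returned as None.
    """
    total_observed = sum(observed_counts)
    total_reference = sum(reference_counts)
    energies = []
    for observed, reference in zip(observed_counts, reference_counts):
        if observed == 0 or reference == 0:
            energies.append(None)
            continue
        p_observed = observed / total_observed
        p_reference = reference / total_reference
        energies.append(-kt * math.log(p_observed / p_reference))
    return energies

# Hypothetical counts for one atom-type pair in four distance bins
observed = [5, 40, 20, 15]
reference = [20, 20, 20, 20]  # what would be expected without any preference
energies = inverse_boltzmann(observed, reference)
print([round(e, 3) for e in energies])
```

The second bin is seen twice as often as expected, so it gets the lowest pseudo-energy. Bins seen less often than expected get a positive value.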

16

Challenges for scoring functions

  1. water

    • Water can form up to 4 H-bonds and a lot of protein-ligand interactions are mediated by water

    • but this is hard to model

    • In general, important waters form more than 2 H-bonds

  2. Induced fit

    • Most docking software keeps the protein rigid while flexible ligands are docked, but in reality the protein can change conformation as a ligand binds = induced fit

    • Solution:

      1. Allow rotation of polar H’s or alternative rotamers (= alternative side chain conformations)

      2. Energy minimize after docking

      3. run MD for realistic flexibility

  3. Over and underscoring

    • scoring functions are additive so more interactions means a higher score

    • Big molecules in general have higher scores because they have more contact points

    • So it’s hard to compare molecules of different sizes

  4. Docking produces many possible poses

    • false positives= bad pose with high score

    • false negatives = good pose with low score

    • solution: consensus scoring: combine multiple scoring functions

  5. solvation and desolvation are not taken into account

17

Advanced fix to scoring function challenges

Use MM-PBSA after scoring:

  1. includes solvent

  2. accounts for some induced fit through MD simulations

  3. less dependent on the number of atoms in the ligand (less overscoring)

  4. includes entropy and desolvation effects (the energetic consequences of removing water from both the ligand and the receptor when they bind)

18

MM-PBSA: concept

= post-processing method used to estimate the binding free energy of a ligand to a receptor more accurately than simple scoring functions

  • MM = molecular mechanics: calculates the internal energies of the ligand, the receptor, and the complex

    • bonded + non-bonded terms

  • PBSA = Poisson-Boltzmann surface area
    estimates 2 solvation effects

    • polar solvation = electrostatic stabilization by water, computed with the Poisson-Boltzmann equation

    • apolar solvation = energetic penalty of creating a cavity in water (related to the solvent-accessible surface area)

  • the binding free energy is computed
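The last bullet can be written out as an equation. This is a sketch of the usual MM-PBSA decomposition; the symbol names are chosen here and do not come from the card. Each G is typically averaged over snapshots from an MD simulation, and the entropy term is often estimated separately or left out.

```latex
\[
\begin{aligned}
\Delta G_{\text{bind}} &= G_{\text{complex}} - G_{\text{receptor}} - G_{\text{ligand}} \\
G &= E_{\text{MM}} + G_{\text{PB}} + G_{\text{SA}} - TS \\
E_{\text{MM}} &= E_{\text{bonded}} + E_{\text{non-bonded}}
\end{aligned}
\]
```

Here G_PB is the polar solvation term and G_SA the apolar (surface area) term from the bullets above.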
