Docking glossary
Receptor or host or lock
The "receiving" molecule, most commonly a protein or other biopolymer.
Ligand or guest or key
The complementary partner molecule which binds to the receptor. Ligands are most often small molecules but could also be another biopolymer.
Docking
Computational simulation of a candidate ligand binding to a receptor.
Binding mode
The orientation of the ligand relative to the receptor as well as the conformation of the ligand and receptor when bound to each other.
Pose
A candidate binding mode.
Scoring
The process of evaluating a particular pose by counting the number of favorable intermolecular interactions such as hydrogen bonds and hydrophobic contacts.
Ranking
The process of classifying which ligands are most likely to interact favorably with a particular receptor based on the predicted free energy of binding.
Docking assessment (DA)
Procedure to quantify the predictive capability of a docking protocol.

In the fields of computational chemistry and molecular modelling, scoring functions are mathematical functions used to approximately predict the binding affinity between two molecules after they have been docked. Most commonly one of the molecules is a small organic compound such as a drug and the second is the drug's biological target such as a protein receptor.[1] Scoring functions have also been developed to predict the strength of intermolecular interactions between two proteins[2] or between protein and DNA.[3]
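
Binding affinity is usually reported either as a dissociation constant or as a binding free energy, and the two are linked by the standard thermodynamic relation ΔG° = RT ln(Kd / c°). A minimal Python sketch of that conversion follows; the function name is hypothetical, and a 1 M standard state and a temperature of 298.15 K are assumed.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Assumes a 1 M standard state, so the ratio Kd / c0 is numerically Kd.
    """
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# A 1 nM binder: tighter binding gives a more negative free energy.
nanomolar_binder = kd_to_delta_g(1e-9)
```

Under these assumptions a 1 nM ligand corresponds to roughly -12.3 kcal/mol.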

Utility

Scoring functions are widely used in drug discovery and other molecular modelling applications. These include:[4]

Free energy perturbation calculations are a potentially more reliable, but much more computationally demanding, alternative to scoring functions.[8]

Prerequisites

Scoring functions are normally parameterized (or trained) against a data set consisting of experimentally determined binding affinities between molecular species similar to the species that one wishes to predict.
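
As an illustration of this parameterization step, the sketch below fits the weights of a hypothetical linear scoring function to a toy training set by ordinary least squares. The descriptors (a constant term, a hydrogen-bond count and a hydrophobic-contact count), the "true" weights and the affinities are all invented for the example and do not come from any published scoring function.

```python
def linear_score(weights, descriptors):
    """Predicted affinity as a weighted sum of descriptor values."""
    return sum(w * d for w, d in zip(weights, descriptors))


def fit_linear_weights(features, affinities):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    n = len(features[0])
    xtx = [[sum(row[a] * row[b] for row in features) for b in range(n)]
           for a in range(n)]
    xty = [sum(row[a] * y for row, y in zip(features, affinities))
           for a in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [xtx[a] + [xty[a]] for a in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    weights = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * weights[c] for c in range(r + 1, n))
        weights[r] = (aug[r][n] - known) / aug[r][r]
    return weights


# Toy training set: [constant, hydrogen bonds, hydrophobic contacts] per complex.
training_features = [[1, 2, 1], [1, 0, 3], [1, 4, 0], [1, 1, 1]]
true_weights = [-1.0, -0.5, -0.2]  # invented values, used only to generate data
training_affinities = [linear_score(true_weights, row) for row in training_features]

fitted_weights = fit_linear_weights(training_features, training_affinities)
```

Because the toy affinities are generated without noise, the fit recovers the invented weights; real training sets are noisy and far larger, and practical scoring functions use many more terms.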

For current methods that aim to predict the affinities of ligands for proteins, the following must first be known or predicted:

The above information yields the three-dimensional structure of the complex. Based on this structure, the scoring function can then estimate the strength of the association between the two molecules in the complex using one of the methods outlined below. Finally, the scoring function itself may be used to help predict both the binding mode and the active conformation of the small molecule in the complex; alternatively, a simpler and computationally faster function may be used within the docking run.
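
One way to read the last point is as a two-stage procedure: a cheap function shortlists poses during the docking run, and a costlier function then rescores the shortlist. The Python sketch below illustrates that reading only; the pose fields, both scoring functions and the convention that lower scores are better are assumptions made for the example.

```python
def dock_then_rescore(poses, fast_score, accurate_score, keep=3):
    """Shortlist poses with a cheap function, then rescore them with a costlier one."""
    shortlisted = sorted(poses, key=fast_score)[:keep]  # lower score = better
    best_pose = min(shortlisted, key=accurate_score)
    return best_pose, accurate_score(best_pose)


# Hypothetical poses described by two made-up quantities.
toy_poses = [
    {"name": "p1", "contacts": 8, "clash": 0.5},
    {"name": "p2", "contacts": 7, "clash": 0.0},
    {"name": "p3", "contacts": 6, "clash": 0.1},
    {"name": "p4", "contacts": 2, "clash": 0.0},
]


def toy_fast_score(pose):
    """Cheap stand-in: rewards contacts only."""
    return -pose["contacts"]


def toy_accurate_score(pose):
    """Costlier stand-in: rewards contacts and penalizes steric clashes."""
    return -pose["contacts"] + 10 * pose["clash"]


best_pose, best_score = dock_then_rescore(toy_poses, toy_fast_score, toy_accurate_score)
```

In this toy case the fast function alone would prefer p1, while rescoring the shortlist selects p2, which has no clash penalty.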

Classes

There are four general classes of scoring functions:[9][10][11]

The first three types (force-field, empirical and knowledge-based) are commonly referred to as classical scoring functions and are characterized by the assumption that the individual contributions to binding combine linearly. Because of this constraint, classical scoring functions are unable to take advantage of large amounts of training data.[35]
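
Schematically, the linearity assumption can be written as follows. The notation is illustrative rather than taken from the cited sources: each f_i stands for one energy or interaction term computed from the structure of the complex, and each w_i for its fitted weight.

```latex
\Delta G_{\text{bind}} \approx w_0 + \sum_{i=1}^{n} w_i \, f_i(\text{complex})
```

However many complexes are available for training, a model of this form can adjust only the n + 1 weights.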

Refinement

Because the outputs of different scoring functions are relatively collinear, combining them into a consensus score may not improve accuracy significantly.[36] This finding ran somewhat counter to the prevailing view in the field, since earlier studies had suggested that consensus scoring was beneficial.[37]
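
The effect of collinearity can be illustrated with a small Python sketch that measures the correlation between two sets of scores and builds a simple rank-averaging consensus. The ligand names and scores are invented, lower scores are assumed to be better, and rank averaging is only one of several possible consensus strategies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(scores):
    """Map each ligand to its rank (1 = best), assuming lower scores are better."""
    ordered = sorted(scores, key=scores.get)
    return {ligand: position for position, ligand in enumerate(ordered, start=1)}


def consensus_order(*score_tables):
    """Order ligands by their mean rank across several scoring functions."""
    rank_tables = [ranks(table) for table in score_tables]
    ligands = list(score_tables[0])
    mean_rank = {lig: sum(t[lig] for t in rank_tables) / len(rank_tables)
                 for lig in ligands}
    return sorted(ligands, key=mean_rank.get)


# Invented scores from two hypothetical scoring functions.
scores_a = {"lig1": -9.1, "lig2": -7.4, "lig3": -8.2, "lig4": -5.0}
scores_b = {"lig1": -8.5, "lig2": -7.0, "lig3": -7.9, "lig4": -4.8}

ligand_names = list(scores_a)
correlation = pearson([scores_a[l] for l in ligand_names],
                      [scores_b[l] for l in ligand_names])
consensus = consensus_order(scores_a, scores_b)
single = sorted(scores_a, key=scores_a.get)
```

Here the two score sets are almost perfectly correlated, so the consensus ordering is identical to the ordering from either function alone and adds no new information.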

A perfect scoring function would be able to predict the binding free energy between the ligand and its target. In practice, however, both the available computational methods and the available computational resources limit how closely this goal can be approached. Most often, therefore, methods are selected that minimize the number of false-positive and false-negative ligands. Where an experimental training set of binding constants and structures is available, a simple method has been developed to refine the scoring function used in molecular docking.[38]
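
Counting false positives and false negatives for a given score cutoff can be sketched in Python as below. This is not the refinement method of the cited study; the scores, the set of known actives and the rule that a ligand is predicted to bind when its score is at or below the cutoff are assumptions made for illustration.

```python
def classify_hits(scores, known_actives, threshold):
    """Confusion counts for the rule 'score <= threshold means predicted binder'."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for ligand, score in scores.items():
        predicted_active = score <= threshold
        is_active = ligand in known_actives
        if predicted_active and is_active:
            counts["TP"] += 1
        elif predicted_active:
            counts["FP"] += 1
        elif is_active:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def pick_threshold(scores, known_actives):
    """Cutoff (taken from the observed scores) with the fewest FP + FN."""
    def errors(threshold):
        counts = classify_hits(scores, known_actives, threshold)
        return counts["FP"] + counts["FN"]
    return min(sorted(scores.values()), key=errors)


# Invented docking scores and experimentally confirmed actives.
toy_scores = {"lig1": -9.1, "lig2": -7.4, "lig3": -8.2, "lig4": -5.0, "lig5": -8.8}
toy_actives = {"lig1", "lig2"}

counts_at_minus_8 = classify_hits(toy_scores, toy_actives, threshold=-8.0)
best_threshold = pick_threshold(toy_scores, toy_actives)
```

With these toy numbers a cutoff of -8.0 gives one true positive, two false positives, one false negative and one true negative, which shows how a single cutoff trades one kind of error against the other.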

References

  1. ^ Jain AN (October 2006). "Scoring functions for protein-ligand docking". Current Protein & Peptide Science. 7 (5): 407–20. doi:10.2174/138920306778559395. PMID 17073693.
  2. ^ Lensink MF, Méndez R, Wodak SJ (December 2007). "Docking and scoring protein complexes: CAPRI 3rd Edition". Proteins. 69 (4): 704–18. doi:10.1002/prot.21804. PMID 17918726. S2CID 25383642.
  3. ^ Robertson TA, Varani G (February 2007). "An all-atom, distance-dependent scoring function for the prediction of protein-DNA interactions from structure". Proteins. 66 (2): 359–74. doi:10.1002/prot.21162. PMID 17078093. S2CID 24437518.
  4. ^ Rajamani R, Good AC (May 2007). "Ranking poses in structure-based lead discovery and optimization: current trends in scoring function development". Current Opinion in Drug Discovery & Development. 10 (3): 308–15. PMID 17554857.
  5. ^ Seifert MH, Kraus J, Kramer B (May 2007). "Virtual high-throughput screening of molecular databases". Current Opinion in Drug Discovery & Development. 10 (3): 298–307. PMID 17554856.
  6. ^ a b Böhm HJ (July 1998). "Prediction of binding constants of protein ligands: a fast method for the prioritization of hits obtained from de novo design or 3D database search programs". Journal of Computer-Aided Molecular Design. 12 (4): 309–23. Bibcode:1998JCAMD..12..309B. doi:10.1023/A:1007999920146. PMID 9777490. S2CID 7474036.
  7. ^ Joseph-McCarthy D, Baber JC, Feyfant E, Thompson DC, Humblet C (May 2007). "Lead optimization via high-throughput molecular docking". Current Opinion in Drug Discovery & Development. 10 (3): 264–74. PMID 17554852.
  8. ^ Foloppe N, Hubbard R (2006). "Towards predictive ligand design with free-energy based computational methods?". Current Medicinal Chemistry. 13 (29): 3583–608. doi:10.2174/092986706779026165. PMID 17168725.
  9. ^ Fenu LA, Lewis RA, Good AC, Bodkin M, Essex JW (2007). "Chapter 9: Scoring Functions: From Free-energies of Binding to Enrichment in Virtual Screening". In Dhoti H, Leach AR (eds.). Structure-Based Drug Discovery. Dordrecht: Springer. pp. 223–246. ISBN 978-1-4020-4407-6.
  10. ^ Sotriffer C, Matter H (2011). "Chapter 7.3: Classes of Scoring Functions". In Sotriffer C (ed.). Virtual Screening: Principles, Challenges, and Practical Guidelines. Vol. 48. John Wiley & Sons, Inc. ISBN 978-3-527-63334-0.
  11. ^ a b c Ain QU, Aleksandrova A, Roessler FD, Ballester PJ (2015-11-01). "Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening". Wiley Interdisciplinary Reviews: Computational Molecular Science. 5 (6): 405–424. doi:10.1002/wcms.1225. PMC 4832270. PMID 27110292.
  12. ^ Genheden S, Ryde U (May 2015). "The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities". Expert Opinion on Drug Discovery. 10 (5): 449–61. doi:10.1517/17460441.2015.1032936. PMC 4487606. PMID 25835573.
  13. ^ Schneider N, Lange G, Hindle S, Klein R, Rarey M (January 2013). "A consistent description of HYdrogen bond and DEhydration energies in protein-ligand complexes: methods behind the HYDE scoring function". Journal of Computer-Aided Molecular Design. 27 (1): 15–29. Bibcode:2013JCAMD..27...15S. doi:10.1007/s10822-012-9626-2. PMID 23269578. S2CID 1545277.
  14. ^ Lange G, Lesuisse D, Deprez P, Schoot B, Loenze P, Bénard D, Marquette JP, Broto P, Sarubbi E, Mandine E (November 2003). "Requirements for specific binding of low affinity inhibitor fragments to the SH2 domain of (pp60)Src are identical to those for high affinity binding of full length inhibitors". Journal of Medicinal Chemistry. 46 (24): 5184–95. doi:10.1021/jm020970s. PMID 14613321.
  15. ^ Muegge I (October 2006). "PMF scoring revisited". Journal of Medicinal Chemistry. 49 (20): 5895–902. doi:10.1021/jm050038s. PMID 17004705.
  16. ^ Ballester PJ, Mitchell JB (May 2010). "A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking". Bioinformatics. 26 (9): 1169–75. doi:10.1093/bioinformatics/btq112. PMC 3524828. PMID 20236947.
  17. ^ Li H, Leung KS, Wong MH, Ballester PJ (February 2015). "Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets". Molecular Informatics. 34 (2–3): 115–26. doi:10.1002/minf.201400132. PMID 27490034. S2CID 3444365.
  18. ^ Ashtawy HM, Mahapatra NR (2015-04-01). "A Comparative Assessment of Predictive Accuracies of Conventional and Machine Learning Scoring Functions for Protein-Ligand Binding Affinity Prediction". IEEE/ACM Transactions on Computational Biology and Bioinformatics. 12 (2): 335–47. doi:10.1109/TCBB.2014.2351824. PMID 26357221.
  19. ^ Zhan W, Li D, Che J, Zhang L, Yang B, Hu Y, Liu T, Dong X (March 2014). "Integrating docking scores, interaction profiles and molecular descriptors to improve the accuracy of molecular docking: toward the discovery of novel Akt1 inhibitors". European Journal of Medicinal Chemistry. 75: 11–20. doi:10.1016/j.ejmech.2014.01.019. PMID 24508830.
  20. ^ Kinnings SL, Liu N, Tonge PJ, Jackson RM, Xie L, Bourne PE (February 2011). "A machine learning-based method to improve docking scoring functions and its application to drug repurposing". Journal of Chemical Information and Modeling. 51 (2): 408–19. doi:10.1021/ci100369f. PMC 3076728. PMID 21291174.
  21. ^ Li H, Sze K-H, Lu G, Ballester PJ (2020-02-05). "Machine-Learning Scoring Functions for Structure-Based Drug Lead Optimization". Wiley Interdisciplinary Reviews: Computational Molecular Science. 10 (5). doi:10.1002/wcms.1465.
  22. ^ Li L, Wang B, Meroueh SO (September 2011). "Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries". Journal of Chemical Information and Modeling. 51 (9): 2132–8. doi:10.1021/ci200078f. PMC 3209528. PMID 21728360.
  23. ^ Durrant JD, Friedman AJ, Rogers KE, McCammon JA (July 2013). "Comparing neural-network scoring functions and the state of the art: applications to common library screening". Journal of Chemical Information and Modeling. 53 (7): 1726–35. doi:10.1021/ci400042y. PMC 3735370. PMID 23734946.
  24. ^ Ding B, Wang J, Li N, Wang W (January 2013). "Characterization of small molecule binding. I. Accurate identification of strong inhibitors in virtual screening". Journal of Chemical Information and Modeling. 53 (1): 114–22. doi:10.1021/ci300508m. PMC 3584174. PMID 23259763.
  25. ^ Wójcikowski M, Ballester PJ, Siedlecki P (April 2017). "Performance of machine-learning scoring functions in structure-based virtual screening". Scientific Reports. 7: 46710. Bibcode:2017NatSR...746710W. doi:10.1038/srep46710. PMC 5404222. PMID 28440302.
  26. ^ Ragoza M, Hochuli J, Idrobo E, Sunseri J, Koes DR (April 2017). "Protein-Ligand Scoring with Convolutional Neural Networks". Journal of Chemical Information and Modeling. 57 (4): 942–957. arXiv:1612.02751. doi:10.1021/acs.jcim.6b00740. PMC 5479431. PMID 28368587.
  27. ^ Li H, Peng J, Leung Y, Leung KS, Wong MH, Lu G, Ballester PJ (March 2018). "The Impact of Protein Structure and Sequence Similarity on the Accuracy of Machine-Learning Scoring Functions for Binding Affinity Prediction". Biomolecules. 8 (1): 12. doi:10.3390/biom8010012. PMC 5871981. PMID 29538331.
  28. ^ Imrie F, Bradley AR, Deane CM (February 2021). "Generating Property-Matched Decoy Molecules Using Deep Learning". Bioinformatics. 37 (btab080): 2134–2141. doi:10.1093/bioinformatics/btab080. PMC 8352508. PMID 33532838.
  29. ^ Adeshina YO, Deeds EJ, Karanicolas J (August 2020). "Machine learning classification can reduce false positives in structure-based virtual screening". Proceedings of the National Academy of Sciences of the United States of America. 117 (31): 18477–18488. Bibcode:2020PNAS..11718477A. doi:10.1073/pnas.2000585117. PMC 7414157. PMID 32669436.
  30. ^ Xiong GL, Ye WL, Shen C, Lu AP, Hou TJ, Cao DS (June 2020). "Improving structure-based virtual screening performance via learning from scoring function components". Briefings in Bioinformatics. 22 (bbaa094). doi:10.1093/bib/bbaa094. PMID 32496540.
  31. ^ Shen C, Ding J, Wang Z, Cao D, Ding X, Hou T (2019-06-27). "From Machine Learning to Deep Learning: Advances in Scoring Functions for Protein–ligand Docking". Wiley Interdisciplinary Reviews: Computational Molecular Science. 10. doi:10.1002/wcms.1429. S2CID 198336898.
  32. ^ Yang X, Wang Y, Byrne R, Schneider G, Yang S (2019-07-11). "Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery". Chemical Reviews. 119 (18): 10520–10594. doi:10.1021/acs.chemrev.8b00728. PMID 31294972.
  33. ^ Li H, Sze K-H, Lu G, Ballester PJ (2020-04-22). "Machine-Learning Scoring Functions for Structure-Based Virtual Screening". Wiley Interdisciplinary Reviews: Computational Molecular Science. 11. doi:10.1002/wcms.1478. S2CID 219089637.
  34. ^ Ballester PJ (December 2019). "Selecting machine-learning scoring functions for structure-based virtual screening". Drug Discovery Today: Technologies. 32–33: 81–87. doi:10.1016/j.ddtec.2020.09.001. PMID 33386098. S2CID 224968364.
  35. ^ Li H, Peng J, Sidorov P, Leung Y, Leung KS, Wong MH, Lu G, Ballester PJ (March 2019). "Classical scoring functions for docking are unable to exploit large volumes of structural and interaction data". Bioinformatics. Oxford, England. 35 (20): 3989–3995. doi:10.1093/bioinformatics/btz183. PMID 30873528.
  36. ^ Englebienne P, Moitessier N (June 2009). "Docking ligands into flexible and solvated macromolecules. 4. Are popular scoring functions accurate for this class of proteins?". Journal of Chemical Information and Modeling. 49 (6): 1568–80. doi:10.1021/ci8004308. PMID 19445499.
  37. ^ Oda A, Tsuchida K, Takakura T, Yamaotsu N, Hirono S (2006). "Comparison of consensus scoring strategies for evaluating computational models of protein-ligand complexes". Journal of Chemical Information and Modeling. 46 (1): 380–91. doi:10.1021/ci050283k. PMID 16426072.
  38. ^ Hellgren M, Carlsson J, Ostberg LJ, Staab CA, Persson B, Höög JO (September 2010). "Enrichment of ligands with molecular dockings and subsequent characterization for human alcohol dehydrogenase 3". Cellular and Molecular Life Sciences. 67 (17): 3005–15. doi:10.1007/s00018-010-0370-2. PMID 20405162. S2CID 2391130.