Scoring functions for docking


In the fields of computational chemistry and molecular modelling, scoring functions are mathematical functions used to approximately predict the binding affinity between two molecules after they have been docked. Most commonly one of the molecules is a small organic compound such as a drug and the second is the drug's biological target such as a protein receptor. Scoring functions have also been developed to predict the strength of intermolecular interactions between two proteins or between protein and DNA.

Utility

Scoring functions are widely used in drug discovery and other molecular modelling applications.
A potentially more reliable but much more computationally demanding alternative to scoring functions is the free energy perturbation calculation.

Prerequisites

Scoring functions are normally parameterized against a data set consisting of experimentally determined binding affinities between molecular species similar to the species that one wishes to predict.
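For currently used methods that aim to predict the affinities of ligands for proteins, the three-dimensional structure of the protein, the active conformation of the ligand and its binding mode must first be known or predicted.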
The above information yields the three-dimensional structure of the complex. Based on this structure, the scoring function can then estimate the strength of the association between the two molecules in the complex using one of the methods outlined below. Finally, the scoring function itself may be used to help predict both the binding mode and the active conformation of the small molecule in the complex, or alternatively a simpler and computationally faster function may be used within the docking run.

Classes

There are four general classes of scoring functions: force-field, empirical, knowledge-based and machine-learning.
The first three types, force-field, empirical and knowledge-based, are commonly referred to as classical scoring functions. They are characterized by the assumption that the individual contributions to binding combine linearly. Because of this constraint, classical scoring functions are unable to take advantage of large amounts of training data.
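The linear assumption behind classical scoring functions can be illustrated with a minimal sketch. The three interaction terms, their values and the "true" weights below are hypothetical and chosen only for illustration; real scoring functions use their own terms and training sets. The weights are fitted by ordinary least squares, which is one common way to parameterize a linear model against known affinities.

```python
import numpy as np

def fit_linear_weights(terms, affinities):
    """Least-squares fit of weights w so that terms @ w approximates affinities."""
    X = np.asarray(terms, dtype=float)
    y = np.asarray(affinities, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def linear_score(weights, term_values):
    """Score a complex as a weighted sum of its interaction terms."""
    return float(np.dot(weights, term_values))

# Hypothetical training set: one row per complex, columns are
# [hydrogen_bond_count, lipophilic_contact_area, rotatable_bond_count].
train_terms = np.array([
    [2.0, 10.0, 3.0],
    [4.0, 5.0, 1.0],
    [1.0, 20.0, 6.0],
    [3.0, 8.0, 2.0],
])

# Synthetic, noise-free "experimental" affinities generated from assumed weights,
# so the fit can be checked against a known answer.
true_weights = np.array([-1.0, -0.1, 0.3])
train_affinities = train_terms @ true_weights

weights = fit_linear_weights(train_terms, train_affinities)
predicted = linear_score(weights, [2.0, 12.0, 4.0])
```

Because the model has only one weight per term, adding more training complexes refines the weights but cannot capture interactions between terms, which is the limitation described above.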

Refinement

Since different scoring functions are relatively co-linear, consensus scoring functions may not improve accuracy significantly. This claim went somewhat against the prevailing view in the field, since previous studies had suggested that consensus scoring was beneficial.
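The effect of co-linearity on consensus scoring can be seen in a small sketch. The scores below are invented, the convention that a more negative score is better is an assumption, and rank averaging is only one of several possible consensus schemes.

```python
import numpy as np

def ranks(scores):
    """Return ranks where 1 is the best (most negative) score. Ties are not handled."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    r = np.empty(len(scores), dtype=float)
    r[order] = np.arange(1, len(scores) + 1)
    return r

def consensus_rank(score_table):
    """Average the per-function ranks; rows are scoring functions, columns are ligands."""
    return np.mean([ranks(row) for row in score_table], axis=0)

# Hypothetical scores from two scoring functions for the same four ligands.
scores_a = [-9.1, -7.4, -8.2, -5.0]
scores_b = [-8.8, -7.9, -8.0, -5.5]

correlation = np.corrcoef(scores_a, scores_b)[0, 1]
consensus = consensus_rank([scores_a, scores_b])
```

In this example the two functions are strongly correlated and rank the ligands identically, so the consensus ranking adds no information beyond either function alone.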
A perfect scoring function would be able to predict the binding free energy between the ligand and its target. In reality, however, both the computational methods and the available computational resources place constraints on this goal. Most often, therefore, methods are selected that minimize the number of false positive and false negative ligands. In cases where an experimental training set of binding constants and structures is available, a simple method has been developed to refine the scoring function used in molecular docking.
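As an illustration of selecting for few false positives and false negatives, the following sketch counts both kinds of error at a score cutoff and picks the cutoff with the fewest errors. The scores, activity labels and the convention that lower scores indicate predicted binders are assumptions made for the example; this is not the refinement method referred to above.

```python
def confusion_counts(scores, is_active, cutoff):
    """Count (tp, fp, tn, fn) when ligands scoring at or below cutoff are predicted active."""
    tp = fp = tn = fn = 0
    for score, active in zip(scores, is_active):
        predicted_active = score <= cutoff
        if predicted_active and active:
            tp += 1
        elif predicted_active and not active:
            fp += 1
        elif not predicted_active and not active:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def best_cutoff(scores, is_active):
    """Return the observed score that minimizes false positives plus false negatives."""
    def errors(cutoff):
        _, fp, _, fn = confusion_counts(scores, is_active, cutoff)
        return fp + fn
    # Candidate cutoffs are tried from most to least negative; ties keep the first.
    return min(sorted(set(scores)), key=errors)

# Hypothetical docking scores with experimentally known activity labels.
scores = [-9.0, -8.5, -7.0, -6.5, -5.0]
is_active = [True, True, False, True, False]

cutoff = best_cutoff(scores, is_active)
counts = confusion_counts(scores, is_active, cutoff)
```

Weighting false positives and false negatives equally is a design choice; in a screening campaign where follow-up experiments are expensive, false positives might be penalized more heavily.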