DSSP (hydrogen bond estimation algorithm)


The DSSP algorithm is the standard method for assigning secondary structure to the amino acids of a protein, given the atomic-resolution coordinates of the protein. The abbreviation appears only once in the 1983 paper describing the algorithm, as the name of the Pascal program that implements it: Define Secondary Structure of Proteins.

Algorithm

DSSP begins by identifying the intra-backbone hydrogen bonds of the protein using a purely electrostatic definition, assigning partial charges of -0.42 e and +0.20 e to the carbonyl oxygen and amide hydrogen respectively, and their opposites to the carbonyl carbon and amide nitrogen. A hydrogen bond is identified if E in the following equation is less than -0.5 kcal/mol:
$$E = 0.084 \left( \frac{1}{r_{ON}} + \frac{1}{r_{CH}} - \frac{1}{r_{OH}} - \frac{1}{r_{CN}} \right) \cdot 332 \ \mathrm{kcal/mol}$$
where the r_{AB} terms indicate the distance (in ångströms) between atoms A and B, taken from the carbon (C) and oxygen (O) atoms of the C=O group and the nitrogen (N) and hydrogen (H) atoms of the N-H group. The factor 0.084 is the product of the two partial charge magnitudes, 0.42 × 0.20.
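The energy criterion can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: it assumes the electrostatic form E = q1 · q2 · f · (1/r_ON + 1/r_CH - 1/r_OH - 1/r_CN) with q1 = 0.42 e, q2 = 0.20 e and f = 332 Å·kcal/(mol·e²), and the function names `hbond_energy` and `is_hbond` are hypothetical. Coordinates are assumed to be (x, y, z) tuples in ångströms.

```python
import math

# Constants of the electrostatic hydrogen bond definition.
Q1 = 0.42       # partial charge magnitude on carbonyl C and O, in units of e
Q2 = 0.20       # partial charge magnitude on amide N and H, in units of e
F = 332.0       # dimensional factor, in Å·kcal/(mol·e²)
CUTOFF = -0.5   # kcal/mol; energies below this count as a hydrogen bond


def hbond_energy(c, o, n, h):
    """Electrostatic energy (kcal/mol) between a C=O group and an N-H group.

    Each argument is an (x, y, z) coordinate in ångströms.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(c, o, n, h):
    """True if the pair satisfies the -0.5 kcal/mol criterion."""
    return hbond_energy(c, o, n, h) < CUTOFF


# Hypothetical collinear geometry: C=O ... H-N with an O-H distance of 1.9 Å.
c, o = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
h, n = (3.13, 0.0, 0.0), (4.14, 0.0, 0.0)
print(round(hbond_energy(c, o, n, h), 2), is_hbond(c, o, n, h))
```

With this idealized geometry the energy comes out near -3 kcal/mol, well below the cutoff. Moving the N-H group 10 Å further away brings the energy close to zero, so no bond is assigned.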
Based on this, eight types of secondary structure are assigned. The 3₁₀ helix, α helix and π helix have symbols G, H and I and are recognized by having a repetitive sequence of hydrogen bonds in which the residues are three, four, or five residues apart respectively. Two types of beta sheet structures exist; a beta bridge has symbol B while longer sets of hydrogen bonds and beta bulges have symbol E. T is used for turns, featuring hydrogen bonds typical of helices, S is used for regions of high curvature, and a blank is used if no other rule applies, referring to loops. These eight types are usually grouped into three larger classes: helix, strand and loop.
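The grouping into three classes can be illustrated with a simple lookup table. The mapping below (G, H, I to helix; B, E to strand; everything else to loop) is one commonly used convention and is an assumption here, since other reductions exist; the names `REDUCTION` and `reduce_to_three` are hypothetical.

```python
# Assumed eight-to-three reduction: helix "H", strand "E", loop "C".
REDUCTION = {
    "G": "H",  # 3-10 helix
    "H": "H",  # alpha helix
    "I": "H",  # pi helix
    "B": "E",  # isolated beta bridge
    "E": "E",  # extended strand
    "T": "C",  # turn
    "S": "C",  # bend (high curvature)
    " ": "C",  # blank: no other rule applies
}


def reduce_to_three(states):
    """Map a string of eight-state symbols to helix/strand/loop symbols."""
    return "".join(REDUCTION[s] for s in states)


print(reduce_to_three("HHHHTT EEB GGG S"))
```

A symbol outside the eight listed states raises a `KeyError`, which makes unexpected input visible rather than silently treating it as loop.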

π helices

In the original DSSP algorithm, residues were preferentially assigned to α helices, rather than π helices. In 2011, it was shown that DSSP failed to annotate many "cryptic" π helices, which are commonly flanked by α helices. In 2012, DSSP was rewritten so that the assignment of π helices was given preference over α helices, resulting in better detection of π helices. Versions of DSSP from 2.1.0 onwards therefore produce slightly different output from older versions.
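The effect of the changed preference can be sketched as a priority lookup. This is a simplified illustration under stated assumptions, not DSSP's actual code: it supposes that each residue has a set of candidate states it qualifies for, and that the assignment picks the first candidate in a fixed priority order. Both orderings and the function name `resolve` are illustrative assumptions; only the relative order of H and I reflects the change described above.

```python
# Illustrative priority orders (assumptions). The only difference that
# matters here is whether alpha helix (H) or pi helix (I) comes first.
OLD_PRIORITY = "HBEGITS"  # alpha helix preferred over pi helix
NEW_PRIORITY = "IHBEGTS"  # pi helix preferred over alpha helix


def resolve(candidates, priority):
    """Return the highest-priority state among the candidates, or a blank."""
    for state in priority:
        if state in candidates:
            return state
    return " "


# A residue whose hydrogen bonds are compatible with both helix types.
ambiguous = {"H", "I"}
print(resolve(ambiguous, OLD_PRIORITY), resolve(ambiguous, NEW_PRIORITY))
```

Under the old ordering such a residue is labelled H, so a short π helix embedded in an α helix disappears from the output; under the new ordering it is labelled I.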

Variants

In 2002, a continuous DSSP assignment was developed by introducing multiple hydrogen bond thresholds. The resulting assignment was found to correlate with protein motion.