Description of the DSSP program

by Wolfgang Kabsch and Chris Sander

Function

Definition of secondary structure of proteins given a set of 3D coordinates.

Availability

Executables of the DSSP program are available from http://www.embl-heidelberg.de/dssp/ or from ftp.embl-heidelberg.de (192.54.41.33). Free academic use. For an academic source code license or for a commercial license go to http://www.embl-heidelberg.de/dssp/ or write to Chris Sander / email: sander@embl-heidelberg.de

Description

The DSSP program defines secondary structure, geometrical features and solvent exposure of proteins, given atomic coordinates in Protein Data Bank format. The program does NOT PREDICT protein structure. According to the Science Citation Index (July 1995), the program has been cited in the scientific literature more than 1000 times.

Authors of the DSSP method

Wolfgang Kabsch and Chris Sander, MPI MF, Heidelberg, 1983.
Reference: Kabsch,W. and Sander,C. (1983) Biopolymers 22, 2577-2637

Usage and command line options

  dssp [-na] [-c] [-v] pdb_file [dssp_file]
  dssp [-na] [-c] [-v] -- [dssp_file]
  dssp [-h] [-?] [-l] [-V]

Command line options:

-na
Disables the calculation of accessible surface.
-c
Classic (pre-July 1995) format.
-v
Verbose.
--
Read from standard input.
-h -?
Prints a help message.
-l
Prints the license information.
-V
Prints version, as in first line of the output.
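
When dssp is driven from a script, the usage lines above can be assembled programmatically. The following is a minimal Python sketch, not part of the DSSP distribution: the function names build_dssp_command and run_dssp are hypothetical, and it assumes an executable named dssp is on the search path.

```python
import subprocess


def build_dssp_command(pdb_file, dssp_file=None, no_accessibility=False, verbose=False):
    """Assemble an argument list that follows the usage synopsis (hypothetical helper)."""
    cmd = ["dssp"]
    if no_accessibility:
        cmd.append("-na")  # disable the accessible-surface calculation
    if verbose:
        cmd.append("-v")   # report progress
    cmd.append(pdb_file)   # pass "--" here to read from standard input
    if dssp_file is not None:
        cmd.append(dssp_file)
    return cmd


def run_dssp(pdb_file, dssp_file, **options):
    """Run dssp; assumes an executable called 'dssp' is available on PATH."""
    return subprocess.run(build_dssp_command(pdb_file, dssp_file, **options), check=True)


print(build_dssp_command("1prc.pdb", "1prc.dssp", verbose=True))
```

Keeping command construction separate from execution makes it possible to inspect the argument list without having the program installed.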

Examples

In this example verbose mode was turned on to see the progress of execution for the large photoreaction center (1prc) input file.

    unix% dssp -v 1prc.pdb 1prc.dssp
     !!! Backbone incomplete for residue ALA  333 C
        residue will be ignored !!!
     
     !!! Residue SER  273 L has  3 instead of expected   2 sidechain atoms.
        last sidechain atom name is  OXT
        calculated solvent accessibility includes extra atoms !!!
     
     !!! Residue LYS  323 M has  6 instead of expected   5 sidechain atoms.
        last sidechain atom name is  OXT
        calculated solvent accessibility includes extra atoms !!!
     
     !!! Residue LEU  258 H has  5 instead of expected   4 sidechain atoms.
        last sidechain atom name is  OXT
        calculated solvent accessibility includes extra atoms !!!
     
     !!! Polypeptide chain interrupted !!!
    Inputcoordinates done        1189
    Flagssbonds done
    Flagchirality done
    Flaghydrogenbonds done
    Flagbridge done
    Flagturn done
    Flagaccess done
    Printout done

Output file is 1prc.dssp


In this example the coordinates of avian pancreatic polypeptide (1ppt) were first converted from star format to pdb format and then piped into dssp.

    unix% star2pdb 1ppt.star | dssp -- > 1ppt.dssp
     !!! Residue TYR   36   has  9 instead of expected   8 sidechain atoms.
        last sidechain atom name is  OXT
        calculated solvent accessibility includes extra atoms !!!
    

Output file is 1ppt.dssp


Output

The output from DSSP on file myprotein.dssp contains secondary structure assignments and other information, one line per residue. Extract from 1est.dssp (simplified):

HEADER    HYDROLASE   (SERINE PROTEINASE)         17-MAY-76   1EST                
...
  240  1  4  4  0 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS, 
                  NUMBER OF SS-BRIDGES(TOTAL,INTRACHAIN,INTERCHAIN)                .
 10891.0   ACCESSIBLE SURFACE OF PROTEIN (ANGSTROM**2)    
  162 67.5   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(J)  ; PER 100 RESIDUES 
    0  0.0   TOTAL NUMBER OF HYDROGEN BONDS IN     PARALLEL BRIDGES; PER 100 RESIDUES 
   84 35.0   TOTAL NUMBER OF HYDROGEN BONDS IN ANTIPARALLEL BRIDGES; PER 100 RESIDUES 
...
   26 10.8   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+2)
   30 12.5   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+3)
   10  4.2   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+4)
... 
  #  RESIDUE AA STRUCTURE BP1 BP2  ACC   N-H-->O  O-->H-N  N-H-->O  O-->H-N    
    2   17   V  B 3   +A  182   0A   8  180,-2.5 180,-1.9   1,-0.2 134,-0.1  

                                    TCO  KAPPA ALPHA  PHI   PSI    X-CA   Y-CA   Z-CA 
                                  -0.776 360.0   8.1 -84.5 125.5  -14.7   34.4   34.8

....;....1....;....2....;....3....;....4....;....5....;....6....;....7..
    .-- sequential resnumber, including chain breaks as extra residues
    |    .-- original PDB resname, not nec. sequential, may contain letters
    |    |   .-- amino acid sequence in one letter code
    |    |   |  .-- secondary structure summary based on columns 19-38
    |    |   |  | xxxxxxxxxxxxxxxxxxxx recommended columns for secstruc details
    |    |   |  | .-- 3-turns/helix  
    |    |   |  | |.-- 4-turns/helix  
    |    |   |  | ||.-- 5-turns/helix  
    |    |   |  | |||.-- geometrical bend
    |    |   |  | ||||.-- chirality
    |    |   |  | |||||.-- beta bridge label 
    |    |   |  | ||||||.-- beta bridge label 
    |    |   |  | |||||||   .-- beta bridge partner resnum
    |    |   |  | |||||||   |   .-- beta bridge partner resnum
    |    |   |  | |||||||   |   |.-- beta sheet label 
    |    |   |  | |||||||   |   ||   .-- solvent accessibility
    |    |   |  | |||||||   |   ||   |
  #  RESIDUE AA STRUCTURE BP1 BP2  ACC
    |    |   |  | |||||||   |   ||   |
   35   47   I  E     +     0   0    2
   36   48   R  E >  S- K   0  39C  97 
   37   49   Q  T 3  S+     0   0   86    (example from 1EST)
   38   50   N  T 3  S+     0   0   34   
   39   51   W  E <   -KL  36  98C   6 

Line length of output is 13x characters. Lines end in a number or a period.
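
Because the per-residue lines are fixed-width, they can be read by column position. The following Python sketch slices the first 38 columns of a residue line. The column offsets are an assumption read off the 1EST example lines above rather than taken from a format specification, the names ResidueRecord and parse_residue_line are hypothetical, and chain-break lines are not handled.

```python
from typing import NamedTuple


class ResidueRecord(NamedTuple):
    seq_number: int       # DSSP sequential residue number
    pdb_resnum: str       # original PDB residue number
    insertion_code: str
    chain: str
    amino_acid: str
    summary: str          # secondary structure summary
    details: str          # columns 19-25: turns, bend, chirality, bridge labels
    bp1: int              # first bridge partner (0 if none)
    bp2: int              # second bridge partner (0 if none)
    sheet: str
    acc: int              # solvent accessibility


def parse_residue_line(line: str) -> ResidueRecord:
    """Slice a regular residue line; offsets are inferred from the examples above."""
    line = line.ljust(38)
    return ResidueRecord(
        seq_number=int(line[0:5]),
        pdb_resnum=line[5:10].strip(),
        insertion_code=line[10].strip(),
        chain=line[11].strip(),
        amino_acid=line[13],
        summary=line[16],
        details=line[18:25],
        bp1=int(line[25:29]),
        bp2=int(line[29:33]),
        sheet=line[33].strip(),
        acc=int(line[34:38]),
    )


SAMPLE_36 = "   36   48   R  E >  S- K   0  39C  97 "
SAMPLE_39 = "   39   51   W  E <   -KL  36  98C   6 "
print(parse_residue_line(SAMPLE_36))
print(parse_residue_line(SAMPLE_39))
```

Slicing by position rather than splitting on whitespace matters here, since blank fields (for example an empty chain identifier or bridge label) would otherwise shift the remaining values.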

Histograms:
the number 2 under column '8' in line 'residues per alpha helix' means: there are 2 alpha helices of length 8 residues in this data set.
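
As an illustration of how such a histogram can be read, the following Python sketch counts runs of 'H' in a string of secondary structure summary codes. The function name helix_length_histogram and the input string are made up for the example, and DSSP's own counting may differ, for instance where two helices follow each other without a gap.

```python
from collections import Counter
from itertools import groupby


def helix_length_histogram(summary: str, code: str = "H") -> Counter:
    """Count runs of `code` by length in a string of summary codes (hypothetical helper)."""
    return Counter(
        len(list(run)) for symbol, run in groupby(summary) if symbol == code
    )


# Invented summary string: two helices of length 8 and one of length 4.
example = "  HHHHHHHH  EEEE  HHHHHHHH TT HHHH"
histogram = helix_length_histogram(example)
print(histogram[8])  # number of helices that are 8 residues long
```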

For definitions, see the Biopolymers article cited above.

In addition, note that each line contains the following residue information:

# RESIDUE

Two columns of residue numbers. The first column is DSSP's sequential residue number, starting at the first residue actually in the data set and including chain breaks; this number is used to refer to residues throughout. The second column gives the crystallographers' 'residue sequence number', 'insertion code' and 'chain identifier' (see the Protein Data Bank file record format manual), given for reference only.

AA

One letter amino acid code, lower case for SS-bridge CYS.
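
When extracting the sequence from the AA column, the lower-case convention has to be undone. A small Python sketch, assuming that any lower-case letter in this column stands for a bridged cysteine (the function name is hypothetical):

```python
def normalize_amino_acid(code: str) -> str:
    """Return a plain one-letter code; lower case is taken to mean SS-bridged CYS."""
    return "C" if code.islower() else code


print("".join(normalize_amino_acid(c) for c in "VaGTEc"))
```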

The values for solvent exposure may not mean what you think!

SECONDARY STRUCTURE

Compromise summary of secondary structure, intended to approximate crystallographers' intuition, based on columns 19-38, which are the principal result of DSSP analysis of the atomic coordinates.

First STRUCTURE column:
  H = alpha-helix
  B = beta-bridge residue
  E = extended strand (in beta ladder)
  G = 3/10-helix
  I = 5-helix
  T = H-bonded turn
  S = bend

Second, third and fourth STRUCTURE columns (3-turns, 4-turns and 5-turns):
  > = backbone CO of this residue makes H bond (i, i+n)
  < = backbone NH of this residue makes H bond (i-n, i)
  X = both CO and NH make H bond
  3, 4, or 5 = number of residues bracketed by H bond

Fifth STRUCTURE column:
  S = five-residue bend at residue i

Sixth STRUCTURE column:
  Sign of dihedral angle of CA, i-1 to i+2
  + = as in a right handed alpha-helix
  - = as in an ideal twisted beta-strand

Seventh and eighth STRUCTURE columns:
  One-character name of beta-ladders in which residue i participates
  UPPER CASE = antiparallel
  LOWER CASE = parallel
  Ladders are named sequentially from N- to C-terminus.

A beta-strand can be part of two ladders, one to each side, so there are two columns for the possible ladder partners. Each ladder name appears twice, once for each participating strand. Partner strands can thus be easily identified by identical letters. The sheet topology can be reconstructed by starting from a beta-strand and tracing all partners and their partners.
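
The first step of that tracing, collecting the residues that share a ladder name, can be sketched in Python as follows. The function name residues_by_ladder and the input layout (sequential number plus the two ladder label characters) are assumptions made for the example; the two input tuples are taken from residues 36 and 39 of the 1EST extract above.

```python
from collections import defaultdict


def residues_by_ladder(residues):
    """Group DSSP sequential residue numbers by ladder name.

    `residues` is an iterable of (seq_number, label1, label2), where the labels
    are the two beta bridge label characters (a blank means no ladder).
    """
    ladders = defaultdict(list)
    for seq_number, label1, label2 in residues:
        for label in (label1, label2):
            if label.strip():
                ladders[label].append(seq_number)
    return dict(ladders)


def is_parallel(label: str) -> bool:
    """Lower-case ladder names denote parallel ladders, upper case antiparallel."""
    return label.islower()


ladders = residues_by_ladder([(36, " ", "K"), (39, "K", "L")])
print(ladders)
print(is_parallel("K"))
```

Strands that appear under the same key are partners; repeating the lookup for every ladder a strand belongs to walks through the sheet.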

Ninth and tenth STRUCTURE columns

BP1 BP2

Residue number of first and second bridge partner followed by one letter sheet label

ACC

Number of water molecules in contact with this residue, multiplied by 10, or the water-exposed surface of the residue in Angstrom**2.

N-H-->O etc.

Hydrogen bonds; e.g. -3,-1.4 means: if this residue is residue I, then the N-H of I is H-bonded to the C=O of I-3 with an electrostatic H-bond energy of -1.4 kcal/mol. There are two columns for each type of H-bond, to allow for bifurcated H-bonds.
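
To make the offset notation concrete, the following Python sketch splits one H-bond field into its offset and energy and resolves the partner's sequential number. The function names are hypothetical. The second call uses the first N-H-->O field of the 1EST extract above (sequential residue 2, field 180,-2.5).

```python
def parse_hbond(field: str):
    """Split a field such as '-3,-1.4' into (offset, energy in kcal/mol)."""
    offset, energy = field.split(",")
    return int(offset), float(energy)


def hbond_partner(seq_number: int, field: str):
    """Return (partner sequential number, energy) for the residue at seq_number."""
    offset, energy = parse_hbond(field)
    return seq_number + offset, energy


print(parse_hbond("-3,-1.4"))
print(hbond_partner(2, " 180,-2.5"))
```

The offsets are relative to DSSP's sequential residue number, not to the original PDB numbering.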

TCO

Cosine of angle between C=O of residue I and C=O of residue I-1. For alpha-helices, TCO is near +1, for beta-sheets TCO is near -1. Not used for structure definition.
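
A minimal Python sketch of this quantity, assuming each C=O vector points from the carbonyl carbon to the oxygen and that coordinates are given as (x, y, z) tuples. The function name tco and the coordinates in the example are invented for illustration.

```python
import math


def tco(c_prev, o_prev, c_curr, o_curr):
    """Cosine of the angle between C=O of residue I-1 and C=O of residue I."""
    v_prev = [o - c for c, o in zip(c_prev, o_prev)]
    v_curr = [o - c for c, o in zip(c_curr, o_curr)]
    dot = sum(a * b for a, b in zip(v_prev, v_curr))
    norm = math.sqrt(sum(a * a for a in v_prev)) * math.sqrt(sum(b * b for b in v_curr))
    return dot / norm


# Two carbonyl groups pointing the same way, then opposite ways (made-up coordinates).
same = tco((0, 0, 0), (0, 0, 1.23), (3.8, 0, 0), (3.8, 0, 1.23))
opposite = tco((0, 0, 0), (0, 0, 1.23), (3.8, 0, 0), (3.8, 0, -1.23))
print(round(same, 3), round(opposite, 3))
```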

KAPPA

Virtual bond angle (bend angle) defined by the three C-alpha atoms of residues I-2,I,I+2. Used to define bend (structure code 'S').
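
The following Python sketch computes such an angle under one assumed convention: the angle between the direction CA(I-2) to CA(I) and the direction CA(I) to CA(I+2), so that a straight chain gives 0 degrees. This convention and the function name kappa are assumptions, not taken from the text above; residues near a chain end, where I-2 or I+2 does not exist, are not handled.

```python
import math


def kappa(ca_prev2, ca_curr, ca_next2):
    """Bend angle in degrees at residue I from CA(I-2), CA(I), CA(I+2)."""
    u = [b - a for a, b in zip(ca_prev2, ca_curr)]
    v = [b - a for a, b in zip(ca_curr, ca_next2)]
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


straight = kappa((0, 0, 0), (1, 0, 0), (2, 0, 0))
right_angle = kappa((0, 0, 0), (1, 0, 0), (1, 1, 0))
print(straight, right_angle)
```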

ALPHA

Virtual torsion angle (dihedral angle) defined by the four C-alpha atoms of residues I-1, I, I+1, I+2. Used to define chirality (structure code '+' or '-').
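
A dihedral angle over four points, and the sign derived from it, can be sketched in Python as below. This uses a standard atan2-based dihedral formula with the usual sign convention, not code from DSSP. The names dihedral and chirality are hypothetical, and the treatment of exactly 0 degrees is a guess.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, e.g. CA(I-1)..CA(I+2)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def chirality(alpha: float) -> str:
    """'+' for a positive angle, '-' otherwise (boundary handling is assumed)."""
    return "+" if alpha > 0 else "-"


alpha_pos = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
alpha_neg = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1))
print(chirality(alpha_pos), chirality(alpha_neg))
```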

PHI PSI

IUPAC peptide backbone torsion angles

X-CA Y-CA Z-CA

Echo of C-alpha atom coordinates.

Warnings

Unknown or unusual residues are named X on output and are not checked for standard number of sidechain atoms. All explicit water molecules, like other hetatoms, are ignored.

Input file

Coordinate file in PDB format.


Last modified at Yale: Tuesday, 16-Apr-1996 17:47:32 EDT