CSB Home | Search | Table of Contents | General Information

Instructions for using APROPOS

To get started in the CSB core, type:

setup apropos (or source /srv/local/setup/apropos.set)

In the working directory there should now be links to APROPOS, delcx and mkalf (created by the setup command) and at least two files: a parameter file (e.g. Parameter) and a file containing a list of the PDB files to be processed (e.g. Pdblist). Edit the files Parameter and Pdblist as described below.

Briefly, the most important parameters in Parameter which have to be set are:
       PDBLISTE - name of the file with the list of molecules
       OUTPUT - name of the output file
For all other parameters the default values can be used.
For the Pdblist file, the syntax is the PDB filename (with path) followed by the chains, e.g.
        ../example/pdb1mee.ent A
which says that we want to examine only the molecule pdb1mee.ent in the directory ../example, and from this molecule only the atoms in chain A.
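
As a minimal sketch, the two input files could be generated with a short Python script. The one-keyword-per-line layout (KEYWORD value) is an assumption inferred from the way the defaults are written in the parameter file description (e.g. "OUTPUT Output"), and the keyword PDBLISTE is spelled as in that description. The function name and its arguments are hypothetical and not part of APROPOS.

```python
from pathlib import Path


def write_inputs(workdir, pdb_entries, output_name="pdb1mee.out",
                 pdblist_name="Pdblist", parameter_name="Parameter"):
    """Write a Pdblist and a minimal Parameter file into workdir.

    pdb_entries is a list of (pdb_path, chain_description) tuples;
    an empty chain description means "read all atoms".
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    # One molecule per line: path, then the optional chain description.
    pdblist_lines = [f"{path} {chains}".rstrip() for path, chains in pdb_entries]
    (workdir / pdblist_name).write_text("\n".join(pdblist_lines) + "\n")

    # ASSUMPTION: the parameter file holds one "KEYWORD value" pair per line.
    parameter_lines = [f"PDBLISTE {pdblist_name}", f"OUTPUT {output_name}"]
    (workdir / parameter_name).write_text("\n".join(parameter_lines) + "\n")

    return workdir / parameter_name, workdir / pdblist_name
```

All parameters that are not written keep their default values, so only the two required entries appear in the generated file.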

The program is called by typing:

./apropos parameter_file


Output files:

  • pdb1mee.out ----- output of APROPOS; filename is given by OUTPUT; contains all detected pockets
  • pdb1mee.A ----- list of coordinates of the selected atoms and their atomic radii (in our example: coordinates of the atoms in chain A without hydrogen atoms); input file for delcx, mkalf and alvis. The atomic radius acts as a weight factor for the alpha-shapes.
  • pdb1mee.A.dt ---- Delaunay triangulation of the set of points given by pdb1mee.A; generated by calling delcx pdb1mee.A via APROPOS
  • pdb1mee.A.alf --- family of alpha-shapes for the set of points given by pdb1mee.A; generated by calling mkalf pdb1mee.A via APROPOS. By fixing the parameter alpha one can select a particular shape. Generating the Delaunay triangulation and the alpha-shape family is very time consuming.
  • pdb1mee.A.info -- information file for delcx and mkalf
  • delcx.output ---- log file of delcx
  • mkalf.output ---- log file of mkalf
  • pdb1mee.A.a*.fl - face lists of the set of points given by pdb1mee.A, i.e. the description of the alpha-shapes for certain values of the parameter alpha as lists of triangles. These face lists are generated by mkalf. For example, pdb1mee.A.a4.0.fl is the alpha-shape of pdb1mee.A for alpha=4.0A. This family of face lists is the main object APROPOS works with. To save time, APROPOS first tests whether the face lists for the desired values of alpha (given by ALPHA_1, ALPHA_FAMILY, ALPHA_STEP and ALPHA_COMPARE) exist. If not, APROPOS checks whether the corresponding Delaunay triangulation (*.dt) and alpha-shape family (*.alf) exist. Only if these files do not exist either does APROPOS call delcx and mkalf.
  • pdb1mee.A.sub --- output of the pockets as subsets readable for insightII
  • apropos.log ----- log of APROPOS.
    Format of the output

    The first three lines contain some of the parameters APROPOS worked with. Then for each molecule the following are given:

    name of the input file
    the HEADER, COMPND, SOURCE and AUTHOR lines from the PDB file
    the hetero atoms in the form: name, center and number of atoms forming the hetero atom
    the chain and number of amino acids considered for this molecule
    a description of the distribution of the depths of the surface atoms
    and the detected pockets, including the number of atoms forming each pocket, the coordinates of the center of the pocket and three values describing the form of the pocket.

    The atoms forming a pocket are given as a list with

    the name of the atom
    its x, y and z coordinates
    the mean depth
    and the part of the pocket the atom belongs to

    (APROPOS first finds only the atoms of part 0. One can think of these as the deepest parts of the pockets. To find the sides and the upper parts of a pocket, one has to determine which atoms on the surface of the molecule lie next to the deepest atoms (part 1), which atoms lie next to these atoms in turn (part 2), and so on. How far this extension goes is determined by POCKET. For POCKET=2 one can think of the atoms of part 1 as the sides of the pocket and the atoms of part 2 as its upper boundary.)
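
The layer-by-layer extension described above can be pictured as a breadth-first expansion. The following is only an illustrative sketch, not the APROPOS implementation: how APROPOS decides that two surface atoms "lie next to" each other is not specified here, so the neighbor table is simply passed in as an assumed input, and the function name is hypothetical.

```python
def assign_parts(neighbors, deepest, pocket):
    """Label atoms with a part number by breadth-first expansion.

    neighbors: dict mapping an atom id to the ids of adjacent surface atoms
               (ASSUMPTION: the adjacency rule used by APROPOS is not known).
    deepest:   iterable of atom ids forming part 0.
    pocket:    the POCKET value, i.e. the highest part number to assign.
    """
    parts = {atom: 0 for atom in deepest}
    frontier = list(parts)
    for level in range(1, pocket + 1):
        next_frontier = []
        for atom in frontier:
            for other in neighbors.get(atom, ()):
                if other not in parts:
                    # First time this atom is reached: it belongs to this layer.
                    parts[other] = level
                    next_frontier.append(other)
        frontier = next_frontier
    return parts
```

With POCKET=2 the expansion stops after two layers, so atoms further away from the deepest atoms are not assigned to the pocket at all.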

    Note: To convert the above output file to a PDB file with all pocket atoms labelled HETATM, use the C-shell script apropos2pdb. Usage is:

    apropos2pdb apropos_output_file

    You will get a file called "pockets.pdb"


    Parameter File Format

    OUTPUT (char [15]) name of the output file, should be set.

    default: OUTPUT Output

    PDBLISTE (char [15]) list of molecules.

    default: PDBLISTE Pdblist

    ALPHA_COMPARE (float) alpha-shape for the description of the global form of the molecule

    default: ALPHA_COMPARE 20.0

    ALPHA_1 (float), ALPHA_FAMILY (int), ALPHA_STEP (float) describe the family of alpha-shapes which will be compared with the global form; the family is given by the shapes with

    alpha = ALPHA_1 - ALPHA_FAMILY*ALPHA_STEP,
    alpha = ALPHA_1 - (ALPHA_FAMILY-1)*ALPHA_STEP, ...,
    alpha = ALPHA_1 + ALPHA_FAMILY*ALPHA_STEP

    default: ALPHA_1 4.0, ALPHA_FAMILY 5, ALPHA_STEP 0.1
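
As a worked example, the alpha values of the family can be listed with a few lines of Python. This sketch assumes the family runs from ALPHA_1 - ALPHA_FAMILY*ALPHA_STEP to ALPHA_1 + ALPHA_FAMILY*ALPHA_STEP in steps of ALPHA_STEP, which gives 2*ALPHA_FAMILY+1 shapes; the function name is hypothetical.

```python
def alpha_family(alpha_1=4.0, alpha_family_size=5, alpha_step=0.1):
    """Return the alpha values of the family, smallest first.

    ASSUMPTION: the values are ALPHA_1 + k*ALPHA_STEP for
    k = -ALPHA_FAMILY, ..., +ALPHA_FAMILY.
    """
    return [
        # Rounding removes floating-point noise such as 3.5999999999999996.
        round(alpha_1 + k * alpha_step, 6)
        for k in range(-alpha_family_size, alpha_family_size + 1)
    ]


if __name__ == "__main__":
    print(alpha_family())
```

With the default settings this yields eleven values, from 3.5 to 4.5.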

    SELECT_MAX (float) determines the level for selecting the deepest atoms (part 0 of the pockets): atom deepness > SELECT_MAX * (maximal deepness of all surface atoms)

    default: SELECT_MAX 0.7
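
The selection rule can be illustrated with a small sketch. It assumes the criterion reads "atom deepness > SELECT_MAX * (maximal deepness of all surface atoms)"; the atom labels and depth values in the usage example are made up for illustration and the function name is hypothetical.

```python
def select_deepest(depths, select_max=0.7):
    """Return the ids of atoms whose depth exceeds select_max * (maximal depth).

    depths maps an atom id to its depth; the ids are illustrative only.
    """
    if not depths:
        return []
    threshold = select_max * max(depths.values())
    return [atom for atom, depth in depths.items() if depth > threshold]


if __name__ == "__main__":
    example = {"CA_10": 2.0, "CB_11": 7.5, "N_12": 10.0, "O_13": 6.0}
    print(select_deepest(example))
```

Because the threshold is relative to the deepest surface atom, a value of SELECT_MAX at or above 1 selects no atoms.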

    SELECT_MIN (float) determines the level for selecting the highest atoms: atom deepness < SELECT_MIN. With this parameter you can try to find something like "hills" on the molecule surface. It works only if SELECT_MAX is set to a value >= 1.

    default: SELECT_MIN 0.5

    CUT_OFF (float) cut_off for the clustering of the selected atoms

    default: CUT_OFF 11.5

    POCKET (int) determines how far APROPOS extends the initial set of atoms (part 0 of the pocket)

    default: POCKET 2

    ALLCLUSTERS (int) switch for the form of the output of the pockets

    ALLCLUSTERS 1: output of all pockets,
    ALLCLUSTERS 0: output of only those pockets with >= (2*ALPHA_FAMILY+1)*(POCKET+1) atoms overall

    default: ALLCLUSTERS 1
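
To make the size limit concrete, the following sketch evaluates it for the default settings. It reads the limit as (2*ALPHA_FAMILY+1)*(POCKET+1), i.e. with the multiplication signs restored; the pocket ids and sizes in the usage example are invented and the function names are hypothetical.

```python
def min_pocket_atoms(alpha_family=5, pocket=2):
    """Minimum total atom count a pocket needs when ALLCLUSTERS is 0."""
    return (2 * alpha_family + 1) * (pocket + 1)


def pockets_to_report(pocket_sizes, allclusters=1, alpha_family=5, pocket=2):
    """Return the ids of the pockets that would be written to the output.

    pocket_sizes maps a (hypothetical) pocket id to its total atom count.
    """
    if allclusters == 1:
        return list(pocket_sizes)
    threshold = min_pocket_atoms(alpha_family, pocket)
    return [pid for pid, size in pocket_sizes.items() if size >= threshold]


if __name__ == "__main__":
    print(min_pocket_atoms())
    print(pockets_to_report({1: 40, 2: 12, 3: 33}, allclusters=0))
```

For ALPHA_FAMILY 5 and POCKET 2 the limit is 11 * 3 = 33 atoms.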

    ALLHETEROS (int) switch for the output of the hetero atoms

    ALLHETEROS 1: output of all hetero atoms,
    ALLHETEROS 0: output of only those hetero atoms with more than 5 atoms,

    default: ALLHETEROS 1

    ATOMOUTPUT (int) switch for the form of the output of the atoms of the pockets

    ATOMOUTPUT 1: output of the atoms forming the pocket
    ATOMOUTPUT 0: output of only the names of the amino acids the atoms belong to

    default: ATOMOUTPUT 1

    PARAMETER (int) switch for the output of the most important parameters.

    PARAMETER 1: output of values for PDBLISTE, ALPHA_1, ALPHA_FAMILY, SELECT_MAX, SELECT_MIN, CUT_OFF, POCKET,
    PARAMETER 0: no parameter output

    default: PARAMETER 1

    ONLYCNN (int) switch for the generation of only a list of coordinates

    ONLYCNN 1: only list of coordinates
    ONLYCNN 0: normal work,

    default: ONLYCNN 0

    VERSION (int) determines which version of alpha-shapes APROPOS uses

    VERSION < 3: unweighted alpha-shapes (delcx is called detri),
    VERSION >= 3: weighted alpha-shapes (taking into account the atomic radii),

    default: VERSION 3

    DELETE (int) determines which of the generated files should be removed after calculation:

    a=1/0: the list of coordinates yes/no
    b=1/0: Delaunay triangulation and alpha-shape family yes/no
    c=1/0: face lists (.fl) yes/no,
    DELETE = 4c + 2b + a

    default: DELETE 3 (delete list of coordinates, Delaunay triangulation and alpha-shape family)
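
Since DELETE = 4c + 2b + a packs three yes/no switches into one integer, it can be decoded and encoded as a bit mask. The sketch below only illustrates that arithmetic; the function and dictionary key names are hypothetical.

```python
def decode_delete(value):
    """Split a DELETE value into its three switches (a, b, c)."""
    if not 0 <= value <= 7:
        raise ValueError("DELETE must be between 0 and 7")
    return {
        "coordinates": bool(value & 1),                      # a
        "triangulation_and_alpha_family": bool(value & 2),   # b
        "face_lists": bool(value & 4),                       # c
    }


def encode_delete(a, b, c):
    """Combine the three switches into DELETE = 4c + 2b + a."""
    return 4 * int(bool(c)) + 2 * int(bool(b)) + int(bool(a))


if __name__ == "__main__":
    print(decode_delete(3))
    print(encode_delete(a=True, b=True, c=True))
```

The default value 3 therefore removes the coordinate list, the Delaunay triangulation and the alpha-shape family but keeps the face lists, which lets later runs reuse them.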

    SURFACE (float) output of the atoms belonging to the specified alpha-shape as a PDB file on standard output

    SURFACE 0.0: no output

    default: SURFACE 0.0

    Pdblist File Format

    The list of molecules has the form

    path_to_the_molecule description_of_the_chains_to_read

    First of all, APROPOS never reads the atoms belonging to hetero atoms (indicated by HETATM in the PDB file), and APROPOS only reads atoms with the element types C, N, S, O, A, Q (never hydrogen atoms).

    Then the description of the chains to read can be

    empty, i.e. APROPOS reads all atoms
    or
    consist of one or more chain indicators separated by commas.

    Each chain indicator can be followed by

    -NUMBER or +NUMBER.

    If -NUMBER follows, APROPOS reads all atoms of amino acids with number <= NUMBER; if +NUMBER follows, APROPOS reads all atoms of amino acids with number >= NUMBER. If only the chain indicator is written, APROPOS reads all atoms of this chain.

    Example:

    /PDB/pdb1ace.ent     read all atoms 
    
    /PDB/pdb7aat.ent     A read only chain A
    
    /PDB/pdb1lts.ent     D,E,F,G,H read chains D,E,F,G and H 
    
    /PDB/pdb1mda.ent     H+25,L,A read from chain H all amino acids with
                         number >= 25 and all atoms from chains L and A
    
    /PDB/pdb1ncc.ent     L-107,H-112,N read from chain L all amino acids
                         with number <= 107, from chain H all amino acids
                         with number <= 112 and all atoms of chain N.
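
The chain description syntax shown in the examples above can be parsed as follows. This is a hedged sketch, not the parser used by APROPOS: it assumes each chain indicator is a single character (as PDB chain identifiers are), and the function name and tuple layout are hypothetical.

```python
def parse_chains(description):
    """Parse a chain description such as 'H+25,L,A'.

    Returns a list of (chain, operator, number) tuples:
      operator '+'  -> amino acids with number >= number
      operator '-'  -> amino acids with number <= number
      operator None -> the whole chain
    An empty description returns [], meaning "read all atoms".
    """
    selectors = []
    for item in description.split(","):
        item = item.strip()
        if not item:
            continue
        # ASSUMPTION: the chain indicator is exactly one character.
        chain, rest = item[0], item[1:]
        if not rest:
            selectors.append((chain, None, None))
        elif rest[0] in "+-" and rest[1:].isdigit():
            selectors.append((chain, rest[0], int(rest[1:])))
        else:
            raise ValueError(f"cannot parse chain indicator: {item!r}")
    return selectors


if __name__ == "__main__":
    print(parse_chains("L-107,H-112,N"))
```

Returning an empty list for an empty description keeps the "read all atoms" case distinct from any explicit chain selection.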
    

    Possible errors

    specified list of molecules (PDBLISTE) does not exist
    ----> check file names

    molecules in the list cannot be found
    ----> PDB code in the subset file (first three letters in the line succeeding the line "@line subset ...") does not match the PDB code insightII found for the molecule

    ----> change the PDB code in the subset file

    insightII cannot create subsets
    ----> subset file contains empty subsets

    ----> delete these empty subsets

    totally wrong detected pockets
    ----> used face list and considered set of points do not match

    ----> delete face list (and triangulation *.dt and alpha-shape family *.alf) and calculate everything once again.


    Center for Structural Biology (www.csb.yale.edu), Yale University (www.yale.edu)
    Contact: webadmin(at)mail^csb^yale^edu
    Last Modified: Friday, 26-Mar-1999 14:59:25 EST by P. Fleming