Uppsala Software Factory

Uppsala Software Factory - MOLEMAN2 Manual


1 MOLEMAN2 - GENERAL INFORMATION

Program : MOLEMAN2
Version : 030221
Author : Gerard J. Kleywegt, Dept. of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Box 596, SE-751 24 Uppsala, SWEDEN
E-mail : gerard@xray.bmc.uu.se
Purpose : manipulation and analysis of PDB files
Package : X-UTIL


2 REFERENCES

Reference(s) for this program:

* 1 * G.J. Kleywegt (1995). Dictionaries for Heteros. CCP4/ESF-EACBM Newsletter on Protein Crystallography 31, June 1995, pp. 45-50. [http://xray.bmc.uu.se/usf/factory_5.html]

* 2 * G.J. Kleywegt (1996). Making the most of your search model. CCP4/ESF-EACBM Newsletter on Protein Crystallography 32, June 1996, pp. 32-36. [http://xray.bmc.uu.se/usf/factory_6.html]

* 3 * G.J. Kleywegt & T.A. Jones (1996). Phi/Psi-chology: Ramachandran revisited. Structure 4, 1395-1400. [http://www4.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=8994966&form=6&db=m&Dopt=r]

* 4 * G.J. Kleywegt & T.A. Jones (1997). Model-building and refinement practice. Methods in Enzymology 277, 208-230. [http://xray.bmc.uu.se/gerard/gmrp/gmrp.html]

* 5 * G.J. Kleywegt (1997). Validation of protein models from CA coordinates alone. J Mol Biol 273, 371-376. [http://www4.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=9344745&form=6&db=m&Dopt=r]

* 6 * G.J. Kleywegt (1999). Experimental assessment of differences between related protein crystal structures. Acta Cryst. D55, 1878-1884. [http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=10531486&form=6&db=m&Dopt=b] [http://journals.iucr.org/d/issues/1999/11/00/se0283]

* 7 * G.J. Kleywegt (2000). Validation of protein crystal structures. Acta Cryst. D56, 249-265 (Topical Review). [http://journals.iucr.org/d/issues/2000/03/00/gr0949]

* 8 * G.J. Kleywegt (2001). Validation of protein crystal structures. In: "International Tables for Crystallography, Volume F. Crystallography of Biological Macromolecules" (Rossmann, M.G. & Arnold, E., Editors). Chapter 21.1, pp. 497-506, 526-528. Dordrecht: Kluwer Academic Publishers, The Netherlands.

* 9 * Kleywegt, G.J., Zou, J.Y., Kjeldgaard, M. & Jones, T.A. (2001). Around O. In: "International Tables for Crystallography, Vol. F. Crystallography of Biological Macromolecules" (Rossmann, M.G. & Arnold, E., Editors). Chapter 17.1, pp. 353-356, 366-367. Dordrecht: Kluwer Academic Publishers, The Netherlands.


3 VERSION HISTORY

951107 - 0.1 - initial programming
951108 - 0.2 - more (REad and APpend commands)
960216 - 0.3 - more (some BF and OC commands; implemented main/side; SElections; first documentation)
960217 - 0.4 - WRite command; PDb CRystal/HEtero; PRotein MC_analysis
960222 - 0.5 - PRotein SC_analysis
960223 - 0.6 - PRotein CA_analysis; COnstant; BF BOnded; BF SMooth; STatistics; more SElect ANd/OR options; some XYz commands
960226 - 0.7 - more XYz commands; BF/OC PRod_plus; first CHain commands
960227 - 0.8 - SPlit command; more CHain commands; more PDb commands
960301 - 0.9 - SElect NUmeric; LIst_selected; ONo RSr/FIt/COnnect/TOrsion
960303 - 0.10 - implemented macro facility; split PDb REmark command into three separate ones (LIst, DElete, REmark); changed some parameters from optional to required to make them useful for macros (e.g., XYz ROtate/TRanslate); ONo DIsulfide and ONo WAter_fit
960304 - 0.11 - ONo OOps; DIstance PLot, DIstribution, SHort, SElect; GEometry_selected; first useful version for Uppsala
960311 - 0.12 - PDb NAme/NUmber; ONo XPlor_hydrogens; SQuence LIst/PIr; SQuence GLyco_sites/MOtif
960405 - 0.13 - minor bug fixes; MUlti_geom
960408 - 0.14 - XYz CEntre_origin and XYz ALign_inertia_axes; first general release
960410 - 0.15 - add END card at end of PDB file (oops ...)
960411 - 0.16 - debug generate.inp file for X-PLOR with SPlit command; debug write CCP4 format (include cryst1 etc. cards)
960412 - 0.17 - implemented LS_plane and ONo LS_plane_odl; SQuence COunt, EXtinction_280; added optional parameter to the "?" command which may be the name of any command that has sub-commands; DELETE_molecule command; AUto SPink, BOnes
960414 - 0.18 - AUto SSe; lot of debugging in the core AUto subroutine to fix instabilities -> the new algorithm appears to be stable (and no longer has a random component ;-)
960415 - 0.19 - minor bug fixes
960416 - 0.20 - minor bug fixes
960513 - 0.21 - correct naming of OT1/OT2 with SPlit command; includes automatic generation of OT2 if necessary !
960517 - 0.22 - implemented simple SYMBOL mechanism (& command)
960520 - 0.23 - check for atoms which have X~Y~Z when reading a PDB file (and in the BOok_keep command); new PDb command CHemical+charge to add the symbol of the chemical element and the charge to columns 77-80 of the ATOM and HETATM records
960521 - 1.0 - added optional "use_masses" parameter to the XYz CEntre_origin command; this version stable and useful enough for government work
960629 - 1.0.1 - several small bug fixes made while at Yale
960801 - 1.0.2 - added code to PRotein CA_analysis command to look for sequential stretches of poor residues; this also seems to detect some register errors !
960802 - 1.0.3 - vastly improved ONo LS_plane command
960804 - 1.0.4 - make border around atoms a parameter for ONo LS_plane
960805 - 1.1 - MUlti_geometry now correctly averages dihedrals, i.e., using RTODEG*ATAN2(AVESIN,AVECOS)
961101 - 1.1.1 - change X-PLOR "generate" input files from SPlit command so as to delete hydrogens and atoms with unknown coordinates
970124 - 1.1.2 - minor bug fix (SE ? would crash on Alphas)
970211 - 1.1.3 - the SQ PIr command to create a PIR file now only writes one-letter code for residues for which at least one atom has been selected (so you can easily avoid getting hundreds of '?' residues for your waters etc.)
970626 - 1.2 - support initialisation macro (setenv GKMOLEMAN2 macrofile)
970701 - 1.2.1 - check for weird B-factors and occupancies while reading a new PDB file
970714 - 1.2.2 - implemented X-PLOR polars and X-PLOR/Lattmann Euler angles in XYz ROtate
970723 - 2.0 - implemented VRML commands
970724 - 2.1 - added VRml CEll command; SElect NUmeric can now also select on atomic Mass, Covalent bond radius and chemical Element number; added SElect BUtnot, SElect BY_residue and SElect Dist_to_sel; SElect NUmeric can now have AND, OR or BUTNOT; VRml FAt_trace
970729 - 2.1.1 - fixed bug in calculation of radius-of-gyration
970807 - 2.1.2 - allow "?" wildcards in atom names in library file (e.g., some people call their water oxygen " O ", others " O1 ", " OHH", etc.; use " O??" in the library to capture all of these; similarly for metal ions)
970924 - 2.1.3 - new PDb NO_atom_numbers command to remove O-style atom numbers (indicating chemical element type)
980420 - 2.1.4 - fixed bug in SElect DIst command (wrong parameters were passed to the subroutine)
981009 - 2.1.5 - improved macro generated by ONo OOps_macro command
981014 - 2.1.6 - correct ONo COnnect and TOrsion datablocks even if hydrogen atoms are present
981021 - 2.1.7 - new ECho command to echo command-line input (useful in scripts)
981022 - 2.2 - implemented command history (# command)
981216 - 2.2.1 - added some comments to output PostScript files
990223 - 2.2.2 - doubled max nr of atoms and residues; removed "on_off" commands from O macros generated by MOLEMAN2
990301 - 2.2.3 - echo some PDB header lines when reading a PDB file
990504 - 2.3 - ANISOU cards are now read and written - the SElect ANd, OR and BUtnot commands can now also be used with the attributes ALtloc (alternative location identifier, e.g. A, B, X, " ", etc.) and ANisou (can be either T(rue) or F(alse)) - new BFactor NO_anisou command to delete all ANISOUs - up to 20 least-squares planes (ONo LS_plane command) can be stored, and their mutual angles calculated with the new ONo ANgle_ls_planes command
990823 - 2.3.1 - the STats, BFactor STats and OCcupancy STats commands now also list the RMS values and the harmonic averages of the B-factors and occupancies
990924 - 2.4 - new VRml CRamp_selection, SPhere, CYlinder, and LIquorice commands; debugged some of the VRML-generating routines
990930 - 2.4.1 - in Ramachandran plots (PRot MC), D-amino acids are now treated explicitly (their -phi and -psi are used to assess if they're outliers or not; in the PostScript plot they will be shown as red diamonds)
991029 - 2.4.2 - fixed bug in XYZ PErturb command (shift for B and Q used to be equal to the shift for Z, no matter what ...)
991130 - 2.4.3 - increased max. number of atoms to 500,000 and max. number of residues to 50,000
991213 - 2.5 - implemented YASSPA routine (invoked during book-keeping), the results can be used in SElect NUmeric, SElect ANd, SElect OR, and SElect BUtnot (e.g., to colour helices and strands differently in a VRML world)
991221 - 2.6 - several bug fixes to get it to work properly with Linux/g77
000310 - 2.6.1 - minor bug fix for CHain AUto command (used to coredump on Alphas)
000313 - 2.6.2 - minor bug fix in REad command (Linux version choked on some CRYST1 cards)
000526 - 2.7 - new ONo RIngs command (requires new version of moleman2.lib library file !) to generate ODL files to draw the rings as semi-transparent solid planes (e.g., Tyr, Trp, etc.)
000529 - 2.7.1 - minor change
001113 - 2.7.2 - PDb FArout command to generate quick-n-dirty HELIX and SHEET records for use with FarOut
001117 - 2.8 - new NUcleic DUarte_pyle command to make "Ramachandran-like" plots to help analyse the conformation of RNA and DNA
001130 - 2.8.1 - new VRml CLose_file command
001229 - 2.8.2 - minor changes
010725 - 2.8.3 - new ONo CEll command
010727 - 2.8.4 - minor changes
010803 - X - added a number of pictures to this manual to illustrate the results of some commands
010816 - 2.8.5 - minor changes
010905 - 2.8.6 - DIstance LIst command (e.g., to find atoms that are too close to one another)
011023 - 2.8.7 - minor changes
011114 - 2.9 - PRot MC, PRot CA and Nucleic DUarte commands now delete the PostScript file they normally produce if it contains no residues; all these three commands now also have an extra (optional) argument to decide whether the analysis should be carried out for all chains in one go, for one chain in particular, or for each chain in turn (in the latter case, you will get one PostScript file for each chain, so this makes it easy to inspect Ramachandran plots on a chain-by-chain basis)
011115 - 2.9.1 - PRot MC, PRot CA and Nucleic DUarte option with "_" as a chain argument now also work if a chain has a blank chain ID
011123 - 2.9.2 - minor bug fix
011220 - 2.9.3 - minor change
020129 - 2.9.4 - minor changes
020221 - 2.9.5 - the STatistics command now also prints the range of X, Y, and Z coordinates (together with the XYz ALign command, this enables you to determine the dimensions of your molecule)
020514 - 2.9.6 - new ONo MOlray command to generate a trace pseudo-molecule if you want to "fly" along a protein chain in a MolRay movie
020516 - 2.9.7 - more options for ONo MOlray command
020611 - 2.9.8 - new OCcupancy PLot command
020628 - 3.0 - new DIstance CHains command to quickly find contacts between two different chains (e.g., two monomers in a dimer, or protein vs. DNA, or protein vs. ligand)
020718 - 3.0.1 - new XYz AXis_rotate command to rotate around the X, Y or Z axis by a user-specified angle
020729 - 3.0.2 - optional 'how' parameter for the ONo DIsulfide command (default = S, draw as sticks; alternative L, draw as lines, i.e. like normal bonds) - suggested by Marko Hyvonen. Also optional ODL object name (in case you want more than one SS-bond object)
020819 - 3.0.3 - renamed the VRml LIst_colours command to VRml NAmed_colours (to avoid a clash with the VRml LIquorice command ... thanks to Kevin Battaile for pointing this out)
020827 - 3.0.4 - ODL objects produced by some of the ONo commands now have default names that begin with an underscore to prevent any clashes with POV-Ray terms (such as "plane")
020927 - 3.0.5 - new ONo INertia_axes_odl command to generate an ODL file to draw the axes of inertia for the selected set of atoms (following a question from Michael Merckel)
021023 - 3.0.6 - improved macro produced by ONo OOps command
021030 - 3.0.7 - (while in Toronto) further improved macro produced by ONo OOps command
021111 - 3.1 - minor bug fixes; new PDb SAnity_check command
021114 - 3.1.1 - fix: now uses insert code to delineate different residues (in addition to residue number, type, chain name, and segment ID)
021121 - 3.1.2 - further improved macro produced by ONo OOps command
030221 - 3.2 - new BFactor PSeudo, BF SAve, BF REstore, BF SCale, and BF ODb commands to replace B-values by other properties, e.g. for colour-ramping in O or Rasmol. With BF PSeudo you can also calculate B-values predicted using Halle's method
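
The dihedral averaging mentioned under version 1.1, RTODEG*ATAN2(AVESIN,AVECOS), can be illustrated with a small sketch. It is written in Python rather than the program's Fortran, and the function name mean_dihedral is made up for this illustration. It only shows why averaging the sines and cosines behaves better than averaging the angles themselves.

```python
import math


def mean_dihedral(angles_deg):
    """Average dihedral angles (degrees) via their mean sine and cosine."""
    n = len(angles_deg)
    avesin = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    avecos = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    # atan2 returns radians in (-pi, pi]; convert back to degrees
    return math.degrees(math.atan2(avesin, avecos))


# Two angles on either side of the +/-180 degree boundary:
angles = [170.0, -170.0]
naive = sum(angles) / len(angles)      # plain arithmetic mean, lands at 0
circular = mean_dihedral(angles)       # lands at the +/-180 boundary
print(naive, circular)
```

The arithmetic mean of 170 and -170 is 0, which points in the opposite direction from both angles, whereas the sine/cosine average ends up at the 180-degree boundary between them.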


4 START-UP MACRO

From version 1.2 on, MOLEMAN2 can execute a macro at start-up (whether it is run interactively or in batch mode). This can be used to execute commands which you (almost) always want to have executed. To use this feature, set the environment variable GKMOLEMAN2 to point to a MOLEMAN2 macro file, e.g.:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 setenv GKMOLEMAN2 /home/gerard/moleman2.init
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5 INTRODUCTION

MOLEMAN2 is a new version of the old MOLEMAN program. It can be used for all sorts of manipulation and analysis of PDB files. It is intended primarily for practicing crystallographers who need to do hundreds of little things to their PDB files when switching between different programs, etc. Users of O, CCP4 and X-PLOR will benefit most from the functionality of this program.

The user-interface is different from that of the old MOLEMAN, and more similar to that of MAPMAN, DATAMAN, etc. That is, instead of a question-and-answer game, you can supply any or all parameters for a command on one line. For example, to read a PDB file, "re file.pdb" is enough. The other two parameters of the REad command (format and option to read hydrogens) will be set to their default values. For all commands, the first two characters are unique, so "re" is the same as "read", etc. Optional parameters are enclosed in [square brackets] in the list of commands.
NOTE: the DELETE_molecule command is an exception and requires the first six characters to be typed (i.e., the word "delete"); this is to reduce the risk of accidental deletion of your molecule !

An important difference with the old MOLEMAN is the fact that you can select subsets of atoms which will be used by many commands. In this fashion, you can use the same command ("bfactor stats") to get statistics about all atoms, all protein atoms, all main-chain atoms in segment XYZ1, etc. This makes the program much more flexible and easier to maintain (since no special-purpose options are necessary for different possible subsets of atoms).

Another important difference is that a library file is used which contains information about residues, such as their constituent main-chain and side-chain atoms, their type (protein, metal, carbohydrate, etc.), aliases (e.g., waters may be called WAT, HOH, H2O, etc.), and so on. You can use residue types in your selections so that it is very easy to get B-factor statistics for all non-hydrogen carbohydrate atoms with segment id CRB1, for instance.

MOLEMAN2 also allows you to write and execute macros for series of commands that you execute often (e.g., when going from a new X-PLOR model to a PDB file suitable for O and CCP4).

Many commands have a built-in mini-help facility which explains what the parameters to the command are or what values they may have. For instance, if you type "write ?" the program will explain what the parameters are:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Syntax: WRite file [format] [which]
 file   = PDB file name
 format = Pdb | Xplor | Ccp4
 which  = ALl | NO_hydro | SElected | PAla |
          PGly | PSer | CAlpha
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Parameters in [square brackets] are optional and will default to the first value listed (e.g., in this example, the default is to write ALl atoms to a Pdb-formatted file).

For parameters which can have several values, the UPPERCASE characters show how many characters define a unique value. In the example above, you can enter P, X or C for the format, but for the "which" parameter you must supply (at least) *two* characters.
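
The abbreviation rule can be sketched as follows. This is an illustration in Python, not the program's actual parser: the function names are invented, and it assumes that input is case-insensitive (as "re" for REad suggests) and that the leading run of uppercase letters is the minimum you have to type.

```python
def required_prefix(option):
    """Leading run of uppercase letters, e.g. 'PAla' -> 'PA'."""
    n = 0
    while n < len(option) and option[n].isupper():
        n += 1
    return option[:n]


def match_option(typed, options):
    """Return the single option that `typed` abbreviates, or None."""
    typed_upper = typed.upper()
    hits = [
        opt for opt in options
        if len(typed_upper) >= len(required_prefix(opt))
        and opt.upper().startswith(typed_upper)
    ]
    return hits[0] if len(hits) == 1 else None


# Values taken from the "write ?" example above
FORMATS = ["Pdb", "Xplor", "Ccp4"]
WHICH = ["ALl", "NO_hydro", "SElected", "PAla", "PGly", "PSer", "CAlpha"]

print(match_option("p", FORMATS))   # one character is enough for the format
print(match_option("p", WHICH))     # too short for the "which" parameter
print(match_option("pa", WHICH))    # two characters pick out PAla
```

The same rule accounts for the DELETE_molecule exception: its uppercase part is six characters long, so "de" is not accepted but "delete" is.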

You will be prompted to supply values for all parameters that you do not type on the command line, except those for which default values exist. Usually, the values suggested by the program make sense (if not, let me know).

If you need to provide a text parameter which contains spaces (such as the spacegroup symbol for the PDB CRYST1 record), enclose the whole string in "double quotes". Otherwise, one or more blanks and/or tabs are used to delimit parameter values.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pd cr 84 84 111.8 90 90 90 8 "P 21 21 2"
 Unit-cell axes (A)      : (  84.000   84.000  111.800)
 Unit-cell angles (deg)  : (  90.000   90.000   90.000)
 Unit-cell volume (A3)   : (  7.889E+05)
 Nr of molecules in cell : (       8)
 Spacegroup symbol       : (P 21 21 2)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
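
To see how the quoted spacegroup symbol stays together as one parameter, and where the unit-cell volume in the output comes from, here is a small Python sketch. It uses the standard shlex module, which follows POSIX shell quoting rules; these may differ in detail from MOLEMAN2's own parsing, so treat the tokenising as an approximation. The volume is calculated with the general formula for a triclinic cell, and the function name cell_volume is made up for the illustration.

```python
import math
import shlex


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume (A3) from axes (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                 + 2.0 * ca * cb * cg)


line = 'pd cr 84 84 111.8 90 90 90 8 "P 21 21 2"'
tokens = shlex.split(line)          # blanks delimit, double quotes group

cell = [float(t) for t in tokens[2:8]]
spacegroup = tokens[9]              # the quoted string is a single token
volume = cell_volume(*cell)

print(tokens)
print(spacegroup)
print(format(volume, ".3E"))
```

For this orthogonal cell the formula reduces to the product of the three axes.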
   

The dimensioning of the program (e.g., the maximum number of atoms, etc.) is shown at startup. If you need a bigger version, let me know.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Array dimensioning:

 1) Library:
    Max nr of residue types       : (        200)
    Max nr of atom types          : (       2000)
    Max nr of residue aliases     : (        100)
    Nr of defined residue classes : (        100)

 2) Molecule:
    Max nr of atoms               : (     100000)
    Max nr of residues            : (      10000)
    Max nr of REMARK records      : (       1000)
    Max nr of other records       : (       1000)

 3) Program:
    Max buffer size               : (     524288)
    Max nr of atoms per residue   : (        100)
    Max nr of residue torsions    : (        100)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


6 MACROS

In MOLEMAN2 you can also use macros (as in MAMA). A macro is a small text file containing MOLEMAN2 commands (but usually few parameters; these are left to the user to enter on demand) and comments. A simple macro to convert an X-PLOR PDB file into one suitable for O and CCP4 may look as follows:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
! xplor_to_ccp4.momac - gj kleywegt @ 960303
!
! MOLEMAN2 macro to go from an X-PLOR PDB file with segment ids
! and hydrogen atoms, to a ccp4/o file with chain names and
! no hydrogens etc.
!
! Enter X-PLOR PDB file name:
read
!
! Some information about the molecule(s)
statistics
!
! Generate chain names from segment IDs
chain from_segid auto
!
! Enter cell constants etc.:
pdb crystal
!
! Enter * to delete all X-PLOR remarks:
pdb delete_remark
!
! Enter a descriptive remark about the file:
pdb remark
!
! Enter CCP4/O PDB file name:
write
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

When executed (with the @ command), this will give the following:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > @xplor_to_ccp4.momac
 ... Opened macro file : (xplor_to_ccp4.momac)
 ... On unit : (      61)
 > (!)
 > (! xplor_to_ccp4.momac - gj kleywegt @ 960303)
 > (!)
 > (! MOLEMAN2 macro to go from an X-PLOR PDB file with segment ids)
 > (! and hydrogen atoms, to a ccp4/o file with chain names and)
 > (! no hydrogens etc.)
 > (!)
 > (! Enter X-PLOR PDB file name:)
 > (read)
 PDB file ? (m1.pdb) hydro.pdb
 Reading from file : (hydro.pdb)
 ...
 > (! Some information about the molecule(s))
 > (statistics)
 Nr of atoms    : (       5794)
 Nr of residues : (        802)
 ...
 > (! Generate chain names from segment IDs)
 > (chain from_segid auto)

 RESIDUE ALA 86 AAAA New chain name : (A)
 RESIDUE NAG 501 BBBB New chain name : (B)
 ...
 > (! Enter cell constants etc.:)
 > (pdb crystal)
 A axis (A) ? ( 1.00) 49.1
 B axis (A) ? ( 1.00) 75.8
 C axis (A) ? ( 1.00) 92.9
 Alpha angle (deg) ? ( 90.00)
 Beta angle (deg) ? ( 90.00) 103.2
 Gamma angle (deg) ? ( 90.00)
 Nr of molecules in cell ? ( 1) 4
 Spacegroup symbol ? (P 1) P 21
 ...
 > (! Enter * to delete all X-PLOR remarks:)
 > (pdb delete_remark)
 Which ? (-1) *
 Delete all REMARK records
 > (!)
 > (! Enter a descriptive remark about the file:)
 > (pdb remark)
 Text ? (???) Model M3 @ 960303 R=0.231 Rfree=0.273
 Add REMARK record : (Model M3 @ 960303 R=0.231 Rfree=0.273)
 1: REMARK Model M3 @ 960303 R=0.231 Rfree=0.273
 > (!)
 > (! Enter CCP4/O PDB file name:)
 > (write)
 PDB file ? (hydro.pdb) m3.pdb
 Output PDB file : (m3.pdb)
 Format : (Pdb)
 Atoms : (ALl)
 ...
 ... End of macro file
 ... Control returned to terminal
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

Generally useful macros will be made available via the public domain OMAC directory (/nfs/public/omac in Uppsala; pub/gerard/omac for downloading from other sites).

Note that macros may be nested (the level depends on how many files may be open at the same time on your particular type of machine). For instance, a macro which converts a new X-PLOR model into a PDB for O/CCP4, and does some quality analysis, and lists B-factor statistics may look as simple as this (assuming your directory contains a soft link called "omac" to your local OMAC directory):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
!
! new_model.momac - gj kleywegt @ 960303
!
! MOLEMAN2 macro to generate an O/CCP4 PDB file from an X-PLOR PDB
! file; analyse main-chain, side-chain and CA-geometry, and list
! B-factor statistics
!
@omac/xplor_to_ccp4.momac
@omac/prot_qual.momac
@omac/bfac_stats.momac
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
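
The effect of nesting can be pictured with a short Python sketch that flattens a macro into the sequence of commands that would be run. This is not how MOLEMAN2 itself reads macros; the function name expand_macro, the in-memory dictionary of files, the file names a.momac and b.momac, and the max_depth limit (standing in for the limit on simultaneously open files) are all assumptions made for the illustration.

```python
def expand_macro(name, files, depth=0, max_depth=8):
    """Return the commands in macro `name`, following nested @ lines."""
    if depth > max_depth:
        raise RecursionError("macros nested too deeply: " + name)
    commands = []
    for line in files[name].splitlines():
        text = line.strip()
        if not text or text.startswith("!"):
            continue                      # blank line or comment
        if text.startswith("@"):
            nested = text[1:].strip()     # name of the macro to execute
            commands.extend(expand_macro(nested, files, depth + 1, max_depth))
        else:
            commands.append(text)
    return commands


# Hypothetical macro files, held in a dictionary instead of on disk
files = {
    "new_model.momac": "! top-level macro\n@omac/a.momac\n@omac/b.momac\n",
    "omac/a.momac": "! first nested macro\nread\nstatistics\n",
    "omac/b.momac": "! second nested macro\nbfactor stats\n",
}

print(expand_macro("new_model.momac", files))
```

A macro that (directly or indirectly) calls itself would never finish, which is why the sketch stops at a fixed depth.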
   


7 MOLEMAN2 VERSUS MOLEMAN

A number of features and options from the old MOLEMAN have been dropped (but many more have been improved ;-). If you need any of these, use the old MOLEMAN program. Dropped features include:

- Balasubramanian plots
- HPGL and O2D files for Ramachandran plots
- occupancy plots (if someone really needs them, let me know)
- averaging temperature factors over different chains
- BAD files (i.e., internal coordinates, Bond-distances, Angles, Dihedrals)
- flag-colour datablocks (e.g., to colour your molecule according to the Dutch or Swedish flag)

Note that MOLEMAN2 versions below 1.0 are still development versions, so not all the functionality may have been implemented yet !

The following list shows the old MOLEMAN commands and their counterparts in the new MOLEMAN2 (up-to-date at 960304):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 READ_pdb_file             = REad
 NO_H_read                 = REad
 ALWYn_format_read         = REad
 APPEnd_pdb_file           = APpend
 WRITe_pdb_file            = WRite
 DUMP_pdb_file             = WRite
 SPLIt_pdb_file            = SPlit
 EXPOrt_bad_file           = NOT SUPPORTED
 IMPOrt_bad_file           = NOT SUPPORTED
 SAME_export               = NOT SUPPORTED
 HELIx_generate            = AUto SPink/BOnes/SSe
 STRAnd_generate           = AUto SPink/BOnes/SSe
 QUIT                      = QUit

 REMArk_etc_cards          = PDb REmark/LIst_remark/DElete_remark
 CRYStal_PDB_card          = PDb CRystal
 SSBOnd_records            = PDb SSbond
 PIR_sequence_file         = SQuence PIr
 GLYCo_sites               = SQuence GLyco_sites
 EXTInction_280            = SQuence EXtinction_280
 TALLy_residues            = SQuence COunt
 COUNt_elements            = not implemented yet
 WATEr_sort                = not implemented yet

 RSR_datablock             = ONo RSr
 CONNect_file              = ONo COnnect
 TORSion_datablock         = ONo TOrs
 RSFIt_datablock           = ONo FIt
 DISUlfide_ODL_file        = ONo DIsulfide
 FIT_water_macro           = ONo WAter_fit
 FLAG_colours              = NOT SUPPORTED

 STATistics                = STatistics
 PLOT_Bs_or_Qs             = BFactor PLot
 RADIal_B_plot             = BFactor PLot
 RAMAchandran_plot         = PRotein MC_analysis
 PLANar_peptides           = PRotein MC_analysis
 BALAsubramanian_plot      = NOT SUPPORTED
 CHI_list                  = PRotein SC_analysis
 CA_Ramachandran_plot      = PRotein CA_analysis
 CACA_distances            = PRotein CA_analysis
 CA_Distance_plot          = DIstance PLot
 LIST_residue              = LIst_selected
 GEOMetry_list             = GEometry_selected
 SEQUence_list             = LIst_selected/SQuence LIst
 BURIed_charges            = not implemented yet
 DISTance_distribution     = DIstance DIstribution
 SHORt_contacts            = DIstance SHort

 LIMIt_B_and_Q             = BFactor/OCcupancy LImit
 AVERage_temp_factors      = BFactor GRoup
 TEMP_factors_set          = BFactor LImit/PRod_plus
 OCCUpancies_set           = OCcupancy LImit/PRod_plus
 SMOOth_Bs                 = BFactor SMooth
 B_Q_statistics            = BFactor/OCcupancy STats
 BONDed_Bs                 = BFactor BOnded
 NONBonded_Bs              = BFactor BOnded

 O2XHydrogens              = ONo XPlor_hydrogens
 SUGGest_OT2               = CHain OT2_suggest
 RENUmber_atoms            = NOT SUPPORTED
 ALTEr_residue_name        = PDb NAme
 RESIdu_renumber           = PDb NUmber
 ZONE_renumber             = PDb NUmber
 CHECk_nomenclature        = PRotein SC_analysis
 CORRect_nomenclature      = not implemented yet
 CHAIn_name                = CHain NAme_selection/REname
 XPLOr_ids                 = CHain NAme_selection/SEgid_rename
 FROM_chain_to_XID         = CHain TO_segid
 XID_to_chain              = CHain FRom_segid
 AUTO_chain_segid          = CHain AUto
 ASK_auto_chain_segid      = CHain ASk

 FRACtional_to_cartesian   = XYz ORthogonalise
 CARTesian_to_fractional   = XYz FRactionalise
 ROTAte_molecule           = XYz ROtate
 TRANslate_molecule        = XYz TRanslate
 APPLy_random_rotation     = XYz RAndom_rotation
 RANDom_shifts             = XYz PErturb
 ORIGin_move               = XYz CEntre_origin
 MIRRor_zone               = XYz MIrror
 INVErt_zone               = XYz INvert
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

On the whole, MOLEMAN2 provides a superset of the functionality of the old MOLEMAN, with many additional benefits:
- removed bugs
- more use of (sensible) defaults to minimise typing (compare the new WRite command to the one in the old MOLEMAN)
- use of atom selections to operate on only a subset of the atoms
- combined commands (e.g., the PRotein MC_analysis combines several old MOLEMAN commands, plus new functionality)
- extended functionality (e.g., the XYz commands are much more flexible than the combined set of old MOLEMAN commands they replace; also the SPlit command will now auto-generate a GENERATE input file for X-PLOR which usually requires little editing; the PDb HEtero command is new)
- the program is now easier to extend since I have actually thought a bit about my data structures (not too much, of course, I'm still a Fortran relic). The old MOLEMAN was initially written to do just two simple things: add X-PLOR segment IDs and write an END card at the end of PDB files ;-) Since then it has outgrown itself rapidly, making it a major pain to implement or change functionality.


8 MOLEMAN2 AND PDB FILES

Not written yet.


9 SECONDARY STRUCTURE

As of version 2.5, MOLEMAN2 automatically determines the secondary structure of protein residues in its book-keeping stage (e.g., when a molecule is read in). This is done using the YASSPA algorithm and should give identical results to those you get in O (not necessarily for left-handed helices, though). There are five possible "states":

- 0 = loop or turn
- 1 = alpha helix (right-handed)
- 2 = beta strand
- 3 = left-handed alpha helix
- -1 = non-protein residues

The results can be used with the SElect commands, so you can list all left-handed helical residues, colour the helices blue in your VRML world, write out only the residues in beta strands, calculate the average temperature factor for all helical residues, etc.

If you know what you are doing, you can change the cut-off values for the algorithm using the COnstants commands (they are YALCUT, YBECUT and YLHCUT).


10 LIBRARY

The default library file in Uppsala can be found in /nfs/public/lib, with name "moleman2.lib". If you set the environment variable GKLIB in your .cshrc file to /nfs/public/lib, the program will always come up with the correct default name for this file. Outside Uppsala, the library file can be found in directory pub/gerard/xutil in the compressed tar file xutil_etc.dirtar.gz.

The library file must be read on start-up. The program will prompt you for the filename.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 ...
    Max nr of residue torsions    : (        100)

Name of library file ? (/nfs/public/lib/moleman2.lib)

 Reading library ...
 > ( = MOLEMAN2.LIB = VERSION 0.1 = 951106 = GJ KLEYWEGT = 118 ENTRIES =)

 Lines read    : (  785)
 Residue types : (  118)
 Atom types    : ( 1314)
 Aliases       : (   60)

 First and last residue types:

 Residue # 1 = GLY (PROT) = GLYCINE
 Atoms | N | (T) | CA | (T) | C | (T) | O | (T)

 Residue # 118 = TRS (ORGA) = TRIS TRIS(HYDROXYMETHYL)-AMINOMETHANE
 Atoms | O1 | (F) | C2 | (F) | C3 | (F) | C4 | (F) | O5 | (F) | C6 | (F)
 Atoms | O7 | (F) | N8 | (F)

 Check integrity:
 WARNING - name or alias conflict: 55 = MAL and 95 = MAL
 WARNING - name or alias conflict: 60 = GLC and 61 = GLC
 WARNING - name or alias conflict: 61 = GLC and 60 = GLC
 WARNING - name or alias conflict: 63 = MAN and 64 = MAN
 WARNING - name or alias conflict: 64 = MAN and 63 = MAN
 WARNING - name or alias conflict: 95 = MAL and 55 = MAL
 ERROR --- Non-unique residue names/aliases

 Count types:
 Nr of amino acid residue types : ( 22)
 Nr of nucleic acid types       : (  4)
 Nr of water types              : (  1)
 Nr of metal types              : ( 13)
 Nr of inorganic types          : ( 12)
 Nr of carbohydrate types       : ( 18)
 Nr of organic compound types   : ( 48)
 Nr of other compound types     : (  0)

 MOLEMAN2 commands :
 ...
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

As you can see, there are some duplicate names (not necessarily for identical compounds, e.g. alpha- and beta-glucose can both be called GLC). The program will use the first occurrence.
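
What "first occurrence" means can be sketched in a few lines of Python. This is only an illustration of the rule, not the program's own code; the function name build_lookup is invented, and the entry numbers and names are borrowed from the warnings in the example output above.

```python
def build_lookup(entries):
    """Map each name or alias to the first library entry that uses it.

    `entries` is a list of (entry_number, [names and aliases]) tuples
    in library order. Later entries never override earlier ones.
    """
    lookup = {}
    conflicts = []
    for number, names in entries:
        for name in names:
            if name in lookup and lookup[name] != number:
                conflicts.append((name, lookup[name], number))
            else:
                lookup.setdefault(name, number)
    return lookup, conflicts


# Entry numbers and names as they appear in the warnings above
entries = [(55, ["MAL"]), (60, ["GLC"]), (61, ["GLC"]), (95, ["MAL"])]
lookup, conflicts = build_lookup(entries)

print(lookup["GLC"], lookup["MAL"])
print(conflicts)
```

With this rule a residue called GLC in your PDB file is always treated as entry 60, and one called MAL as entry 55.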

The format is as follows (in case you want to edit the file or add new residue type definitions):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 For each residue type:

 RES XYZ description
     FORMAT: (A3,1x,A3,1x,A80)
     - XYZ = 3-letter residue type
     - description = free text
 TYP ABCDefghijkl
     FORMAT: (A3,1x,A4)
     - ABCD = one of the defined categories, e.g. protein
 AKA ABC DEF GHI
     FORMAT: (A3,n(1x,A3)) (multiple cards allowed)
     - 0 or more synonyms for the residue type
 MCH ...
     FORMAT: first card (A3,1X,*) subsequent cards (4X,*)
     - 0 or more atom names which constitute the main chain or backbone
 SCH ...
     FORMAT: first card (A3,1X,*) subsequent cards (4X,*)
     - 0 or more atom names which constitute the side chain
 END
     FORMAT: (A3)
     - signals end of residue definition

 in MCH and SCH lines, use a "-" at the end of a line to signal
 continuation on the next one; start the next line with 5 spaces !

 any line beginning with "!" is a comment which will not be printed

 any line beginning with REM is a comment which will be printed
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

The pre-defined residue types are: PROTein, NUCLeic acid, WATEr, METAl ions, INORganic ions and clusters, CARBohydrates, and ORGAnic ligands, ions, substrates, co-factors etc. Anything else will be classified as HETEro.

An example:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
RES CYS cysteine
TYP protein
AKA CSS CSH CYH CYX
MCH ' N  ' ' CA ' ' C  ' ' O  '
SCH ' CB ' ' SG '
END
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
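
If you want to check a new or edited entry before giving it to the program, a small reader along the following lines may help. It is a Python sketch based only on the format description and the CYS example above, not on the program's Fortran source. The function name parse_library and the dictionary keys are invented. It assumes that atom names are always given in single quotes (names that themselves contain a quote are not handled), and it simply skips REM lines instead of printing them.

```python
import re

# Atom names appear in single quotes in the CYS example, e.g. ' CA '
ATOM_NAME = re.compile(r"'([^']*)'")


def parse_library(text):
    """Return a list of residue definitions (dictionaries) found in text."""
    entries, current, field = [], None, None
    for line in text.splitlines():
        if not line.strip() or line.startswith("!") or line.startswith("REM"):
            continue                      # blank line or comment
        key, rest = line[:3], line[4:]
        if key == "RES":
            name, _, description = rest.partition(" ")
            current = {"name": name, "description": description.strip(),
                       "type": None, "aliases": [],
                       "main_chain": [], "side_chain": []}
            field = None
        elif current is None:
            continue                      # stray card outside a definition
        elif key == "TYP":
            current["type"] = rest[:4].upper()   # only 4 characters count
        elif key == "AKA":
            current["aliases"].extend(rest.split())
        elif key in ("MCH", "SCH"):
            field = "main_chain" if key == "MCH" else "side_chain"
            current[field].extend(ATOM_NAME.findall(rest))
        elif key == "END":
            entries.append(current)
            current, field = None, None
        elif field is not None and line.startswith(" "):
            # continuation of the previous MCH or SCH card
            current[field].extend(ATOM_NAME.findall(line))
    return entries


SAMPLE = """\
! the CYS entry from the example above
RES CYS cysteine
TYP protein
AKA CSS CSH CYH CYX
MCH ' N  ' ' CA ' ' C  ' ' O  '
SCH ' CB ' ' SG '
END
"""

for entry in parse_library(SAMPLE):
    print(entry["name"], entry["type"], entry["aliases"])
    print(entry["main_chain"], entry["side_chain"])
```

The four-character, upper-cased type mirrors the (A3,1x,A4) format of the TYP card, so "protein" is stored as PROT.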
   

As of version 2.7, ring definitions can also be included (used by the ONo RIngs command).

If you want your special residue/ligand to be included in the standard distribution version of the library, or if you find any errors, E-mail me (gerard@xray.bmc.uu.se).

If you have compounds which are not in the library, it usually doesn't matter too much, as long as you realise that:
- the compound will be assigned type HETEro; if it is in actual fact an unusual amino acid you should add it to the library
- all atoms will be flagged as being side-chain atoms; again, if you have unusual amino acids or nucleotides, this may not be what you want
- no check for missing/superfluous atoms can be carried out (BOok_keep command)


11 PLOT FILES

Some commands produce plot files for O2D. On an SGI, you can view these plots interactively, and analyse them. On other machines, you can convert them to PostScript (or CricketGraph) files. Use the script OMAC/o2dps to do the conversion to PostScript automatically (and for lots of files at once, if you like).

Other commands directly produce PostScript files. Use your local viewer (e.g., ghostview or ghostscript) to look at these, and/or print them on a PostScript printer.


12 PROGRAM PARAMETERS

With the COnstants command a number of program parameters can be altered by the user. These include:

- BLIMLO/BLIMHI/QLIMLO/QLIMHI - used by the BFactor/OCcupancy LImit commands
- MXCACA - maximum CA-CA distance for connected residues (used by several commands)
- MXPP - maximum P-P distance for connected nucleic acid residues
- TORTOL - tolerance for certain impropers/dihedrals (e.g., used by the PRotein SC_analysis and ONo TOrsion commands)
- MXBOND - maximum distance for two atoms to be considered bonded (e.g., used by the BFactor SMooth command)
- MXNONB - maximum distance for non-bonded interactions (two atoms are involved in a non-bonded interaction if their distance lies in the range <MXBOND,MXNONB]); used by BFactor BOnded
- MXCYSS - maximum Cys-SG...SG-Cys distance for disulfide links (e.g., used by the SPlit command to generate DISUlfide patches for your X-PLOR GENERATE input file)
- ISEED - special (integer) number used to initialise the random number generator (positive number: initialise with that number, i.e. reproducibly; zero or negative number: initialise with the current value of MCLOCK(), the system clock); used by the XYz RAndom_rotation and PErturb commands
- YALCUT, YBECUT and YLHCUT - parameters for the YASSPA algorithm that assigns the secondary structure of the protein residues

For an up-to-date list, use the COnstants LIst command. To revert to the "factory defaults", use the COnstants REset command.

Note that the values you enter are NOT checked at all. So if you want to set the maximum distance for connected CA atoms to -3.14 A, you may do so.
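
To illustrate how a distance cut-off such as MXCACA might be applied, here is a small Python sketch that splits a list of CA positions into connected stretches. It is an assumption about the general idea, not the program's actual code: the function name connected_fragments is hypothetical, the comparison is taken to be "less than or equal", and the value 4.5 A is the default shown by COnstants LIst in this manual.

```python
import math

MXCACA = 4.5   # default from the "COnstants LIst" example (in A)

def connected_fragments(ca_coords, max_dist=MXCACA):
    """Group consecutive CA atoms into runs whose neighbours lie within max_dist."""
    fragments = []
    for i, xyz in enumerate(ca_coords):
        if fragments and math.dist(ca_coords[i - 1], xyz) <= max_dist:
            fragments[-1].append(i)      # still connected to the previous residue
        else:
            fragments.append([i])        # chain break: start a new fragment
    return fragments

# made-up CA positions: three residues, a gap, then two more residues
cas = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
       (20.0, 0.0, 0.0), (23.8, 0.0, 0.0)]
print(connected_fragments(cas))
print(connected_fragments(cas, max_dist=-3.14))
```

With the unchecked value of -3.14 A no two residues can ever be connected, so every residue ends up in a fragment of its own; this is the kind of side effect to expect when nonsensical values are entered.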


13 GENERAL, I/O AND BOOK-KEEPING COMMANDS


13.1 ? - list commands

Syntax: ? [command]
command = name of any command which has sub-commands (e.g., XYz, ONo, etc.)

Typing a single question mark will provide you with a list of all available commands. If you supply an argument "?", you will only see the general commands. If the argument is the name of a command which has sub-commands (such as SElect, XYz, PRotein, etc.), you will only get the list of those commands.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > ? ?

MOLEMAN2 commands :

 ? [command] (list (sub-)commands)
 ! (comment)
 QUit
 $ shell_command
 @ macro_file
 BOok_keeping
 COnstants REset
 COnstants SEt name value
 COnstants LIst
 STatistics
 GEometry_selected
 LIst_selected [which]
 MUlti_geometry which
 LS_plane

 REad file [format] [hydro]
 WRite file [format] [which]
 APpend file [format] [hydro]
 SPlit file_prefix
 DElete_molecule

 Commands with sub-commands:
 SElect BFactor OCcupancy CHain PDb PRotein XYz ONo DIstance SQuence AUto
 To see sub-commands, use for instance: ? xy
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > ? sele

 SElect All
 SElect NOne
 SElect HYdrogen
 SElect EXhydrogen
 SElect OR what which
 SElect ANd what which
 SElect NEgate
 SElect NUmeric and_or what lo hi
 SElect ?
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


13.2 & - (re-)define a symbol or list current symbols

This command can be used to manipulate symbols. These are probably only useful for advanced users who want to write fancier macros. The command can be used in three ways:
(1) & ? -> lists currently defined symbols
(2) & symbol value -> sets "SYMBOL" to "value"
(3) & symbol -> prompts the user to supply a value for "SYMBOL" (even if the program is executing a macro)

A few symbols are predefined:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > & ?
 Nr of defined symbols : (       2)
 Symbol START_TIME : (Fri May 17 19:55:43 1996)
 Symbol USERNAME : (gerard)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

The symbol mechanism is fairly simplistic and has some limitations:
- max length of a symbol name is 20 characters
- max length of a symbol value is 256 characters
- max number of symbols is 100
- symbols can not be deleted, but they can be redefined
- symbol values are accessed by supplying $SYMBOL_NAME as an argument on the command line; the line that you type on the terminal (or in a macro) is parsed once; if there are additional parameters which the program prompts you for, you cannot use symbols for those
- only one substitution per argument (e.g., "$file1 $file2" will lead to a substitution of the entire argument by the value of symbol FILE1 only !)
- command names (first argument on any command line) cannot be replaced by a symbol (e.g.: "$command $arg1 $arg2" is not valid)
- symbols may be equated to each other, e.g. "& file2 $file1" will give FILE2 the same value as FILE1
- symbol substitution is not recursive (e.g., if you set the value of FILE2 to be "$file1", any reference to $FILE2 will be replaced by "$file1", not by the value of FILE1)
- symbols on comment lines (starting with "!") are not expanded
- symbols on system command lines (starting with "$") are not expanded
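
The substitution rules listed above can be summarised in a short Python sketch. It only mimics the documented behaviour and is not taken from the program: the function name expand_symbols is hypothetical, shlex is used as a stand-in for the program's own command-line parser, symbol names are assumed to be stored in upper case, and an undefined symbol is simply left as typed.

```python
import shlex

def expand_symbols(line, symbols):
    """Return the argument list for one input line after symbol substitution."""
    stripped = line.lstrip()
    if stripped.startswith("!") or stripped.startswith("$"):
        return [line]              # comment and shell lines are not expanded
    args = shlex.split(line)
    expanded = args[:1]            # the command name is never substituted
    for arg in args[1:]:
        words = arg[1:].split() if arg.startswith("$") else []
        if words:
            name = words[0].upper()          # only the first symbol counts
            arg = symbols.get(name, arg)     # whole argument replaced, once
        expanded.append(arg)
    return expanded

symbols = {"CONV": "CE", "ALPHA": "10", "FILE1": "a.pdb", "FILE2": "$file1"}
print(expand_symbols("xyz rotate $conv $alpha", symbols))
print(expand_symbols('re "$file1 $file2"', symbols))   # one substitution only
print(expand_symbols("re $file2", symbols))            # not recursive
```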

Example of the use of symbols:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
! Which rotation angle convention ?
! Enter CE for CCP4 Euler angles, or CP for CCP4 Polar angles:
& conv
!
! Rotation angle 1 ?
& alpha
!
! Rotation angle 2 ?
& beta
!
! Rotation angle 3 ?
& gamma
!
! Applying rotation function solution
xyz rotate $conv $alpha $beta $gamma
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.3 ECho - toggle command-line echo on/off

If you run the program with scripts, it is sometimes useful to see input commands echoed. The parameter to the ECho command may be ON, OFf, or ? (to list the echo status).


13.4 #


Command history. Possible uses (blank spaces are optional):
- # ? => list history of commands
- # ! => ditto, but without numbers (handy for copying into macros)
- # ON => switch command history on
- # OFf => switch command history off
- # # => repeat previous command
- # 14 => repeat command number 14 from the list
- # 0 => repeat previous command
- # -1 => repeat penultimate command, etc.
- # 7 more => repeat command number 7, but add "more" to it (e.g., if command 7 was "$ ls" you could type "#7 -FartCos" to get "$ ls -FartCos")
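
The numbering rules above (positive numbers count from the start of the list, zero and negative numbers count back from the most recent command) can be sketched as follows. The function name recall is made up for this illustration, only the "repeat" forms are handled (not ?, !, ON or OFf), and it is assumed that the # command itself is not stored in the history.

```python
def recall(history, request):
    """Resolve the text typed after '#' into the command line to re-run."""
    parts = request.split(None, 1)
    selector = parts[0] if parts else "#"
    extra = parts[1] if len(parts) > 1 else ""
    if selector == "#":
        number = len(history)          # "# #" repeats the previous command
    else:
        number = int(selector)
        if number <= 0:                # 0 = previous, -1 = penultimate, ...
            number += len(history)
    if not 1 <= number <= len(history):
        raise IndexError("no such command in history")
    command = history[number - 1]
    return f"{command} {extra}" if extra else command

history = ["re 1cbs.pdb", "st", "$ ls"]
print(recall(history, "#"))            # previous command
print(recall(history, "-1"))           # penultimate command
print(recall(history, "3 -FartCos"))   # command 3 with extra text appended
```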


13.5 STatistics - general statistics

This command calculates and prints some statistics for the currently selected set of atoms.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > st
 Nr of atoms    : (       2813)
 Nr of residues : (        396)

 Nr of amino acid residues : (        370)
 Nr of nucleic acids       : (          0)
 Nr of waters              : (         24)
 Nr of metals              : (          0)
 Nr of inorganics          : (          0)
 Nr of carbohydrates       : (          0)
 Nr of organic compounds   : (          0)
 Nr of other compounds     : (          2)

 Nr of selected atoms : (       2813)
 Ditto, hydrogen      : (          0)
 Ditto, ANISOU        : (          0)

 Item       Average   St.Dev      Min      Max      RMS  Harm.ave.
 ----       -------   ------      ---      ---      ---  ---------
 X-coord      0.000   14.956  -30.527   36.812
 Y-coord      0.000   10.308  -24.280   25.566
 Z-coord      0.000    8.683  -20.405   21.471
 B-factor    28.669   28.761    2.000  218.470   40.610      8.843
 Warning - there are B-factors > 100 A**2 !
 Occpncy      1.000    0.000    1.000    1.000    1.000      1.000

 The radius of gyration is 20.1 A

 Range of X, Y, and Z coordinates: 67.3 A * 49.8 A * 41.9 A
 If you have used XYz ALign_inertia_axes, these numbers give you
 an indication of the dimensions of the selected molecule (or set
 of atoms).
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


13.6 BOok_keeping - do some book-keeping checks

This command re-determines where residues start and end, and checks whether there are duplicate, missing or extra atoms in residues that occur in the library, whether there are residues that are not in the library, and whether all residue names are unique.

The program also checks if there are any atoms which have X ~ Y ~ Z; often, this is an indication of unset or unknown coordinates (e.g., new atoms in O will be placed at (1500,1500,1500), sometimes waters end up at (0,0,0) etc.) which may lead to all sorts of trouble later on (e.g., in map extension around the molecule or the generation of NCSRel-cards by XPAND). The tolerance used is 0.01 A.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > bo

 Total nr of residues      : (        239)
 Nr of amino acid residues : (        137)
 Nr of nucleic acids       : (          0)
 Nr of waters              : (        101)
 Nr of metals              : (          0)
 Nr of inorganics          : (          0)
 Nr of carbohydrates       : (          0)
 Nr of organic compounds   : (          1)
 Nr of other compounds     : (          0)

 Checking for missing/extra atoms ...
 ERROR --- Unknown atom in structure
 ATOM  1091  OXT GLU   137      26.188 -39.912  33.302  1.00 22.23      1CBS1313
 ERROR --- Unknown atom in structure
 ATOM  1115  O   HOH   300      15.524 -31.764  26.116  1.00 17.43      1CBS1337
 ERROR --- Missing atom in structure
 RESIDUE  HOH   300  1CBS
 Atom name : (  O1)
 ...
 ERROR --- Missing atom in structure
 RESIDUE  HOH   399  1CBS
 Atom name : (  O1)

 Checking uniqueness of residue names ...
 Non-unique residue names : #  224 = HOH   385  1CBS <-> #  230 = HOH   385  1CBS

 Checking "special" positions (X~Y~Z) ...
 ATOM  1200  O   HOH   385       0.000   0.000   0.000  1.00 32.12      1CBS1422
 ATOM  1200  O   HOH   385    1500.0001500.0001500.000  1.00 32.12      1CBS1422
 ATOM  1210  O   HOH   395     -36.656 -36.656 -36.656  1.00 35.59      1CBS1432
 WARNING - Nr of atoms with X~Y~Z : (          3)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
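
The X ~ Y ~ Z test described above is easy to reproduce outside the program. The sketch below is one possible reading of it (all three coordinates pairwise equal within the 0.01 A tolerance); the function name and the exact form of the comparison are assumptions, and the coordinates are the ones shown in the example output.

```python
TOLERANCE = 0.01   # tolerance in A quoted in the text

def has_suspicious_xyz(x, y, z, tol=TOLERANCE):
    """True when X ~ Y ~ Z, i.e. all three coordinates agree within tol."""
    return abs(x - y) <= tol and abs(y - z) <= tol and abs(x - z) <= tol

atoms = [
    (0.000, 0.000, 0.000),            # water left at the origin
    (1500.000, 1500.000, 1500.000),   # new atom from O
    (-36.656, -36.656, -36.656),
    (26.188, -39.912, 33.302),        # ordinary atom, not flagged
]
for xyz in atoms:
    print(xyz, has_suspicious_xyz(*xyz))
```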


13.7 LIst_selected - list selected atoms or residues

Syntax: LIst_selected [which]
which = Residues | Atoms

You may either list all selected atoms, or the first selected atom (if any) of all residues.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se nu and bf 50 9999
 Select Numeric : ( AND B-factor 50.00000 9999.000)
 Selection history : (NON-HYDROGEN | AND B-factor 50.00000 9999.000 |)
 Nr of selected atoms : (          8)
 MOLEMAN2 > li r
 List first selected atom of every residue
 ATOM   807  CE  LYS   101       4.479  34.527  19.248  1.00 51.31      1CBS1029
 ATOM   819  CD  GLU   103      -0.383  25.503  13.395  1.00 50.23      1CBS1041
 ATOM  1193  O   HOH   378       9.543  16.072  11.145  1.00 50.91      1CBS1415
 ATOM  1194  O   HOH   379       8.174  14.289  20.240  1.00 54.21      1CBS1416
 ATOM  1196  O   HOH   381       5.486  15.385  24.922  1.00 50.19      1CBS1418
 Nr of residues listed : (          5)
 MOLEMAN2 > li a
 List all selected atoms
 ATOM   807  CE  LYS   101       4.479  34.527  19.248  1.00 51.31      1CBS1029
 ATOM   808  NZ  LYS   101       4.917  33.952  20.559  1.00 51.14      1CBS1030
 ATOM   819  CD  GLU   103      -0.383  25.503  13.395  1.00 50.23      1CBS1041
 ATOM   820  OE1 GLU   103      -0.130  26.346  12.499  1.00 53.12      1CBS1042
 ATOM   821  OE2 GLU   103      -1.464  25.500  14.036  1.00 52.16      1CBS1043
 ATOM  1193  O   HOH   378       9.543  16.072  11.145  1.00 50.91      1CBS1415
 ATOM  1194  O   HOH   379       8.174  14.289  20.240  1.00 54.21      1CBS1416
 ATOM  1196  O   HOH   381       5.486  15.385  24.922  1.00 50.19      1CBS1418
 Nr of atoms listed : (          8)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.8 LS_plane - calculate least-squares plane through selected atoms

This command calculates the equation of the least-squares plane through the currently selected atoms (unit weights), and the RMSD of the atoms to the plane.
If you want to display the plane in O, use the ONo LS_plane_odl command instead. Also, if you want to see the distance of the individual atoms to the plane, use the ONo LS_plane command.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se and res rea
 AND atom selection
 With atoms for which : (RES)
 Equals : (REA)
 Selection history : (ALL | AND REsidu = REA |)
 Nr of selected atoms : (         22)
 MOLEMAN2 > ls
 Nr of selected atoms : (         22)
 Centre of Gravity : (  22.065   26.283   20.209)
 Eigen value 1 =        515.0 Vector :   0.145303 -0.597192  0.788828
 Eigen value 2 =         31.0 Vector :   0.847841  0.486099  0.211834
 Eigen value 3 =          6.7 Vector :  -0.509954  0.638020  0.576955
 Determinant : (   1.000)
 Eigenvector #3 defines the least-squares plane
 Equation:  -0.509954 X +   0.638020 Y +   0.576955 Z =  17.176491
 RMSD to plane : (   0.552)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Note that the three eigenvalues tell you something about the shape of the selected set of atoms: in the example above (an all-trans-retinoic acid molecule), it is clearly very long, not very wide, and fairly planar (eigenvalue 1 >> 2 > 3).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 ...
 Selection history : (ALL | AND REsidu = TRP | AND CLass = Side | AND
  Residue_nr 109 109 |)
 Nr of selected atoms : (         10)
 MOLEMAN2 > ls
 Nr of selected atoms : (         10)
 Centre of Gravity : (  13.617   20.280   29.635)
 Eigen value 1 =         28.3 Vector :   0.044093  0.093234  0.994667
 Eigen value 2 =         12.4 Vector :  -0.373900  0.924815 -0.070112
 Eigen value 3 =          0.0 Vector :   0.926420  0.368814 -0.075638
 Determinant : (  -1.000)
 ERROR --- Negative determinant; change hand of inertia axes
 Eigen value 1 =         28.3 Vector :  -0.044093 -0.093234 -0.994667
 Eigen value 2 =         12.4 Vector :   0.373900 -0.924815  0.070112
 Eigen value 3 =          0.0 Vector :  -0.926420 -0.368814  0.075638
 Determinant : (   1.000)
 Eigenvector #3 defines the least-squares plane
 Equation:  -0.926420 X +  -0.368814 Y +   0.075638 Z = -17.853298
 RMSD to plane : (   0.029)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.9 MUlti_geometry - residue geometry statistics for multiple copies

Syntax: MUlti_geometry which
which = residue_type (must be defined in the library)

For all currently selected copies of a user-specified residue type, this command lists statistics (number of observations, average, standard deviation, minimum and maximum value) for the following:
- bonded distances
- bond angles and 1-3 distances
- torsion angles and 1-4 distances

Note that the (torsion) angles used to be averaged in degrees without taking periodicity into account, which meant that dihedrals around +/-180 degrees could give misleading averages and seemingly large ranges ! This has been changed in version 1.1, so that the average of -178 and +178 is now (+ or -) 180, rather than zero.
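
The difference between the two ways of averaging is easy to demonstrate. The sketch below uses the usual "circular mean" (average the sines and cosines, then convert back to an angle); whether the program uses exactly this method is not stated in this manual, so treat it as one common approach rather than as a description of the implementation. The function name is made up.

```python
import math

def circular_mean_deg(angles):
    """Mean of angles in degrees that respects the 360-degree periodicity."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))

angles = [-178.0, 178.0]
print("naive average    :", sum(angles) / len(angles))   # zero
print("periodic average :", circular_mean_deg(angles))   # + or - 180
```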

This option may be useful when creating "ideal geometry" dictionaries for a refinement program when you have several examples of the residue. In addition it can be used to look for large outliers.

The program constants LARGEB and LARGEA are used to flag bond distances and angles if they show a large range (i.e., maximum minus minimum value exceeds LARGEB for bonds or LARGEA for angles).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > mu gly
 Multiple copy geometry for : (GLY)
 Nr of atoms : (          4)
 Atoms : (  N  CA  C  O)
 Looking for selected residues ...
 RESIDUE  GLY A  22  1CEL
 RESIDUE  GLY A  23  1CEL
 ...
 RESIDUE  GLY B 424  1CEL
 RESIDUE  GLY B 427  1CEL
 ERROR --- Too many copies
 Maximum : (        100)
 Nr of copies found : (        100)
 Nr of bonds : (          3)
 Bond distance range large if > (   0.050)
 Bond angle    range large if > (   5.000)

 Bonded distances with cut-off :    2.000 A
 ==========================================
 N   - CA   #  100 Ave, Sdv, Min, Max    1.451    0.004    1.440    1.464
 CA  - C    #  100 Ave, Sdv, Min, Max    1.517    0.006    1.501    1.529
 C   - O    #  100 Ave, Sdv, Min, Max    1.232    0.004    1.223    1.244

 Angles and 1-3 angle distances
 ==============================
 N   - CA  - C
    Angle    :  100   113.89     4.23   101.49   122.24 Large range
    1-3 Dist :  100    2.486    0.059    2.307    2.588
 CA  - C   - O
    Angle    :  100   120.44     0.96   116.96   122.29 Large range
    1-3 Dist :  100    2.389    0.015    2.339    2.415

 Dihedrals and 1-4 torsion distances
 ===================================
 N   - CA  - C   - O
    Dihedral :  100     5.27   114.42  -178.61   179.19
    1-4 Dist :  100    3.150    0.432    2.608    3.695
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > mu glc
 Multiple copy geometry for : (GLC)
 Nr of atoms : (         12)
 Atoms : (  C1  C2  C3  C4  C5  C6  O1  O2  O3  O4  O5  O6)
 Looking for selected residues ...
 RESIDUE  GLC     1
 RESIDUE  GLC     2
 RESIDUE  GLC     3
 RESIDUE  GLC     4
 RESIDUE  GLC     5
 RESIDUE  GLC     6
 RESIDUE  GLC     7
 RESIDUE  GLC     8
 Nr of copies found : (          8)
 Nr of bonds : (         11)
 Bond distance range large if > (   0.050)
 Bond angle    range large if > (   5.000)

 Bonded distances with cut-off :    2.000 A
 ==========================================
 C1  - C2   #    8 Ave, Sdv, Min, Max    1.533    0.013    1.513    1.548
 C1  - O5   #    8 Ave, Sdv, Min, Max    1.435    0.014    1.414    1.456
 C2  - C3   #    8 Ave, Sdv, Min, Max    1.519    0.013    1.503    1.543
 C2  - O2   #    8 Ave, Sdv, Min, Max    1.409    0.016    1.385    1.426
 C3  - C4   #    8 Ave, Sdv, Min, Max    2.528    0.413    1.497    2.875 Large range
 C3  - O3   #    8 Ave, Sdv, Min, Max    1.416    0.008    1.411    1.436
 C4  - C5   #    8 Ave, Sdv, Min, Max    2.537    0.390    1.544    2.792 Large range
 C4  - O4   #    1 Ave, Sdv, Min, Max    1.413    0.000    1.413    1.413
 ...
 C6  - O6   #    8 Ave, Sdv, Min, Max    1.410    0.016    1.374    1.426 Large range

 Angles and 1-3 angle distances
 ==============================
 C2  - C1  - O5
    Angle    :    8   109.93     2.36   106.56   113.02 Large range
    1-3 Dist :    8    2.430    0.027    2.383    2.463
 C1  - C2  - C3
    Angle    :    8   111.72     1.48   110.48   114.24
    1-3 Dist :    8    2.526    0.013    2.512    2.546
 ...
 C1  - O5  - C5
    Angle    :    8   117.02     1.53   115.16   119.50
    1-3 Dist :    8    2.454    0.045    2.414    2.529

 Dihedrals and 1-4 torsion distances
 ===================================
 O5  - C1  - C2  - C3
    Dihedral :    8    51.28     7.74    39.17    60.96
    1-4 Dist :    8    2.853    0.019    2.839    2.902
 O5  - C1  - C2  - O2
    Dihedral :    8    37.03   167.50  -179.25   174.97
    1-4 Dist :    8    3.662    0.036    3.592    3.694
 ...
 O4  - C4  - C5  - O5
    Dihedral :    1  -172.14     0.00  -172.14  -172.14
    1-4 Dist :    1    3.718    0.000    3.718    3.718
 ...
 C6  - C5  - O5  - C1
    Dihedral :    8   -89.61   149.13  -176.72   168.72
    1-4 Dist :    8    3.760    0.052    3.691    3.839
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


13.10 GEometry_selected - list geometry of selected atoms

This command will list the following for all currently selected atoms:
- bonded distances
- bond angles and 1-3 distances
- torsion angles and 1-4 distances

Make sure to use the SElect commands to isolate only those atoms that you are interested in ! Use the LIst_selected command before using this command to see exactly how many (and which) atoms/residues are selected.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se an re rea
 AND atom selection
 With atoms for which : (RE)
 Equals : (REA)
 Selection history : (ALL | AND REsidu = REA |)
 Nr of selected atoms : (         22)
 MOLEMAN2 > ge
 Nr of selected atoms : (         22)

 Bonded distances with cut-off :    2.000 A
 ==========================================
 C1  [B 200 ] - C2  [B 200 ] =    1.546 A
 C1  [B 200 ] - C6  [B 200 ] =    1.564 A
 ...
 C15 [B 200 ] - O2  [B 200 ] =    1.251 A
 Nr of bonded distances : (         22)

 Angles and 1-3 angle distances
 ==============================
 C2  [B 200 ] - C1  [B 200 ] - C6  [B 200 ] =  109.691 deg =    2.543 A
 C2  [B 200 ] - C1  [B 200 ] - C16 [B 200 ] =  108.110 deg =    2.494 A
 ...
 O1  [B 200 ] - C15 [B 200 ] - O2  [B 200 ] =  121.547 deg =    2.183 A
 Nr of angles : (         30)

 Dihedrals and 1-4 torsion distances
 ===================================
 C6  [B 200 ] - C1  [B 200 ] - C2  [B 200 ] - C3  [B 200 ] =  -42.758 deg =    2.902 A
 C16 [B 200 ] - C1  [B 200 ] - C2  [B 200 ] - C3  [B 200 ] = -162.723 deg =    3.847 A
 ...
 C13 [B 200 ] - C14 [B 200 ] - C15 [B 200 ] - O2  [B 200 ] =  129.005 deg =    3.505 A
 Nr of dihedrals : (         32)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se and ch m
 AND atom selection
 With atoms for which : (CH)
 Equals : (M)
 Selection history : (ALL | AND CHain = M |)
 Nr of selected atoms : (         22)
 MOLEMAN2 > ge
 Nr of selected atoms : (         22)

 Bonded distances with cut-off :    2.000 A
 ==========================================
 C1  [M 602 ] - O1  [M 602 ] =    1.410 A
 C1  [M 602 ] - C2  [M 602 ] =    1.515 A
 ...
 O5  [M 603 ] - C5  [M 603 ] - C6  [M 603 ] - O6  [M 603 ] =  -51.669 deg =    2.729 A
 Nr of dihedrals : (         43)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se all
 ...
 MOLEMAN2 > se and chain a
 ...
 MOLEMAN2 > se num and res 85 87
 ...
 MOLEMAN2 > geom
 Nr of selected atoms : (         12)

 Bonded distances with cut-off :    2.000 A
 ==========================================
 CB  [A  86 ] - CA  [A  86 ] =    1.521 A
 C   [A  86 ] - O   [A  86 ] =    1.229 A
 ...
 CB  [A  87 ] - CA  [A  87 ] - C   [A  87 ] - O   [A  87 ] =   86.999 deg =    3.216 A
 Nr of dihedrals : (         14)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
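
The three kinds of quantity listed by this command (distances, angles, dihedrals) follow from standard vector formulas. The Python sketch below shows them for readers who want to check individual values by hand; the helper names are made up, the 2.0 A bonded cut-off is the one printed in the examples above, and the sign convention used by the program for dihedrals is not stated in this manual, so the sign produced here is an assumption.

```python
import math

def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def distance(a, b):
    """Distance between two points."""
    d = _sub(a, b)
    return math.sqrt(_dot(d, d))

def angle(a, b, c):
    """Angle a-b-c in degrees, with b at the apex."""
    u, v = _sub(a, b), _sub(c, b)
    cosine = _dot(u, v) / (distance(a, b) * distance(c, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

def dihedral(a, b, c, d):
    """Torsion angle a-b-c-d in degrees, in the range -180 to +180."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def bonded_pairs(atoms, cutoff=2.0):
    """All atom pairs no further apart than cutoff; atoms maps name -> (x, y, z)."""
    names = list(atoms)
    return [(p, q, distance(atoms[p], atoms[q]))
            for i, p in enumerate(names) for q in names[i + 1:]
            if distance(atoms[p], atoms[q]) <= cutoff]

# made-up coordinates, chosen so that the answers are easy to verify
print(distance((0, 0, 0), (3, 4, 0)))
print(angle((1, 0, 0), (0, 0, 0), (0, 1, 0)))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```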


13.11 ! - enter a comment

Any command beginning with an exclamation mark will prompt the program to ignore that line. This may be useful to annotate scripts which run MOLEMAN2.


13.12 QUit - stop working with MOLEMAN2

This will end your current session.


13.13 $ - enter a shell command

Syntax: $ shell_command

Anything following the dollar sign is passed on to the operating system shell. This can be used to list the files in a directory, or even to run another program from within MOLEMAN2.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > $ ls *.pdb
1cbs.pdb     1cel.pdb     hydro.pdb    q.pdb        twentyz.pdb
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.14 COnstants REset - reset program parameters to factory defaults

This resets some of the program parameters to the values that I deemed reasonable.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > co re
 Reset program constants to defaults
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.15 COnstants LIst - list names and values of program parameters

This lists the names and values of those parameters that can be modified by the user.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > co li
 BLIMLO = B-factor default minimum     : (   2.000)
 BLIMHI = B-factor default maximum     : (  50.000)
 QLIMLO = Occupancy default minimum    : (   0.000)
 QLIMHI = Occupancy default maximum    : (   1.000)
 MXCACA = Max connected CA-CA distance : (   4.500)
 TORTOL = Torsion/improper tolerance   : (   5.000)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.16 COnstants SEt - set a program parameter value

Syntax: COnstants SEt name value
name = parameter name
value = new value for this parameter

This enables you to set the value of a program parameter. Use the COnstants LIst option to see what parameters can be set in the current version of the program.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > co set tortol 10
 Set TORTOL to 10.000
 MOLEMAN2 > co set junk 103
 Set JUNK to 103.000
 ERROR --- Name not recognised
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


13.17 REad - read a PDB file

Syntax: REad file [format] [hydro]
file = PDB file name
format = Pdb | Alwyn
hydro = No | Yes

The Alwyn format is only required when you try to read *very* old PDB files created with older versions of O.
By default, hydrogen atoms are *stripped* when you read a file, unless you set the hydro parameter to y(es) !
Once the file has been read, some book-keeping is done. By default, *all* atoms will be selected (including hydrogen atoms, if they were read) !

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > re 1cbs.pdb
 Reading from file : (1cbs.pdb)
 in normal PDB format
 ignoring hydrogen atoms
 ==> Found file in GKPATH : (/nfs/pdb/full/1cbs.pdb)
 HEADER :     RETINOIC-ACID TRANSPORT                 28-SEP-94   1CBS      1CBS   2
 AUTHOR :     G.J.KLEYWEGT,T.BERGFORS,T.A.JONES                             1CBS  10
 REVDAT :    1   26-JAN-95 1CBS    0                                        1CBS  11
 CRYST1 :    45.650   47.560   77.610  90.00  90.00  90.00 P 21 21 21    4  1CBS 216

>>>>> END card encountered <<<<<

 Nr of lines read        : (       1459)
 Nr of hydrogens skipped : (          0)

 Total nr of residues      : (        238)
 Nr of amino acid residues : (        137)
 Nr of nucleic acids       : (          0)
 Nr of waters              : (        100)
 Nr of metals              : (          0)
 Nr of inorganics          : (          0)
 Nr of carbohydrates       : (          0)
 Nr of organic compounds   : (          1)
 Nr of other compounds     : (          0)

 Checking for missing/extra atoms ...

 Checking "special" positions (X~Y~Z), Bs and Qs ...
 No suspicious coordinates encountered
 All atoms have B <= 100 A2
 All atoms have B >= 2.0 A2
 All atoms have Q <= 1.0
 All atoms have Q >= 0.01
 No atoms with ANISOU cards

 Nr of atoms now : (       1213)
 Nr of residues  : (        238)
 Select ALL atoms
 Selection history : (ALL |)
 Nr of selected atoms : (       1213)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


13.18 APpend - append a structure to the one(s) already in memory

Syntax: APpend file [format] [hydro]

Same parameters etc. as for the REad command. Book-keeping is done again, and *all* atoms are selected.


13.19 WRite - write a PDB file

Syntax: WRite file [format] [which]
file = PDB file name
format = Pdb | Xplor | Ccp4
which = ALl | NO_hydro | SElected | PAla | PGly | PSer | CAlpha

Write (a subset of) the current atoms to a PDB file. You must provide the name of the output file. If the file already exists, an error message is generated and you are asked if you want to overwrite it ("Open file as OLD (Y/N) ?"). If you don't, you can subsequently supply a different file name.

If the format is Pdb, then all records (including any REMARKS etc.) will be written, and atoms which were on HETATM cards on input will be on HETATM cards on output. Cell constants etc. are also included.

If the format is Xplor, only ATOM records will be written (even for HETATMs), and nothing else (no CRYST1, etc.).

If the format is Ccp4, the same is written as for Xplor format, but in addition the CRYST1 etc. cards will be added at the top of the file.

You can specify which atoms you want to write out:
- ALl writes all atoms currently in memory
- NO_hydro writes all non-hydrogen atoms
- SElected writes only the currently selected atoms (if any)
- PGly creates a poly-Glycine file (i.e., only main-chain atoms of protein residues and all residues will be called GLY; the program recognizes main-chain atoms by their atom names or by the fact that they have been labelled as such in the library file)
- PAla generates a poly-Alanine file in a similar fashion
- PSer generates a poly-Serine file (in which atoms CG, OG, SG, OG1 and CG1 will be called OG)
- CAlpha writes only the C-alpha atoms of protein residues
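
The three truncation options can be pictured with a small Python sketch. It is a simplified illustration and not the program's code: the function name truncate is made up, the main-chain atoms are assumed to be N, CA, C and O (as in the CYS library entry), CB is assumed to be kept for poly-Ala and poly-Ser, and the input is assumed to contain protein residues only.

```python
MAIN_CHAIN = {"N", "CA", "C", "O"}            # assumed main-chain atom names
TO_OG = {"CG", "OG", "SG", "OG1", "CG1"}      # renamed to OG for PSer (see text)

def truncate(atoms, mode):
    """Reduce (atom_name, residue_name, residue_nr) records to poly-Gly/Ala/Ser."""
    new_name = {"PGLY": "GLY", "PALA": "ALA", "PSER": "SER"}[mode]
    out = []
    for atom, _residue, number in atoms:
        if atom in MAIN_CHAIN:
            out.append((atom, new_name, number))
        elif atom == "CB" and mode in ("PALA", "PSER"):
            out.append((atom, new_name, number))
        elif atom in TO_OG and mode == "PSER":
            out.append(("OG", new_name, number))
        # all other side-chain atoms are dropped
    return out

cys = [("N", "CYS", 67), ("CA", "CYS", 67), ("C", "CYS", 67),
       ("O", "CYS", 67), ("CB", "CYS", 67), ("SG", "CYS", 67)]
for mode in ("PGLY", "PALA", "PSER"):
    print(mode, truncate(cys, mode))
```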

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > wr q.pdb p se
 Output PDB file : (q.pdb)
 Format : (Pdb)
 Atoms  : (SE)
 Number of atoms to write : (       3220)
 Nr of atoms written : (       3220)
 Nr of lines written : (       3705)

 CPU total/user/sys :       1.0       1.0       0.0
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > wr q.pdb ccp pser
 Output PDB file : (q.pdb)
 Format : (CCP)
 Atoms  : (PSER)
 ERROR --- XOPXNA - error # 126 while opening NEW file : q.pdb
 OPEN : (UNIT= 10 STATUS=NEW CAR_CONTROL=LIST FORM=FORMATTED
  ACCESS=SEQUENTIAL)
 Error : (Connection timed out)
 Open file as OLD (Y/N) ? (N) y
 Number of atoms to write : (       4952)
 Nr of atoms written : (       4952)
 Nr of lines written : (       4952)

 CPU total/user/sys :       1.5       1.5       0.0
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


13.20 SPlit - split a PDB file into one for each chain/segment

Syntax: SPlit file_prefix

This option is primarily intended for X-PLOR users. As such, it removes much of the dread previously associated with the preparation of files for the GENERATE step.

You provide the prefix for the PDB file names (e.g., /Usr/Billy/Xplor/m1 or ../../xplor/m5tom6/m5 or ./m1). Upper and lower case characters in the prefix will be maintained. For each segment id found in the set of *ALL* atoms, a separate PDB file is written which contains all atoms which have that segment id (even if they are separated in the structure). The name of this file will be your prefix plus an underscore (_) plus the segment id (converted to lowercase and without any spaces) plus the extension ".pdb". For instance, if your prefix is "../Xplor/m1", then segment "PROA" will be written to a file called "../Xplor/m1_proa.pdb".
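
The naming rule, and the fact that atoms are collected per segment id regardless of where they occur in the file, can be written down in a few lines of Python. The function names and the dictionary-based atom records are made up for this sketch.

```python
def split_file_name(prefix, segment_id):
    """prefix + "_" + segment id (lower case, spaces removed) + ".pdb"."""
    return f"{prefix}_{segment_id.replace(' ', '').lower()}.pdb"

def group_by_segment(atoms):
    """Collect atoms per segment id, even if a segment is interrupted by others."""
    groups = {}
    for atom in atoms:
        groups.setdefault(atom["segid"], []).append(atom)
    return groups

atoms = [{"name": "CA", "segid": "PROA"},
         {"name": "O",  "segid": "WATA"},
         {"name": "CB", "segid": "PROA"}]     # PROA continues after WATA
for segid, members in group_by_segment(atoms).items():
    print(split_file_name("../Xplor/m1", segid), len(members))
```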

In addition to all these PDB files, an X-PLOR input file is created for the GENERATE stage. It will need some editing if your structure contains things other than protein and water (i.e.: insert topology and parameter file names; insert any patches that are necessary). The program will also look for any disulfide links and create DISU patch statements for each of these (if any). Note that in the case of for instance iron-sulfur clusters you may have to remove (some of) the DISU patch statements !
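
The search for disulfide links amounts to a distance test between all pairs of CYS SG atoms. The Python sketch below shows the idea; it is not the program's code, the function name is made up, the comparison is assumed to be "less than or equal", the cut-off of 2.2 A is the "Max SG-SG distance for link" value printed in the example in this section (see the MXCYSS parameter), and the three SG coordinates are copied from that same example.

```python
import math
from itertools import combinations

MXCYSS = 2.2   # "Max SG-SG distance for link" from the SPlit example (in A)

def find_disulfides(sg_atoms, max_dist=MXCYSS):
    """Return (label1, label2, distance) for every SG pair within max_dist."""
    links = []
    for (lab1, xyz1), (lab2, xyz2) in combinations(sg_atoms, 2):
        d = math.dist(xyz1, xyz2)
        if d <= max_dist:
            links.append((lab1, lab2, round(d, 2)))
    return links

sg_atoms = [
    ("67 AAAB",  (26.853, 23.275, 24.788)),
    ("94 AAAB",  (27.958, 21.920, 25.822)),
    ("231 AAAB", (17.706, 42.764, 23.358)),   # free cysteine in the example
]
print(find_disulfides(sg_atoms))   # [('67 AAAB', '94 AAAB', 2.03)]
```

The single link found corresponds to "Disulfide # 1" in the example output. Metal-bound cysteines, as in iron-sulfur clusters, may also satisfy a pure distance criterion, which is why some of the generated DISU patches may have to be removed by hand.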

IMPORTANT: the segment id *ALONE* is used to identify different segments. In other words: the chain names are ignored (this is in agreement with the current PDB convention in which segment identifiers designate separate entities; a chain may contain multiple segments, e.g. protein + glycosylation + ligand + water).

NOTE: make sure that the segment ids make sense !! If you just read in a PDB file from O or downloaded one from Brookhaven, you probably have to use one or more of the CHain commands (e.g., CHain ASk or CHain AUto) to get correct segmentation. You have only yourself to blame for any segmentation faults ;-)

Since the program writes all atoms with a certain segment id to the same file (even if they are interspersed with other segments), you can use the CHain commands to "merge" two previously distinct segments into one. For instance, if you have two-fold NCS, you may initially have refined only waters which obey the NCS and used separate segments for them (e.g., WATA and WATB). When you start to add waters which are not conserved in both molecules, you could create a third segment but this tends to get very confusing as you add and delete waters while you rebuild. Instead, you may want to merge all waters. You can do this by simply giving all of them the same segment id with the CHain commands.

NOTE: from version 0.21 onward, the program tries its best to make sure that OT1 and OT2 are named correctly. Also, if an OT2 atom is missing it will be generated on the fly and written to the PDB file (but it will not be added to your structure in memory !).

NOTE: from version 1.1.1 onward, the default is to delete hydrogen atoms and atoms with unknown coordinates.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > re test.pdb
 ...
 MOLEMAN2 > ch au
 ...
 MOLEMAN2 > split qqq
 Split PDB files for X-PLOR
 File prefix : (qqq)
 X-PLOR generate input file : (qqq_generate.inp)

 File nr 1 Segment |AAAA|
 File name : qqq_aaaa.pdb
 ATOM      1  CB  SER A   4      51.238  19.799   7.294  1.00 34.72      AAAA
 Found & written OT1
 ...
 Adding OT2 at : ( 39.097 17.619 13.212)
 Nr of lines written : ( 446)
 Nr of atoms written : ( 443)

 File nr 2 Segment |AAAB|
 File name : qqq_aaab.pdb
 ATOM    483  N   ASN B  66      26.962  19.074  20.708  1.00  2.00      AAAB
 Found & written OT1
 Found & written OT2
 Nr of lines written : ( 3747)
 Nr of atoms written : ( 3744)

 File nr 3 Segment |AAAC|
 File name : qqq_aaac.pdb
 ATOM   4227  CB  THR C   1      -3.115   5.872  13.814  1.00 77.65      AAAC
 Found & written OT1
 ...
 Adding OT2 at : ( 1.912 19.586 15.804)
 Nr of lines written : ( 467)
 Nr of atoms written : ( 464)

 File nr 4 Segment |AAAD|
 File name : qqq_aaad.pdb
 ATOM   4691  O1  HOH D 701      -6.674  -2.991  11.360  1.00 15.26      AAAD
 Nr of lines written : ( 39)
 Nr of atoms written : ( 36)

 File nr 5 Segment |AAAE|
 File name : qqq_aaae.pdb
 ATOM   4727 ZN+2 ZNC E 901     -15.390  36.554  20.084  1.00 19.82      AAAE
 Nr of lines written : ( 5)
 Nr of atoms written : ( 2)

 File nr 6 Segment |AAAF|
 File name : qqq_aaaf.pdb
 ATOM   4729  C1  NAG F 990      39.707  11.948  13.548  1.00 77.90      AAAF
 Nr of lines written : ( 17)
 Nr of atoms written : ( 14)

 Nr of PDB files generated : ( 6)

 Looking for disulfides ...
 Looking for CYS- SG atoms ...
 ATOM    496  SG  CYS B  67      26.853  23.275  24.788  1.00 18.16      AAAB
 ATOM    719  SG  CYS B  94      27.958  21.920  25.822  1.00  2.00      AAAB
 ATOM   1765  SG  CYS B 231      17.706  42.764  23.358  1.00  2.00      AAAB
 ATOM   1944  SG  CYS B 254      15.900  18.643  43.219  1.00 14.44      AAAB
 ATOM   2032  SG  CYS B 265      17.033  17.112  43.932  1.00 13.67      AAAB
 ATOM   3118  SG  CYS B 402      11.202  50.463  20.218  1.00 17.06      AAAB
 ATOM   4105  SG  CYS B 521      12.171  51.649  18.881  1.00  2.00      AAAB
 ATOM   4247  SG  CYS C   3      -0.622   9.196  15.268  1.00 34.29      AAAC
 ATOM   4355  SG  CYS C  17       1.031   6.187  17.911  1.00 53.34      AAAC
 ATOM   4388  SG  CYS C  22      -1.328   8.466  17.020  1.00 31.20      AAAC
 ATOM   4529  SG  CYS C  39       2.164   5.235  19.293  1.00 20.78      AAAC
 ATOM   4539  SG  CYS C  41      -5.604  11.002  25.701  1.00 52.29      AAAC
 ATOM   4620  SG  CYS C  52      -4.745  12.199  24.311  1.00 29.91      AAAC
 ATOM   4626  SG  CYS C  53      -4.229  13.247  16.868  1.00 42.61      AAAC
 ATOM   4669  SG  CYS C  59      -4.012  11.998  15.283  1.00 12.80      AAAC
 Nr of CYS SG atoms : ( 15)
 Max SG-SG distance for link : ( 2.200)
 Disulfide # 1 67 AAAB <-> 94 AAAB @ 2.03 A
 Disulfide # 2 254 AAAB <-> 265 AAAB @ 2.03 A
 Disulfide # 3 402 AAAB <-> 521 AAAB @ 2.03 A
 Disulfide # 4 3 AAAC <-> 22 AAAC @ 2.03 A
 Disulfide # 5 17 AAAC <-> 39 AAAC @ 2.02 A
 Disulfide # 6 41 AAAC <-> 52 AAAC @ 2.03 A
 Disulfide # 7 53 AAAC <-> 59 AAAC @ 2.03 A
 Nr of disulfides : ( 7)

 X-PLOR generate input file written
 CPU total/user/sys : 1.5 1.4 0.1
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

The PDB files may look as follows:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
REMARK Created by MOLEMAN2 V. 960227/0.8 at Tue Feb 27 15:29:35 1996 for user gerard
REMARK File name : m1_aaaa.pdb
ATOM      1  CB  ALA    86       3.109  42.928  53.312  1.00 28.54      AAAA
ATOM      2  C   ALA    86       4.129  45.032  52.428  1.00 27.90      AAAA
 ...
ATOM   3319  OT1 LEU   447      38.267  47.693  63.602  1.00 18.65      AAAA
ATOM   3320  OT2 LEU   447      39.049  46.653  65.371  1.00 20.12      AAAA
REMARK Nr of atoms in file : 2740
END
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

The GENERATE input file may look as follows:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 remarks File m1_generate.inp - generate pdb/psf file
 remarks Created by MOLEMAN2 V. 960227/0.8 at Tue Feb 27 15:29:35 1996 for user gerard

 topology
   @tophcsdx.pro
   @toph19.sol
   { add more topology files here }
 end

 parameter
   @parhcsdx.pro
   @param19.sol
   { add more parameter files here }

   nbonds
     atom cdie shift eps=8.0 e14fac=0.4
     cutnb=7.5 ctonnb=6.0 ctofnb=6.5
     nbxmod=5 vswitch wmin=0.5
   end
   { dielectric constant set to 8.0 (EPS) }
   { close contacts printed only if dist < 0.5 A (WMIN) }
 end

 { protein }
 segment
   name="AAAA"
   chain
     @toph19.pep
     coordinates @m1_aaaa.pdb
   end
 end
 vector do (name="CD1") ( name CD and resname ile )
 coordinates @m1_aaaa.pdb

 { CARB }
 segment
   name="BBBB"
   chain
     coordinates @m1_bbbb.pdb
   end
 end
 coordinates @m1_bbbb.pdb

 { HETE }
 segment
   name="CCCC"
   chain
     coordinates @m1_cccc.pdb
   end
 end
 coordinates @m1_cccc.pdb

 { protein }
 segment
   name="KKKK"
   chain
     @toph19.pep
     coordinates @m1_kkkk.pdb
   end
 end
 vector do (name="CD1") ( name CD and resname ile )
 coordinates @m1_kkkk.pdb

 { CARB }
 segment
   name="LLLL"
   chain
     coordinates @m1_llll.pdb
   end
 end
 coordinates @m1_llll.pdb

 { HETE }
 segment
   name="MMMM"
   chain
     coordinates @m1_mmmm.pdb
   end
 end
 coordinates @m1_mmmm.pdb

 { META }
 segment
   name="XXXX"
   chain
     coordinates @m1_xxxx.pdb
   end
 end
 coordinates @m1_xxxx.pdb

 { WATE }
 segment
   name="WWWW"
   chain
     coordinates @m1_wwww.pdb
   end
 end
 coordinates @m1_wwww.pdb

 { the disulfides }
 patch DISU
   refer=1=(segid="AAAA" and resid 176)
   refer=2=(segid="AAAA" and resid 235)
 end

 patch DISU
   refer=1=(segid="AAAA" and resid 368)
   refer=2=(segid="AAAA" and resid 415)
 end

 patch DISU
   refer=1=(segid="KKKK" and resid 176)
   refer=2=(segid="KKKK" and resid 235)
 end

 patch DISU
   refer=1=(segid="KKKK" and resid 368)
   refer=2=(segid="KKKK" and resid 415)
 end

 flags exclude vdw elec end

 hbuild
   selection=(hydrogen and not known)
   phistep=45
 end

 { optimise hydrogens to get rid of clashes }
 constraints fix=( not hydrogen ) end
 flags include vdw elec end
 minimize powell nstep=50 end
 constraints fix=( not all ) end

 write coordinates output=m1_gen.pdb end

 write structure output=m1.psf end

 stop
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


13.21 DELETE_molecule - remove all atoms from memory

This command (of which the first SIX letters should be typed - to prevent accidental deletion) removes all atoms from memory.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > dele
 ERROR --- Invalid command
 ==> (dele)
 MOLEMAN2 > delete
 ALL ATOMS AND RESIDUES DELETED !!!
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14 PDB COMMANDS


14.1 PDb HEtero - manipulate ATOM/HETATM flags

You have the following options with this command:
- Atom_all; this will change all HETATMs to ATOMs (useful if you want to use a model from the PDB for molecular replacement or refinement with a program that chokes on or ignores HETATM cards);
- Deduce; the program will set all protein and nucleic acid atoms to ATOM, and all others to HETATM (useful when you deposit your model coordinates with the PDB).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pd he a
 All atoms set to type ATOM
 MOLEMAN2 > pd he d
 Deducing ATOM/HETATM types ...
 Nr set to ATOM   : (       4952)
 Nr set to HETATM : (          0)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14.2 PDb NO_atom_numbers - remove O-style atom numbers

O likes to write the chemical element number of each atom/hetatm, but the PDB does not like these numbers. Use this command to remove them.


14.3 PDb CRystal - set crystal parameters for CRYST1 etc. cards

Supply the six cell constants, the number of molecules in the unit cell (using the PDB definition), and the spacegroup symbol (using the PDB definition, i.e. with spaces). The cell constants *must* be known for CCP4-type PDB files !

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pd cr 84 84 111.8 90 90 90 8 "P 21 21 2"
 Unit-cell axes (A)      : (  84.000   84.000  111.800)
 Unit-cell angles (deg)  : (  90.000   90.000   90.000)
 Unit-cell volume (A3)   : (  7.889E+05)
 Nr of molecules in cell : (       8)
 Spacegroup symbol       : (P 21 21 2)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb crys 41.2 41.2 196.5 90 90 90
 Nr of molecules in cell ? (       8)
 Spacegroup symbol ? (P 21 21 2) P 41 21 2
 Unit-cell axes (A)      : (  41.200   41.200  196.500)
 Unit-cell angles (deg)  : (  90.000   90.000   90.000)
 Unit-cell volume (A3)   : (  3.335E+05)
 Nr of molecules in cell : (       8)
 Spacegroup symbol       : (P 41 21 2)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
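
As a cross-check of the numbers in the examples above, here is a small Python sketch that computes the unit-cell volume from the six cell constants and formats a CRYST1 card. It is not taken from MOLEMAN2: the function names are hypothetical, the volume uses the general triclinic formula, and the card layout is my reading of the PDB format description.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume (A^3) from axes (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def cryst1_record(a, b, c, alpha, beta, gamma, spacegroup, z):
    """Format a CRYST1 card (assumed layout: 3F9.3, 3F7.2, 1X, A11, I4)."""
    return "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f %-11s%4d" % (
        a, b, c, alpha, beta, gamma, spacegroup, z)


print("%.3E" % cell_volume(84, 84, 111.8, 90, 90, 90))
print(cryst1_record(84, 84, 111.8, 90, 90, 90, "P 21 21 2", 8))
```

For both example cells all angles are 90 degrees, so the volume reduces to the product of the three axes.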
   


14.4 PDb SSbond - manipulate SSBOND records

Syntax: PDb SSbond what
what = List | Delete | Generate

Use the COnstants SEt command to change the cut-off S-S link distance if necessary.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb ssb lis
 List SSBOND records
 SSBOND   1 CYS A    4    CYS A   72                                     1CEL 340
 SSBOND   2 CYS A   19    CYS A   25                                     1CEL 341
 SSBOND   3 CYS A   50    CYS A   71                                     1CEL 342
 SSBOND   4 CYS A   61    CYS A   67                                     1CEL 343
 ...
 SSBOND  20 CYS B  261    CYS B  331                                     1CEL 359
 Nr of SSBOND records listed : (         20)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb ssb del
 Delete SSBOND records
 Nr of SSBOND records deleted : (         20)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb ssb gen
 Generate SSBOND records
 Looking for CYS- SG  atoms ...
 ATOM    25  SG  CYS A   4      36.618  67.989  34.228  1.00 21.90      1CEL 398
 ...
 ATOM  5990  SG  CYS B 331      -7.298   9.647  20.243  1.00 10.03      1CEL6363
 ATOM  6483  SG  CYS B 397     -25.330  24.877  13.503  1.00 10.62      1CEL6856
 Max SG-SG distance for link : (   2.200)
 SSBOND   1 CYS A    4    CYS A   72     S-S =   2.03 A
 SSBOND   2 CYS A   19    CYS A   25     S-S =   2.02 A
 ...
 SSBOND  19 CYS B  238    CYS B  243     S-S =   2.03 A
 SSBOND  20 CYS B  261    CYS B  331     S-S =   2.03 A
 Nr of SSBOND records generated : (         20)
 MOLEMAN2 > pdb ssb l
 List SSBOND records
 SSBOND   1 CYS A    4    CYS A   72     S-S =   2.03 A
 SSBOND   2 CYS A   19    CYS A   25     S-S =   2.02 A
 SSBOND   3 CYS A   50    CYS A   71     S-S =   2.02 A
 ...
 SSBOND  19 CYS B  238    CYS B  243     S-S =   2.03 A
 SSBOND  20 CYS B  261    CYS B  331     S-S =   2.03 A
 Nr of SSBOND records listed : (         20)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
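
The idea behind the Generate option can be sketched in a few lines of Python: every pair of CYS SG atoms closer than the cut-off is reported as a link. This is a hedged illustration, not the program's code. The function name and the atom labels are hypothetical, and whether the program compares with "less than" or "less than or equal" is an assumption. The three coordinates are SG atoms from the split example earlier in this manual.

```python
import math
from itertools import combinations


def find_disulfides(sg_atoms, max_dist=2.2):
    """Return (label1, label2, distance) for SG pairs within max_dist.

    sg_atoms is a list of (label, (x, y, z)) tuples, one per CYS SG atom.
    """
    links = []
    for (lab1, p1), (lab2, p2) in combinations(sg_atoms, 2):
        d = math.dist(p1, p2)  # Euclidean distance (Python 3.8+)
        if d <= max_dist:      # assumption: cut-off is inclusive
            links.append((lab1, lab2, round(d, 2)))
    return links


sg = [
    ("B 67", (26.853, 23.275, 24.788)),
    ("B 94", (27.958, 21.920, 25.822)),
    ("B 231", (17.706, 42.764, 23.358)),  # too far from the other two
]
for lab1, lab2, d in find_disulfides(sg):
    print("%s <-> %s @ %.2f A" % (lab1, lab2, d))
```

A brute-force loop over all pairs is fine here because even large structures contain only a modest number of cysteines.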
   


14.5 PDb REmark - add a REMARK record

Syntax: PDb REmark text
text = text of the remark

Add information about the molecule(s) as a REMARK record (e.g., model name, current R-factor and Rfree, etc.).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb re "Model M3 @ 960303 R=0.231 Rfree=0.273"
 Add REMARK record : (Model M3 @ 960303 R=0.231 Rfree=0.273)
     4: REMARK Model M3 @ 960303 R=0.231 Rfree=0.273
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14.6 PDb LIst_remark - list all REMARK records

Syntax: PDb LIst_remark

List the current set of REMARK records (if any).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb li
 List REMARK records
 Nr of REMARK records : (          4)
     1: REMARK FILENAME="/nfs/gerard/gerard/proteins/cbh2/ibgx/xplor/m1_final_2.pdb"
     2: REMARK Uses *NO* sigma or amplitude cut-offs
     3: REMARK DATE:15-Jan-96  23:31:00       created by user: gerard
     4: REMARK Model M3 @ 960303 R=0.231 Rfree=0.273
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14.7 PDb DElete_remark - delete any or all REMARK records

Syntax: PDb DElete_remark which
which = number of the REMARK record (* means ALL)

Delete one or all REMARK records (if any).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb del 3
 Delete REMARK record nr: (          3)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14.8 PDb NAme - change name of selected atoms or residues

Syntax: PDb NAme which old new
which = Atom | Residue
old = old name to replace
new = new name

Change the atom or residue name of selected atoms. A residue is considered to be selected if at least one of its atoms is selected.

This command can be used to rename all waters (e.g., from residue HOH with atom O1 to residue WAT and atom OW). You can also write macros, for instance to convert residue and atom names for your favourite ligand or cofactor from/to X-PLOR or CCP4 or TNT or PDB or ... conventions.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel and type water
 ...
 MOLEMAN2 > pdb name res hoh wat
 Replace residue name |HOH| by |WAT|
 Nr of residues changed : (        100)
 MOLEMAN2 > pdb name atom " o1 " " ow "
 Replace atom name | O1 | by | OW |
 Nr of atom names changed : (        100)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

The following macro would convert X-PLOR waters into PDB waters:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 select all
 select and type water
 pdb name atom " o1 " " o  "
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14.9 PDb CHemical+charge - try to deduce chemical element and charge

In the new PDB format definition, columns 77-78 of each ATOM and HETATM record contain the symbol of the chemical element of the atom and columns 79-80 the charge (don't ask me why the charge has to fit inside two columns ...). This command tries to deduce this information as best it can, using the atom name (4 characters) field.

Chemical element: the following four methods are tried in turn, with (5) as the fallback if all of them fail:
(1) is it one of the special hydrogen names ?
(2) are the first two characters of the atom name a valid element symbol ?
(3) is a space plus the second character a valid element symbol ?
(4) is a space plus the first character a valid element symbol ?
(5) if none of the above, generate an error message, and use "??" as the element symbol

Charge: the following are valid charge definitions for columns 3 and 4 of the atom name, which will be recognised by the program (default charge otherwise is " 0"):
(1) "++" or "--"
(2) "+ " or " +" or "- " or " -"
(3) "n+" or "+n" or "n-" or "-n", where n is 0,1,2,...,9
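
The rules above can be written down as a short Python sketch. It is only an approximation of what the program does: the function names are hypothetical, the element table stops at uranium, and since this manual does not list the "special hydrogen names", the test used for rule (1) (a digit followed by H, as in "1HB ") is purely an assumption.

```python
ELEMENTS = set(
    "H HE LI BE B C N O F NE NA MG AL SI P S CL AR K CA SC TI V CR MN FE CO "
    "NI CU ZN GA GE AS SE BR KR RB SR Y ZR NB MO TC RU RH PD AG CD IN SN SB "
    "TE I XE CS BA LA CE PR ND PM SM EU GD TB DY HO ER TM YB LU HF TA W RE "
    "OS IR PT AU HG TL PB BI PO AT RN FR RA AC TH PA U".split())


def deduce_element(name):
    """Apply rules (1)-(5) to a 4-character atom name."""
    name = name.upper().ljust(4)[:4]
    # Rule (1): ASSUMED stand-in for the unlisted special hydrogen names
    if name[0].isdigit() and name[1] == "H":
        return "H"
    # Rules (2)-(4): first two characters, then second, then first character
    for candidate in (name[0:2], name[1], name[0]):
        if candidate.strip() in ELEMENTS:
            return candidate.strip()
    return "??"  # rule (5): nothing matched


def deduce_charge(name):
    """Read the charge from columns 3 and 4 of the atom name (default 0)."""
    tail = name.ljust(4)[2:4]
    if tail in ("++", "--"):
        return 2 if tail == "++" else -2
    if tail in ("+ ", " +", "- ", " -"):
        return 1 if "+" in tail else -1
    for sign, factor in (("+", 1), ("-", -1)):
        digit = tail.replace(sign, "", 1)
        if sign in tail and digit.isdigit():
            return factor * int(digit)
    return 0


for atom_name in ("CA++", " S2-", "HX +", " CA "):
    print(repr(atom_name), deduce_element(atom_name), deduce_charge(atom_name))
```

Note how the order of the rules matters: " CA " (a C-alpha) becomes carbon because its first two characters are " C", whereas "CA++" becomes calcium.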

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb che
 Deriving chemical name and charge ...
 Nr of atoms processed    : (         10)
 Unknown chemical element : (          0)
 Nr of positive atoms     : (          7)
 Nr of negative atoms     : (          3)
 Nr uncharged or unknown  : (          0)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Example: the following bogus PDB file:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
HETATM 7039 CA++  CA K 440      40.443  57.977  83.935  1.00 16.86
HETATM 7039 CA2+  CA K 441      40.443  57.977  83.935  1.00 16.86
HETATM 7039 CA+2  CA K 442      40.443  57.977  83.935  1.00 16.86
HETATM 7039  S--  CA K 443      40.443  57.977  83.935  1.00 16.86
HETATM 7039  S-2  CA K 444      40.443  57.977  83.935  1.00 16.86
HETATM 7039  S2-  CA K 445      40.443  57.977  83.935  1.00 16.86
HETATM 7039 AH1+  CA K 446      40.443  57.977  83.935  1.00 16.86
HETATM 7039  H+   CA K 447      40.443  57.977  83.935  1.00 16.86
HETATM 7039 'H1+  CA K 448      40.443  57.977  83.935  1.00 16.86
HETATM 7039 HX +  CA K 449      40.443  57.977  83.935  1.00 16.86
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

gives the following result:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
HETATM 7039 CA++  CA A 440      40.443  57.977  83.935  1.00 16.86      AAAACA+2
HETATM 7039 CA2+  CA A 441      40.443  57.977  83.935  1.00 16.86      AAAACA+2
HETATM 7039 CA+2  CA A 442      40.443  57.977  83.935  1.00 16.86      AAAACA+2
HETATM 7039  S--  CA A 443      40.443  57.977  83.935  1.00 16.86      AAAA S-2
HETATM 7039  S-2  CA A 444      40.443  57.977  83.935  1.00 16.86      AAAA S-2
HETATM 7039  S2-  CA A 445      40.443  57.977  83.935  1.00 16.86      AAAA S-2
HETATM 7039 AH1+  CA A 446      40.443  57.977  83.935  1.00 16.86      AAAA H+1
HETATM 7039  H+   CA A 447      40.443  57.977  83.935  1.00 16.86      AAAA H+1
HETATM 7039 'H1+  CA A 448      40.443  57.977  83.935  1.00 16.86      AAAA H+1
HETATM 7039 HX +  CA A 449      40.443  57.977  83.935  1.00 16.86      AAAA H+1
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


14.10 PDb SAnity_check - sanity checks w.r.t. alternate locations and occupancies

Syntax: PDb SAnity_check

This command performs a number of sanity checks on your PDB file w.r.t. alternate locations and occupancies. Since alternate locations are often generated by manual editing of a PDB file, mistakes are easily made. This command will check for the following for each atom in your structure:

- ERROR - ALTERNATE LOCATION WITH BLANK LABEL
- ERROR - ALTERNATE LOCATIONS WITH IDENTICAL LABELS
- ERROR - ALTERNATE LOCATIONS IN IDENTICAL POSITIONS

- WARNING - OCCUPANCIES DO NOT SUM TO 1.00
- WARNING - SINGLE LOCATION WITH NON-BLANK LABEL
- WARNING - MORE THAN 3 ALTERNATE LOCATIONS
- WARNING - ALTERNATE LOCATIONS IN ALMOST IDENTICAL POSITIONS

- NOTE - LABELS NOT IN ALPHABETICAL ORDER
- NOTE - ALTERNATE LOCATIONS IN SIMILAR POSITIONS

Note: the BOok_keep command provides additional sanity-related info, and the DIstance LIst command can be used to print a list of atoms that lie too close to each other in space (e.g., DIst LIst 0 1).
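
To make the label and occupancy checks concrete, here is a minimal Python sketch that applies them to the alternate locations of a single atom. It is not the program's implementation: the function name and the occupancy tolerance of 0.01 are assumptions, and the distance-based checks are left out because this manual does not state the distance limits that are used.

```python
def check_altlocs(locs, tol=0.01):
    """Check the (label, occupancy) pairs of ONE atom; return messages."""
    msgs = []
    labels = [label for label, _ in locs]
    if len(locs) > 1:
        if any(label.strip() == "" for label in labels):
            msgs.append("ERROR - ALTERNATE LOCATION WITH BLANK LABEL")
        if len(set(labels)) < len(labels):
            msgs.append("ERROR - ALTERNATE LOCATIONS WITH IDENTICAL LABELS")
        if len(locs) > 3:
            msgs.append("WARNING - MORE THAN 3 ALTERNATE LOCATIONS")
        if labels != sorted(labels):
            msgs.append("NOTE - LABELS NOT IN ALPHABETICAL ORDER")
    elif labels[0].strip() != "":
        msgs.append("WARNING - SINGLE LOCATION WITH NON-BLANK LABEL")
    # tol is an ASSUMED tolerance on the occupancy sum
    if abs(sum(occ for _, occ in locs) - 1.0) > tol:
        msgs.append("WARNING - OCCUPANCIES DO NOT SUM TO 1.00")
    return msgs


# A lone water with label "A" and occupancy 0.50 triggers two warnings
for message in check_altlocs([("A", 0.50)]):
    print(message)
```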

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > pdb san

 WARNING - OCCUPANCIES DO NOT SUM TO 1.00
 for the following atom, the occupancies of the 1 alternate locations
 add up to 0.50 instead of 1.00 (this can be okay if you are sure that
 the atom has partial occupancy, or if it lies in a special position,
 such as on a twofold axis):
 ATOM    845  CB  GLN A  53      74.460  37.771  14.755  0.50  7.15           C
 [...]
 ERROR - ALTERNATE LOCATIONS IN IDENTICAL POSITIONS
 the following two atoms have alternate locations but their distance
 is only 0.03 A, suggesting that they are in identical positions in
 space and should be merged into a single location:
 ATOM   1245  CA ALYS A  78      73.331  23.163  -7.825  0.50 12.88           C
 ATOM   1246  CA BLYS A  78      73.311  23.144  -7.841  0.50 13.01           C

 WARNING - ALTERNATE LOCATIONS IN ALMOST IDENTICAL POSITIONS
 the following two atoms have alternate locations but their distance
 is only 0.10 A, suggesting that they are in almost identical positions
 in space and could be merged into a single location:
 ATOM   1249  CB ALYS A  78      73.945  24.503  -7.388  0.50 12.63           C
 ATOM   1250  CB BLYS A  78      73.881  24.510  -7.464  0.50 12.88           C

 NOTE - ALTERNATE LOCATIONS IN SIMILAR POSITIONS
 the following two atoms have alternate locations and their distance
 is 0.32 A, suggesting that they are in similar positions in space
 and could perhaps be merged into a single location:
 ATOM   1255  CG ALYS A  78      74.081  25.533  -8.510  0.50 12.41           C
 ATOM   1256  CG BLYS A  78      74.263  25.347  -8.691  0.50 13.05           C
 [...]

 WARNING - OCCUPANCIES DO NOT SUM TO 1.00
 for the following atom, the occupancies of the 1 alternate locations
 add up to 0.50 instead of 1.00 (this can be okay if you are sure that
 the atom has partial occupancy, or if it lies in a special position,
 such as on a twofold axis):
 ATOM   7694  O  AHOH S  64      70.295  19.056   5.753  0.50  2.05           O

 WARNING - SINGLE LOCATION WITH NON-BLANK LABEL
 the following atom has only one location, but its alternate location
 label is "A" instead of blank (this can be okay in some cases, e.g.
 for waters that only interact with one of a set of alternative
 conformations of an amino acid residue):
 ATOM   7694  O  AHOH S  64      70.295  19.056   5.753  0.50  2.05           O
 [...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


14.11 PDb NUmber - renumber selected residues

Syntax: PDb NUmber first_new

Renumber the selected residues, starting from the given number.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se and typ water
 AND atom selection
 With atoms for which : (TYP)
 Equals : (WATER)
 Selection history : (ALL | AND TYpe = WATE |)
 Nr of selected atoms : (        100)
 MOLEMAN2 > pdb numb 501
 Renumber selected residues starting at : (        501)
 Nr of last changed residue : (        600)
 Nr of residues changed : (        100)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


15 SELECTION COMMANDS

These commands enable you to define a subset of atoms for which one or more operations have to be carried out. You can use this, for example, to get statistics about the temperature factors of all protein main-chain atoms in segment ABC3, or to set the occupancy of all calcium atoms to 0.5.

Some selection commands reset the selection (ALl, NOne, HYdrogen, and EXhydrogen). Others make the current selection narrower (ANd, BUtnot) or wider (OR), or invert it (NEgate).

Examples:
- to select the non-hydrogen atoms of all segments except QQQQ, use: SElect HYdrogen, SElect OR SEgid QQQQ, SElect NEgate
- to select all non-hydrogen main-chain protein atoms of chain A, use: SElect EXhydro, SElect ANd TYpe PROT, SElect ANd CLass Main, SElect ANd CHain A
- to select all non-hydrogen protein atoms, use: SElect EXhydro, SElect ANd TYpe PROT
- to select all non-hydrogen protein and nucleic acid atoms, use: SElect EXhydro, SElect ANd TYpe PROT, SElect OR TYpe NUCL
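
If you think of a selection as a set of atom indices, the first example above is plain set algebra. The Python sketch below mirrors its three commands. It is only a mental model with hypothetical names and a made-up four-atom "structure"; it says nothing about how the program stores selections internally.

```python
def select_non_h_except(atoms, segid):
    """Non-hydrogen atoms of all segments except one, in three steps."""
    everything = set(range(len(atoms)))
    # SElect HYdrogen: reset the selection to the hydrogen atoms
    sel = {i for i in everything if atoms[i]["element"] == "H"}
    # SElect OR SEgid <segid>: widen with every atom of that segment
    sel |= {i for i in everything if atoms[i]["segid"] == segid}
    # SElect NEgate: invert the selection
    return everything - sel


atoms = [
    {"element": "C", "segid": "AAAA"},
    {"element": "H", "segid": "AAAA"},
    {"element": "O", "segid": "QQQQ"},
    {"element": "H", "segid": "QQQQ"},
]
print(sorted(select_non_h_except(atoms, "QQQQ")))
```

In the same picture, ANd is a set intersection and BUtnot is a set difference.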

The program keeps a record of the SElection commands executed since the last time a resetting command was carried out. For instance, after selecting all non-hydrogen main-chain protein atoms, you will see:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se ?
 Selection history : (NON-HYDROGEN | AND TYpe = PROT | AND CLass = MAI |)
 Nr of selected atoms : (        548)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


15.1 SElect ? - list number and type of currently selected atoms

The number of selected atoms is shown with a history of the recently executed SElect commands.


15.2 SElect ALl - select all atoms

This selects all atoms, including any hydrogen atoms.


15.3 SElect NOne - select none of the atoms

This de-selects all atoms.


15.4 SElect NEgate - invert current selection

All atoms which had previously been selected will be de-selected, and the other way around.


15.5 SElect HYdrogen - select all hydrogen atoms

No non-hydrogen atoms will be selected.


15.6 SElect EXhydrogen - select all non-hydrogen atoms

No hydrogen atoms will be selected.


15.7 SElect BY_residue - select entire residue if at least one atom selected

If at least one atom of a residue has been selected, all atoms of the residue will be selected.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se and res rea
 ...
 Nr of selected atoms : (         22)
 MOLEMAN2 > se dist 0 3.5
 ...
 Nr of selected atoms : (         29)
 MOLEMAN2 > se by
 Select by residue
 Selection history : (ALL | AND REsidu = REA | DIstance  0.00  3.50 |
  BY_residue |)
 Nr of selected atoms : (         55)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


15.8 SElect DIst - select atoms a certain distance from the current selection

This can be used to select, e.g. a ligand, and subsequently all atoms within 3.5 or 4 Å from this compound.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se and res rea
 ...
 Nr of selected atoms : (         22)
 MOLEMAN2 > se dist 0 3.5
 ...
 Nr of selected atoms : (         29)
 MOLEMAN2 > se by
 Select by residue
 Selection history : (ALL | AND REsidu = REA | DIstance  0.00  3.50 |
  BY_residue |)
 Nr of selected atoms : (         55)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
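
A distance-based selection can be sketched as follows in Python. The exact semantics of the command are an assumption here: the sketch selects every atom that lies between the lower and upper distance limit from at least one currently selected atom. With a lower limit of 0 the original atoms stay selected (each is at distance 0 from itself), which is consistent with the selection growing from 22 to 29 atoms in the example above. The function name is hypothetical.

```python
import math


def select_by_distance(coords, selected, lo, hi):
    """Indices of atoms within [lo, hi] of any currently selected atom."""
    return {
        i for i, p in enumerate(coords)
        if any(lo <= math.dist(p, coords[j]) <= hi for j in selected)
    }


coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(sorted(select_by_distance(coords, {0}, 0.0, 3.5)))
```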
   


15.9 SElect ANd - make a selection narrower

Syntax: SElect ANd what which
what = CHain | SEgid | TYpe | CLass | REsidue | ATom
which = which chain, segid, etc. to use
  type can be prot, nucl, wate, etc.
  class can be main or side chain
  residue can be Ala, HOH, BGL etc.
  atom can be " CA ", " O1 " etc.

This can be used to select by chain name (one character), segment name (4 characters), type (any of PROT, NUCL, WATE, META, INOR, CARB, ORGA or HETE; 4 characters), or class (main or side chain, one character). Alternatively, you can select by residue type (Ala, Asn, etc.) or by atom name (enclose in "double quotes"). Note that atom names are 4 characters (e.g., use " CA " to select C-alpha atoms).

The new selection will be those atoms that were already selected AND satisfy the new criterion. This usually reduces the selection.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se ex
 Select NON-HYDROGEN atoms
 Selection history : (NON-HYDROGEN |)
 Nr of selected atoms : (       1213)
 MOLEMAN2 > se an ty prot
 AND atom selection
 With atoms for which : (TY)
 Equals : (PROT)
 Selection history : (NON-HYDROGEN | AND TYpe = PROT |)
 Nr of selected atoms : (       1091)
 MOLEMAN2 > se and cl main
 AND atom selection
 With atoms for which : (CL)
 Equals : (MAIN)
 Selection history : (NON-HYDROGEN | AND TYpe = PROT | AND CLass = MAIN |)
 Nr of selected atoms : (        548)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se ex
 Select NON-HYDROGEN atoms
 Selection history : (NON-HYDROGEN |)
 Nr of selected atoms : (       5794)
 MOLEMAN2 > se an resi trp
 AND atom selection
 With atoms for which : (RESI)
 Equals : (TRP)
 Selection history : (NON-HYDROGEN | AND REsidu = TRP |)
 Nr of selected atoms : (        280)
 MOLEMAN2 > se and at " CA "
 AND atom selection
 With atoms for which : (AT)
 Equals : ( CA)
 Selection history : (NON-HYDROGEN | AND REsidu = TRP | AND ATom = CA |)
 Nr of selected atoms : (         20)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


15.10 SElect BUtnot - make a selection narrower

Syntax: SElect BUtnot what which
what = CHain | SEgid | TYpe | CLass | REsidue | ATom
which = which chain, segid, etc. to use
  type can be prot, nucl, wate, etc.
  class can be main or side chain
  residue can be Ala, HOH, BGL etc.
  atom can be " CA ", " O1 " etc.

Analogous to SElect ANd and SElect OR.

The new selection will be those atoms that were already selected AND do NOT satisfy the new criterion. This usually narrows the selection.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel but type prot
 BUtnot atom selection
 With atoms for which : (TYPE)
 Equals : (PROT)
 Selection history : (ALL | BUTNOT TYpe = PROT |)
 Nr of selected atoms : (        906)
  ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


15.11 SElect OR - make a selection wider

Syntax: SElect OR what which
what = CHain | SEgid | TYpe | CLass | REsidue | ATom
which = which chain, segid, etc. to use
  type can be prot, nucl, wate, etc.
  class can be main or side chain
  residue can be Ala, HOH, BGL etc.
  atom can be " CA ", " O1 " etc.

This can be used to select by chain name (one character), segment name (4 characters), type (any of PROT, NUCL, WATE, META, INOR, CARB, ORGA or HETE; 4 characters), or class (main or side chain, one character). Note that atom names are 4 characters (e.g., use " CA " to select C-alpha atoms).

The new selection will be those atoms that were already selected OR satisfy the new criterion. This usually increases the selection.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se ex
 Select NON-HYDROGEN atoms
 Selection history : (NON-HYDROGEN |)
 Nr of selected atoms : (       1213)
 MOLEMAN2 > se and type prot
 AND atom selection
 With atoms for which : (TYPE)
 Equals : (PROT)
 Selection history : (NON-HYDROGEN | AND TYpe = PROT |)
 Nr of selected atoms : (       1091)
 MOLEMAN2 > se or type water
 OR atom selection
 With atoms for which : (TYPE)
 Equals : (WATER)
 Selection history : (NON-HYDROGEN | AND TYpe = PROT | OR TYpe = WATER |)
 Nr of selected atoms : (       1191)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


15.12 SElect NUmeric - select using numerical properties

Syntax: SElect NUmeric and_or what lo hi
and_or = And | Or | Butnot
what = Residue_nr | B-factor | Occupancy | X-coord | Y-coord | Z-coord | Mass | Element | Covalent_bond_radius
lo = minimum value to select
hi = maximum value to select

Make a selection wider or narrower using numerical properties. This can be used to select a zone of residues (e.g., SEl NUm And Res 23 38), all atoms with high B-factors (SEl NUm And Bfac 50 9999), atoms with less than full occupancy (SEl NUm And Occ 0.0 0.9999), etc.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel num and res 50 50
 Select Numeric : ( AND Residue_nr 50 50)
 Selection history : (ALL | AND Residue_nr 50 50 |)
 Nr of selected atoms : (         11)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel all
 Select ALL atoms
 Selection history : (ALL |)
 Nr of selected atoms : (       7038)
 MOLEMAN2 > sel num and mass 40 999
 Select Numeric : ( AND Mass 40.00000 999.0000)
 Selection history : (ALL | AND Mass 40.00000 999.0000 |)
 Nr of selected atoms : (          3)
 MOLEMAN2 > lis
 List first selected atom of every residue
 ATOM  3244  I   IBZ A 436      19.199  55.708  63.367  1.00 38.03      1CEL3617
 ATOM  6763  I   IBZ B 436     -19.688  12.428   7.328  1.00 32.65      1CEL7136
 ATOM  7039 CA    CA   440      40.443  57.977  83.935  1.00 16.86      1CEL7412
 Nr of residues listed : (          3)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel all
 ...
 MOLEMAN2 > se num and bfac 50 9999
 Select Numeric : ( AND B-factor 50.00000 9999.000)
 Selection history : (ALL | AND B-factor 50.00000 9999.000 |)
 Nr of selected atoms : (          8)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
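
The effect of the three and_or modes can be illustrated with a small Python sketch. The function name and the dictionary keys are hypothetical. The limits are treated as inclusive, which matches the "res 50 50" example above (it selects the atoms of residue 50).

```python
def select_numeric(atoms, selected, mode, prop, lo, hi):
    """Combine the current selection with atoms whose property is in [lo, hi]."""
    hits = {i for i, atom in enumerate(atoms) if lo <= atom[prop] <= hi}
    if mode == "and":
        return selected & hits   # narrower
    if mode == "or":
        return selected | hits   # wider
    if mode == "butnot":
        return selected - hits   # narrower
    raise ValueError("mode must be 'and', 'or' or 'butnot'")


atoms = [
    {"resnr": 49, "bfac": 12.0},
    {"resnr": 50, "bfac": 55.0},
    {"resnr": 50, "bfac": 20.0},
]
everything = set(range(len(atoms)))
print(sorted(select_numeric(atoms, everything, "and", "resnr", 50, 50)))
```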
   


16 B-FACTOR AND OCCUPANCY COMMANDS


16.1 BFactor STats - statistics concerning B-factors of selected atoms

Syntax: BFactor STats [how]
how = Chain | Type

Lists statistics for the B-factors of all selected atoms, either split up by chain or by type. For example, to get B-factor statistics for all non-hydrogen atoms by chain, use:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel ex
 Select NON-HYDROGEN atoms
 Selection history : (NON-HYDROGEN |)
 Nr of selected atoms : (       7038)
 MOLEMAN2 > bf st c
 Chain name      Atoms       Bave       Bsdv       Bmin       Bmax
   A<->1CEL       3518      18.70       8.58       5.31      66.29
   B<->1CEL       3518      17.23       8.68       5.91      67.75
    <->1CEL          2      24.27       7.41      16.86      31.68
 Nr of chains encountered  : (          3)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

The same statistics, but split up by type:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > bf st ty
       Type      Atoms       Bave       Bsdv       Bmin       Bmax
       PROT       6440      17.14       8.06       5.31      67.75
       WATE        529      27.65       9.99       7.05      55.30
       CARB         50      20.91       5.42      12.52      34.04
       HETE         19      21.67       6.61       8.66      38.03
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

To get information for only the main-chain protein atoms of chain A, use:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > sel ex
 MOLEMAN2 > se and ty prot
 MOLEMAN2 > se and ch a
 MOLEMAN2 > sel an cl mai
 AND atom selection
 With atoms for which : (CL)
 Equals : (MAI)
 Selection history : (NON-HYDROGEN | AND TYpe = PROT | AND CHain = A | AND
  CLass = MAI |)
 Nr of selected atoms : (       1736)
 MOLEMAN2 > bf st ty
       Type      Atoms       Bave       Bsdv       Bmin       Bmax
       PROT       1736      17.02       6.61       6.97      53.21
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Don't forget to *explicitly* exclude any hydrogen atoms you may have (e.g., with "sel ex", as in the examples above)!
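
As an illustration of what the by-chain listing above reports, here is a minimal Python sketch (not MOLEMAN2 code). The function name "bfactor_stats" and the input format, a list of (chain, B-factor) pairs, are assumptions made for this example. It uses the population standard deviation; that choice is consistent with the two-atom unnamed chain in the listing (B-factors 16.86 and 31.68 give Bave 24.27 and Bsdv 7.41), but the formula used by the program is not stated in this manual.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def bfactor_stats(atoms):
    """Group (chain, B) pairs by chain and return per-chain statistics.

    Hypothetical helper for illustration only; not part of MOLEMAN2.
    """
    groups = defaultdict(list)
    for chain, b in atoms:
        groups[chain].append(b)
    return {
        chain: {
            "atoms": len(bs),
            "bave": fmean(bs),    # average B-factor
            "bsdv": pstdev(bs),   # population standard deviation (assumed)
            "bmin": min(bs),
            "bmax": max(bs),
        }
        for chain, bs in groups.items()
    }


# The two atoms of the unnamed chain in the listing above:
# with only two atoms, Bmin and Bmax are the two B-factors themselves.
stats = bfactor_stats([(" ", 16.86), (" ", 31.68)])
row = stats[" "]
print(f"{row['atoms']:d} {row['bave']:.2f} {row['bsdv']:.2f} "
      f"{row['bmin']:.2f} {row['bmax']:.2f}")
```

Grouping by type instead of by chain would only change the key used in the (chain, B) pairs.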


16.2 OCcupancy STats - statistics concerning occupancies of selected atoms

Syntax: OCcupancy STats [how]
how = Chain | Type

Lists statistics for the occupancies of all selected atoms, either split up by chain or by type. See the BFactor STats command for examples.

Don't forget to *explicitly* exclude any hydrogen atoms you may have!


16.3 BFactor NO_anisou - delete all ANISOUs (if any)

Removes all ANISOU records, if any are present; otherwise nothing happens.


16.4 BFactor LImit - set or limit B-factors of selected atoms

Syntax: BFactor LImit lo hi

For the selected atoms, reset every B-factor less than "lo" to "lo", and every B-factor greater than "hi" to "hi". This can be used to truncate very low and/or very high B-factors, or to set all temperature factors to one and the same value.
To reset only very low B-factors: use a very high number for "hi" (e.g., 1000), so that no B-factor exceeds it.
To reset only very high B-factors: use a negative number for "lo" (e.g., -1), so that no B-factor falls below it.
To set all B-factors to a single value: make "lo" equal to "hi".

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > se ex
 Select NON-HYDROGEN atoms
 Selection history : (NON-HYDROGEN |)
 Nr of selected atoms : (       7038)
 MOLEMAN2 > bf st c
 Chain name      Atoms       Bave       Bsdv       Bmin       Bmax
   A<->1CEL       3518      18.70       8.58       5.31      66.29
   B<->1CEL       3518      17.23       8.68       5.91      67.75
    <->1CEL          2      24.27       7.41      16.86      31.68
 Nr of chains encountered  : (          3)
 MOLEMAN2 > bf li 10 50
 Reset B-factors to lie in range    10.00 to    50.00
 MOLEMAN2 > bf st c
 Chain name      Atoms       Bave       Bsdv       Bmin       Bmax
   A<->1CEL       3518      18.72       8.39      10.00      50.00
   B<->1CEL       3518      17.32       8.44      10.00      50.00
    <->1CEL          2      24.27       7.41      16.86      31.68
 Nr of chains encountered  : (          3)
 MOLEMAN2 > bf li 18 18
 Reset B-factors to lie in range    18.00 to    18.00
 MOLEMAN2 > bf st c
 Chain name      Atoms       Bave       Bsdv       Bmin       Bmax
   A<->1CEL       3518      18.00       0.00      18.00      18.00
   B<->1CEL       3518      18.00       0.00      18.00      18.00
    <->1CEL          2      18.00       0.00      18.00      18.00
 Nr of chains encountered  : (          3)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
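
The limiting operation described above amounts to clamping each value to the range lo to hi. A minimal Python sketch of that rule follows (not MOLEMAN2 code). The function name "limit_b" is hypothetical, and the input values are illustrative: 5.31 and 66.29 are the Bmin and Bmax of chain A in the listing, while 18.70 merely stands in for an ordinary atom.

```python
def limit_b(values, lo, hi):
    """Clamp every value to the range [lo, hi].

    Hypothetical helper for illustration only; not part of MOLEMAN2.
    """
    return [min(max(b, lo), hi) for b in values]


b_factors = [5.31, 18.70, 66.29]

# Corresponds to "bf li 10 50": both limits take effect.
print(limit_b(b_factors, 10, 50))

# Very high "hi": only the low values are reset.
print(limit_b(b_factors, 10, 1000))

# Negative "lo": only the high values are reset.
print(limit_b(b_factors, -1, 50))

# "lo" equal to "hi" (as in "bf li 18 18"): every value becomes 18.
print(limit_b(b_factors, 18, 18))
```

The same rule applies to occupancies, with "lo" and "hi" on the occupancy scale.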
   


16.5 OCcupancy LImit - set or limit occupancies of selected atoms

Syntax: OCcupancy LImit lo hi

For the selected atoms, reset every occupancy less than "lo" to "lo", and every occupancy greater than "hi" to "hi". This can be used to truncate very low and/or very high occupancies, or to set all occupancies to one and the same value.
See the BFactor LImit command for further discussion and examples.


16.6 BFactor PRod_plus - multiply/add B-factors of selected atoms

Syntax: BFactor PRod_plus [prod] [plus]
prod = multiply selected Bs by this number (default 1.0)
plus = add this number to the selected Bs (default 0.0)

Each selected B-factor is replaced by prod * B + plus. This can be used to change the scale of temperature factors (e.g., prior to molecular replacement with a room-temperature search model using new cryo data).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > stat
 ...
       Item        Average         St.Dev            Min            Max
       ----        -------         ------            ---            ---
 ...
   B-factor         16.622          9.305          3.920         54.210
 ...
 MOLEMAN2 > bf pr 0.6 2
 New B =     0.6000 * Old B +     2.0000
 Nr of atoms updated : (       1213)
 MOLEMAN2 > st
 ...
   B-factor         11.973          5.583          4.352         34.526
 ...
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
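
The numbers in the example above can be checked by hand: a linear transformation New B = prod * Old B + plus maps the average in the same way, and it maps the minimum and maximum in the same way as long as prod is positive. The standard deviation is only scaled by prod, since the added constant does not change the spread. A minimal Python sketch of this arithmetic follows (not MOLEMAN2 code; the function name "prod_plus" is hypothetical).

```python
def prod_plus(values, prod=1.0, plus=0.0):
    """Apply new = prod * old + plus to every value.

    Hypothetical helper for illustration only; not part of MOLEMAN2.
    """
    return [prod * v + plus for v in values]


# Statistics reported before "bf pr 0.6 2" in the example above.
old_ave, old_sdv, old_min, old_max = 16.622, 9.305, 3.920, 54.210

# Average, minimum and maximum follow the linear formula (prod > 0 here).
new_ave, new_min, new_max = prod_plus([old_ave, old_min, old_max], 0.6, 2.0)

# The standard deviation is scaled by |prod|; "plus" has no effect on it.
new_sdv = abs(0.6) * old_sdv

print(f"{new_ave:.3f} {new_sdv:.3f} {new_min:.3f} {new_max:.3f}")
# prints "11.973 5.583 4.352 34.526", matching the second listing
```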
   


16.7 OCcupancy PRod_plus - multiply/add occupancies of selected atoms

Syntax: OCcupancy PRod_plus [prod] [plus]
prod = multiply selected occupancies by this number (default 1.0)
plus = add this number to the selected occupancies (default 0.0)

Each selected occupancy Q is replaced by prod * Q + plus. For example, to halve the occupancies of the selected atoms:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > oc pr 0.5
 New Q =     0.5000 * Old Q +     0.0000
 Nr of atoms updated : (       1213)
 ----- EXAMPLE ----