- version 0.1 @ 951113 - GJ Kleywegt
- version 0.2 @ 951115 - GJ Kleywegt
- version 0.3 @ 951116 - GJ Kleywegt
- version 0.4 @ 951117 - GJ Kleywegt
- version 0.5 @ 951118 - GJ Kleywegt
- version 0.6 @ 951123 - GJ Kleywegt
- version 0.7 @ 951203 - GJ Kleywegt
- version 0.8 @ 960118 - GJ Kleywegt
This tutorial and the associated files are (c) G.J. Kleywegt (Uppsala), 1996. Permission is granted to reproduce these materials in reasonable quantities for personal or educational use.
If you use the tutorial and like it, send a postcard from your hometown with a nice stamp to the author at: Department of Molecular Biology, Biomedical Centre, University of Uppsala, Box 590, S-751 24 Uppsala, SWEDEN.
This is a tutorial to introduce protein-model rebuilding and quality
control using O, and refinement using X-PLOR. It requires the following:
- basic familiarity with O (e.g., through the "O for Morons" tutorial)
- O version 5.10.x and an O manual
- a few files provided with this tutorial
- access to the O public domain directory (generically called OMAC; check
the correct name on your local system; if not present, get it from the
O ftp server, file "pub/gerard/extras/omac/omac.tar.gz")
- the list of Frequently-Asked Questions (file "OMAC/software.faq"; also
available through the WorldWide Web)
- some Uppsala utility programs (also available from the O ftp server;
you will also need the manuals for these programs)
- X-PLOR (version 3.1 or 4) and the X-PLOR manual
- CCP4 programs (not necessary; they are used for map calculations,
but these can also be carried out with X-PLOR)
The tutorial can be used in different ways:
- as a quick introduction to quality control and rebuilding:
chapters 2, 3, 4 and 5 (optionally, 8 and 9);
- as a quick introduction to refinement (only for absolute X-PLOR
beginners): chapters 6 and 7;
- as a complete course in rebuilding and refinement from first
to final model: all chapters (including several iterations
of chapter 9).
Chapters 5, 7 and 9 will generally be the most time-consuming (depending
on the level of experience the student has with O and rebuilding and
refinement in general). Chapters 2 to 8 contain a few questions in
sections called "The Swedish Inquisition". Generally, these will
aid or extend the understanding of the subject matter covered in that
chapter.
In addition, ask your system manager to install the "run", "ono" and "oplot" scripts (from the server, directory pub/gerard/extras/scripts); these will make your O-life a tad simpler.
The tutorial does not include ab initio model-building in an MIR map. For this purpose, there are some macros on the O ftp server (directory pub/p2_course).
Also, the structure used as an example has no NCS, so averaging is not covered either. There is, however, a separate RAVE tutorial available from the O ftp server (file "pub/gerard/rave/exam.tar.gz").
The structure you will be working with is that of cellular retinoic-acid
binding protein type II (CRABP II) in complex with all-trans-retinoic
acid. This structure was solved at 1.8 Å, but we will only use data
to 2.8 Å initially to show the effects that limited resolution may have.
Our starting model is derived from a related protein, CRABP I, which was
solved at 2.9 Å resolution. This structure was changed as follows:
- the N- and C-terminal residues were removed
- the region near the insertion in CRABP II was removed (114-117)
- a loop with poor density and high temperature factors was
removed (100-106)
- residues which differ between CRABP I and II were cut back to alanines
(unless they are glycines in CRABP I)
The temperature factors of this model were reset, and it was subjected to mild Simulated Annealing refinement etc. I then "rebuilt" it to introduce some deliberate errors and did some more energy minimisation and individual temperature-factor refinement. After this, a 2Fo-Fc and an Fo-Fc map were calculated.
This structure has the advantage that it is fairly small and yet has most of the common errors and problems (main chain, side chain, poor loop, insertion site, low resolution) associated with protein model refinement and rebuilding.
Before you start, copy your local version of the gmrp directory tree to your own area. This contains all the files you need. You will start working in the O directory, gmrp/o.
For teaching exercises: to answer the questions, it may be useful to have a copy of the O manual and tutorial at hand. The answers to most questions can be found in the literature references, program manuals, or by inspection (of the model, the density or some file).
It may be handy to use a web-browser to consult the O manual and FAQ (http://imsb.au.dk/~mok/o/). Manuals for the utility programs are available in HTML format as well (in Uppsala, via our homepage; elsewhere from the O ftp server, file "pub/gerard/extras/html_manuals/html_manuals.dirtar.gz"). Documentation for X-PLOR and CCP4 programs is also available on the WorldWide Web.
While you work through the tutorial, you may also want to use the Good Model-building and Refinement forms for keeping notes (available as file "OMAC/gsp_forms.ps").
Rebuilding is a lot more fun if you play loud music on your personal stereo !
Comments and suggestions about this tutorial can be E-mailed to gerard@xray.bmc.uu.se.
The following references are essential for working through the rebuilding parts of this tutorial.
(1) T.A. Jones & M. Kjeldgaard, "O - The Manual", version 5.10.3, Uppsala, 1995.
(2) T.A. Jones, J.Y. Zou, S.W. Cowan & M. Kjeldgaard, "Improved methods for building protein models in electron density maps and the location of errors in these models", Acta Cryst. A47, 110-119 (1991).
(3) G.J. Kleywegt & T.A. Jones, "Good model-building and refinement practice", to be published (Methods in Enzymology, 1996).
(4) G.J. Kleywegt, "O for Morons", Uppsala (1994).
(5) G.J. Kleywegt, T. Bergfors, H. Senn, P. Le Motte, B. Gsell, K. Shudo & T.A. Jones, "Crystal structures of cellular retinoic acid binding proteins I and II in complex with all-trans-retinoic acid and a synthetic retinoid", Structure 2, 1241-1258 (1994).
(1) G.J. Kleywegt & T.A. Jones, "OOPS-a-daisy", ESF/CCP4 Newsletter 30, June 1994, pp. 20-24.
(2) G.J. Kleywegt, "Dictionaries for Heteros", ESF/CCP4 Newsletter 31 (32?), June 1995, pp. 45-50.
(3) G.J. Kleywegt & T.A. Jones, "Efficient rebuilding of protein structures", Acta Cryst. D, to be published (1996).
(4) G.J. Kleywegt & T.A. Jones, "xdlMAPMAN and xdlDATAMAN - programs for reformatting, analysis and manipulation of biomacromolecular electron-density maps and reflection datasets", Acta Cryst. D, accepted for publication (1996).
(5) T.A. Jones & M. Kjeldgaard, "???", to be published (Methods in Enzymology, 1996).
(6) G.J. Kleywegt, "Use of non-crystallographic symmetry in protein structure refinement", Acta Cryst. D, accepted for publication (1996).
(7) A.T. Brünger, "X-PLOR - A System for Crystallography and NMR", Yale University, New Haven (1992). [And references therein.]
(8) T.A. Jones & S. Thirup, "Using known substructures in protein model building and crystallography", EMBO J. 5, 819-822 (1986).
(9) C.I. Brändén & T.A. Jones, "Between objectivity and subjectivity", Nature 343, 687-689 (1990).
(10) T.A. Jones & M. Kjeldgaard, "Making the first trace with O", in "From first map to final model" (S. Bailey, R. Hubbard & D. Waller, Eds.), SERC Daresbury Laboratory, pp. 1-13 (1994).
(11) G.J. Kleywegt & T.A. Jones, "Where freedom is given, liberties are taken", Structure 3, 535-540 (1995).
(12) G.J. Kleywegt & T.A. Jones, "Braille for pugilists", in "Making the most of your model" (W.N. Hunter, J.M. Thornton & S. Bailey, Eds.), SERC Daresbury Laboratory, pp. 11-24 (1995).
(12) E.J. Dodson, G.J. Kleywegt & K.S. Wilson, "Report of a workshop on the use of statistical validators in protein X-ray crystallography", Acta Cryst. D52, 228-234 (1996).
(13) A.T. Brünger, "Free R value: a novel statistical quantity for assessing the accuracy of crystal structures", Nature 355, 472-475 (1992).
(14) A.T. Brünger & L.M. Rice, "Crystallographic refinement by simulated annealing: methods and applications", to be published (Methods in Enzymology, 1996).
(15) A.T. Brünger, "The free R value: a more objective statistic for crystallography", to be published (Methods in Enzymology, 1996).
(16) Collaborative Computational Project, Number 4, "The CCP4 suite: programs for protein crystallography", Acta Cryst. D50, 760-763 (1994).
(17) A. Hodel, S.H. Kim & A.T. Brünger, "Model bias in macromolecular crystal structures", Acta Cryst. A48, 851-858 (1992).
(18) M.W. MacArthur, R.A. Laskowski & J.M. Thornton, "Knowledge-based validation of protein structures derived by X-ray crystallography and NMR spectroscopy", Curr. Opin. Struct. Biol. 4, 731-737 (1994).
(19) R.J. Read, "Model bias and phase combination", in "From first map to final model" (S. Bailey, R. Hubbard & D. Waller, Eds.), SERC Daresbury Laboratory, pp. 31-40 (1994).
(20) J.Y. Zou & S.L. Mowbray, "An evaluation of the use of databases in protein structure refinement", Acta Cryst. D50, 237-249 (1994).
In the gmrp/o directory you will find five files:
- "m1.pdb" = your starting model
- "m1_2fofc.map" = the 2Fo-Fc map in O format
- "m1_fofc.map" = the Fo-Fc map in O format
- "maps" = an O macro to draw these maps around the current screen centre
- "symmy" = an O macro that creates symmetry objects around the current
screen centre
There is also a directory called gmrp/o/gerard, but we will ignore this for the time being.
Let's start by creating a new O database:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
% 143 gerard rigel 20:40:18 gmrp/o > ono
... Run 4d_ono
... Linked /home/gerard/bindkey.macro to this directory
... Executed bindkey.macro for you
... Link to odat directory not found
... Making a soft link to the odat directory for you
... Link to omac directory not found
... Making a soft link to the omac directory for you
... Executing /nfs/taj/alwyn/o/bin/4d_ono
... For gerard on rigel at Mon Nov 13 20:49:41 MET 1995
O > Use of this program implies acceptance of conditions
O > described in Appendix 1 of the O manual
O >
O > O version 5.10.3, Apr 1995
O > Define an O file (terminate with blank):
O > Menu names are not defined.
O > Enter file name [/nfs/taj/alwyn/o/data/menu.o]:
O > menu.o file for O version 5.10
O > Last modified 20-Jul-94
O > Startup file was never loaded
O > Enter file name [/nfs/taj/alwyn/o/data/startup.o]:
O > startup.o file for O version 5.10
O > Last modified 28-Sep-94
O > Startup file was never loaded
O > Enter file name [/nfs/taj/alwyn/o/data/access.o]:
Chasis id= 1762011761
O > File_display_connectivity is not defined.
O > Enter file name [/nfs/taj/alwyn/o/data/all.dat]:
O > Maximum inter-residue link distance = 2.00
O > There were 23 residues.
O > 175 atoms.
O > Do you want to use the display? [Yes]:
O > Graphics board GL4DXG-5.2
O > Making visibility data structures.
O > Making visibility data structures.
O >
O > Trackball on (F7KEY)
ioctl: Invalid argument
setraw: Invalid argument
save
As1> File_O_save is not defined.
As1> Enter file name [binary.o]: gmrp.o
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Execute the macro "OMAC/newo.omac". This will set up a number of things (including the O menu) which are handy for rebuilding:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > @omac/newo.omac
O > Macro in computer file-system.
Heap> @all_on_off, @all_on, @all_off
...
O > As4> Save your O database now
O > As4> ...
O > O > O > O > save
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Read in the starting model and draw it. Also set up the symmetry.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > s_a_i m1.pdb m1
Sam> File type is PDB
Sam> Database compressed.
Sam> Space for 142650 atoms
Sam> Space for 10000 residues
Sam> Molecule M1 contained 123 residues and 907 atoms
Sam> Centre of gravity updated for 1 123
O > mol m1 zo ; end
O > Current molecule has not been loaded.
O > ce_zo m1 a10 c130
As4> M1 A10 C130 M1
As4> Centering on zone from A10 to C130
O > symm_set
Sym> Molecule name? [M1]:
Sym> Define cell constants [ 45.65 47.56 77.61 90.00 90.00 90.00]:
Sym> Name of spacegroup? [P212121]:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(1) Change the "maps" macro such that it contours the maps within
15 Å from the screen centre.
(2) Change the "symmy" macro such that it uses the same radius as
that used in the "maps" macro.
(3) What does the odat directory contain ? Where are the symmetry
operator files for the most common spacegroups stored for O ?
(4) What does the script "OMAC/ofaq" do ? Use it (or your web-browser)
to find out what you can about the Pep_flip command.
Before we start rebuilding, let's see what the good, bad and ugly bits of the starting model are. To this end, use the Pep_flip, RSC_fit and RS_fit commands in O.
The Pep_flip command calculates how (un)usual the orientation of the peptide oxygen atoms is compared to the database. Values greater than 2-2.5 Å need to be checked critically.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > pep_flip m1 a2 c135
Util> M1 A2 C135 M1
Util> Calculating zone A2 to C135 in molecule M1 , object M1
Util> The DB is now being loaded.
Util> Loading data for protein:HCAC
...
Util> Loading data for protein:TLN_3
Util> 15 fragments used for residue A4 pep_flip value= 2.47
Util> 20 fragments used for residue A5 pep_flip value= 3.28
Util> 20 fragments used for residue A6 pep_flip value= 0.46
...
Util> 20 fragments used for residue C133 pep_flip value= 0.63
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
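If you capture the Pep_flip output in a text file, a small script can list the residues that exceed a chosen cutoff. The following is only a sketch in Python (not part of O or OOPS); it assumes the output lines look exactly like those in the example above, and the function names are made up for illustration.

```python
import re

# Matches captured O output lines such as:
#   Util> 20 fragments used for residue A5 pep_flip value= 3.28
# (assumption: the log format is the one shown in the example above)
PEP_FLIP_LINE = re.compile(r"for residue\s+(\S+)\s+pep_flip value=\s*([0-9.]+)")


def parse_pep_flip(log_text):
    """Return {residue name: pep_flip value} from captured Pep_flip output."""
    return {m.group(1): float(m.group(2)) for m in PEP_FLIP_LINE.finditer(log_text)}


def flag_above(values, cutoff):
    """Return (residue, value) pairs above the cutoff, worst first."""
    flagged = [(name, v) for name, v in values.items() if v > cutoff]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# The four values visible in the example output above.
log = """\
Util> 15 fragments used for residue A4 pep_flip value= 2.47
Util> 20 fragments used for residue A5 pep_flip value= 3.28
Util> 20 fragments used for residue A6 pep_flip value= 0.46
Util> 20 fragments used for residue C133 pep_flip value= 0.63
"""
values = parse_pep_flip(log)
outliers = flag_above(values, 2.5)
print(outliers)
print(f"{100.0 * len(outliers) / len(values):.1f} % of the listed residues above cutoff")
```

With a cutoff of 2.5 only A5 is reported; lowering the cutoff to 2.0 also picks up A4.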
The RSC_fit command calculates the RMSD between your sidechain conformation and the rotamer that is most similar to it. Values greater than 1.5 Å (or even 1.0 Å for leucines and isoleucines) need to be checked.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > rsc_fit m1 a2 c135
Util> M1 A2 C135 M1
Util> The Rotamer_DB is now being loaded.
Util> Calculating zone A2 to C135 in molecule M1 , object M1
Util> Best rotamer for A2 is No. 2 with rms 0.594
Util> Best rotamer for A3 is No. 1 with rms 0.702
Util> All atoms in this residue are fixed
Util> SCGLY is missing.
Util> All atoms in this residue are fixed
Util> Best rotamer for A7 is No. 1 with rms 3.023
Util> Best rotamer for A8 is No. 2 with rms 2.282
Util> All atoms in this residue are fixed
...
Util> Best rotamer for C134 is No. 2 with rms 0.115
Util> Best rotamer for C135 is No. 1 with rms 0.294
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
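The two cutoffs mentioned above (1.5 Å in general, 1.0 Å for leucines and isoleucines) can be applied to captured RSC_fit output along the following lines. This is a hedged sketch in Python, assuming the log format of the example above; the residue-type dictionary is something you would have to supply yourself (for instance from the residue-type datablock), and all names are illustrative.

```python
import re

# Matches captured O output lines such as:
#   Util> Best rotamer for A7 is No. 1 with rms 3.023
RSC_LINE = re.compile(
    r"Best rotamer for\s+(\S+)\s+is No\.\s*(\d+)\s+with rms\s+([0-9.]+)")


def parse_rsc_fit(log_text):
    """Return {residue name: (rotamer number, rms)} from RSC_fit output."""
    return {m.group(1): (int(m.group(2)), float(m.group(3)))
            for m in RSC_LINE.finditer(log_text)}


def rsc_outliers(rsc, residue_types, default_cutoff=1.5, leu_ile_cutoff=1.0):
    """Return (residue, rms, cutoff used) for residues above their cutoff.

    Residues missing from residue_types get the default cutoff.
    """
    flagged = []
    for name, (_rotamer, rms) in rsc.items():
        is_leu_ile = residue_types.get(name) in ("LEU", "ILE")
        cutoff = leu_ile_cutoff if is_leu_ile else default_cutoff
        if rms > cutoff:
            flagged.append((name, rms, cutoff))
    return flagged


# Values visible in the example output above.
log = """\
Util> Best rotamer for A2 is No. 2 with rms 0.594
Util> Best rotamer for A3 is No. 1 with rms 0.702
Util> Best rotamer for A7 is No. 1 with rms 3.023
Util> Best rotamer for A8 is No. 2 with rms 2.282
Util> Best rotamer for C134 is No. 2 with rms 0.115
Util> Best rotamer for C135 is No. 1 with rms 0.294
"""
rsc = parse_rsc_fit(log)
# Only the two types shown in the OOPS listing are filled in here.
types = {"A2": "ASN", "A3": "PHE"}
print(rsc_outliers(rsc, types))
```

For the lines shown, this flags A7 and A8, both far above 1.5 Å.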
The RS_fit command calculates how well each residue of your model fits the (2Fo-Fc) map. In this case we use it to calculate the real-space R-factor.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > map_file m1_2fofc.map
O > rsr_map
RSR> File_rsr_map is not defined.
RSR> Enter file name [rsr.map]: m1_2fofc.map
RSR> Name of map file? [m1_2fofc.map]:
O > read odat/rsfit_all.o
O > rsr_setup
RSR> Automatic scaling? [Yes]:
RSR> autoscale option on
RSR> Contouring of refinement box? [No]:
RSR> Which metod, CONV or DIFF? [CONV]:
RSR> Maximize the convolution product
RSR> Real space R factor(RFAC) or Correlation coefficient(RSCC)? [RFAC]:
RSR> Attempt to subtract out neighbour atom density? [Yes]:
RSR> Densities will be subtracted
RSR> Define number of scans [5]:
RSR> Define shifts [ 0.30 0.20 0.10 0.10 0.05]:
RSR> Define overall B [ 20.00]:
RSR> Define wall [ 3.50]:
RSR> Define C and Ao [ 1.04 0.90]: 0.95 0.85
RSR> Define integration radius [ 3]:
RSR> Define scale to be applied to calculated density [ 40.76]:
O > rs_fit m1 a2 c135
Util> M1 A2 C135 M1
Util> Calculating zone A2 to C125 in molecule M1 , object M1
Util> 33 atoms in zone
Util> Plus value for this map is: 114
Util> 8 atoms for residue A2 R factor= 0.350
Util> 11 atoms for residue A3 R factor= 0.287
Util> 5 atoms for residue A4 R factor= 0.274
Util> 4 atoms for residue A5 R factor= 0.246
...
Util> 7 atoms for residue C134 R factor= 0.286
Util> 11 atoms for residue C135 R factor= 0.317
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
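To get a feeling for the two statistics offered by RSR_setup (RFAC and RSCC), the sketch below computes both for a handful of density values. It is a minimal Python illustration, not O's implementation: the R-factor follows the definition in Jones et al. (1991), sum|obs - calc| / sum|obs + calc|, the correlation coefficient is the ordinary linear (Pearson) one, and the numbers are invented. How O selects grid points and computes the calculated density is not reproduced here.

```python
import math


def real_space_r(rho_obs, rho_calc):
    """Real-space R-factor: sum|obs - calc| / sum|obs + calc| over grid points."""
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return numerator / denominator


def real_space_cc(rho_obs, rho_calc):
    """Linear correlation coefficient between observed and calculated density."""
    n = len(rho_obs)
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    return cov / math.sqrt(var_o * var_c)


# Invented density values: the calculated density has the right shape
# but is on the wrong scale (a factor of two too high).
obs = [1.0, 2.0, 3.0, 4.0]
calc = [2.0, 4.0, 6.0, 8.0]
print(f"R  = {real_space_r(obs, calc):.3f}")
print(f"CC = {real_space_cc(obs, calc):.3f}")
```

In this toy case the correlation coefficient is perfect while the R-factor is not, which illustrates why the scale applied to the calculated density matters when you use the R-factor.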
Also run Yasspa to figure out which residues are in helices, strands or other regions.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > yasspa m1 alpha 0.5
Util> Template size : 5 residues.
Util> There were 19
O > yasspa m1 beta 0.8
Util> Template size : 5 residues.
Util> There were 63
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Click on an atom on the screen, for instance on the CZ of the arginine
residue in the centre. Check that the information line at the top
of the screen now shows:
- molecule name, residue name and type, atom name;
- Cartesian coordinates and temperature factor of the atom;
- the RSC_fit value of the residue;
- the Pep_flip value of the residue;
- the real-space R-factor of the residue;
- the secondary structure type (ALPHA, BETA or "nothing").
Now write out some datablocks for use with OOPS, and stop the program:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > dir m1_resi*
Heap> M1_RESIDUE_NAME C W 123
Heap> M1_RESIDUE_TYPE C W 123
Heap> M1_RESIDUE_POINTERS I W 246
Heap> M1_RESIDUE_CG R W 492
Heap> M1_RESIDUE_PEPFLIP R W 123
Heap> M1_RESIDUE_RSC R W 123
Heap> M1_RESIDUE_RSFIT R W 123
Heap> M1_RESIDUE_2RY_STRUC C W 123
O > wr M1_RESIDUE_NAME resnam.o ;
O > wr M1_RESIDUE_TYPE restyp.o ;
O > wr M1_RESIDUE_PEPFLIP pepflip.o ;
O > wr M1_RESIDUE_RSC rsc.o ;
O > wr M1_RESIDUE_RSFIT rsrfac_all.o ;
O > stop
As1> Saved
As1> Graphics released.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(1) How are Pep_flip values calculated ? Why do the two N and C
terminal residues of a continuous stretch of residues have Pep_flip
values of zero ? Is this statistic related to the Ramachandran plot,
and if so: how ?
(2) How are RSC_fit values calculated ? What happens for glycine
and alanine residues ?
(3) What is the difference between the real-space R-factor and the
real-space correlation coefficient ? Why did we not use the
default values for C and Ao in RSR_setup ?
(4) Explain how Yasspa decides if a residue is in an alpha helix.
(5) If you want to create a file which contains one line per residue,
with the residue name, type, RSC_fit and Pep_flip value, how would
you go about it ?
(6) Which O datablock determines what information is shown at the
top of the screen when you click on an atom ?
(7) Name two ways to add a command to the O menu.
(8) How can you calculate RS_fit values for a subset of the atoms
in each residue, for example only for the main chain atoms ?
Now we will run OOPS. Before you start the program, create a subdirectory called oops (and read the OOPS manual if you're not familiar with the program):
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
% 144 gerard rigel 20:40:18 gmrp/o > mkdir oops
% 145 gerard rigel 20:40:18 gmrp/o > run oops
...
Print statistics and histograms ? (Y)
Auto-generate (some) O2D plot files ? (Y)
Molecule name in O ? (M1)
O data block with residue names ? (resnam.o)
...
O data block with residue types ? (restyp.o)
...
Nr of WATERs : ( 0)
Analyse pep-flip values ? (Y) y
O data block with pep-flip values ? (pepflip.o)
...
Number of values .................... 115
Average value ....................... 0.913
Standard deviation .................. 0.670
Minimum value observed .............. 0.117
Maximum value observed .............. 4.340
...
Nr >= 2.0000 and < 2.2500 : 2 ( 1.74 %; Cum 93.91 %)
Nr >= 2.2500 and < 2.5000 : 3 ( 2.61 %; Cum 96.52 %)
Nr >= 2.7500 and < 3.0000 : 1 ( 0.87 %; Cum 97.39 %)
Nr >= 3.0000 and < 3.2500 : 1 ( 0.87 %; Cum 98.26 %)
Nr >= 3.2500 and < 3.5000 : 1 ( 0.87 %; Cum 99.13 %)
Nr >= 4.2500 and < 4.5000 : 1 ( 0.87 %; Cum 100.00 %)
...
O2D plot file ? (m1_pepflip.plt)
Plot file written
Pep-flip cut-off ? ( 2.500) 2.0
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Since this is our first rebuilding round (which means we have to visit all residues anyway), we will use rather strict criteria.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Analyse RS-fit values (all atoms) ? (Y) n
Analyse RS-fit values (main chain) ? (N) n
Analyse RS-fit values (side chain) ? (N) n
Analyse RS R-factor (all atoms) ? (N) y
O data block with RS R-factors ? (rsrfac_all.o)
...
Number of values .................... 123
Average value ....................... 0.287
Standard deviation .................. 0.040
Minimum value observed .............. 0.200
Maximum value observed .............. 0.429
...
O2D plot file ? (m1_rsrfac_all.plt)
Plot file written
RS R-factor cut-off ? ( 0.329) 0.3
RS R-factor cut-off WATERs ? ( 0.329)
Analyse RSC values ? (Y)
...
Number of values .................... 85
Average value ....................... 0.980
Standard deviation .................. 0.758
Minimum value observed .............. 0.093
Maximum value observed .............. 3.023
...
O2D plot file ? (m1_rsc.plt)
Plot file written
RSC cut-off ? ( 1.500)
Analyse if mask is too tight ? (N) n
Analyse low temperature factors ? (N)
Analyse high temperature factors ? (Y)
PDB file ? (m1.pdb)
CRYST1 ( 45.650 47.560 77.610 90.00 90.00 90.00 P 21 21 21 4)
...
Max CA-CA distance for neighbours ? ( 4.500)
...
Threshold for high Bs ? ( 21.934) 20
Threshold for high Bs WATERs ? ( 27.620)
Checking high Bs ...
Analyse RMS delta-B bonded atoms ? (N)
Analyse low occupancies ? (N)
Analyse high occupancies ? (N)
Analyse phi-psi values ? (Y)
Checking allowed PHI-PSI areas
Nr of residues with defined PHI : ( 121)
Nr of residues with defined PSI : ( 121)
Analyse peptide planarity ? (Y)
...
O2D plot file ? (m1_pep_plan.plt)
Plot file written
Maximum absolute deviation ? ( 5.800) 3
Analyse C-alpha chirality ? (Y)
...
O2D plot file ? (m1_ca_chir.plt)
Plot file written
Maximum absolute deviation ? ( 3.500)
Compare with previous model ? (N)
Analyse QualWat values ? (N)
Analyse nr of bad contacts ? (N)
User-definable criteria
Max number of them : ( 10)
Enter file with user datablock (<CR> to stop): ( )
Nr of user criteria : ( 0)
You may opt to get the details listed on the screen.
Do you want to see the details ? (Y) n
...
Do you want to have a list file ? (Y)
Name of the list file ? (m1_rebuild.notes)
...
Create pseudo-PDB file ? (N)
...
O command(s) to execute in every macro ? (bell print DONE) @fast_dials
...
Do you want macros for ALL residues ? (N) y
...
Do you want CHAINED macros ? (Y)
...
Do you want macros named as RESIDUES ? (Y)
OOPS - (ASN A2)
OKAY - (PHE A3)
...
OKAY - (VAL C134)
OOPS - (ARG C135)
Nr of macros : ( 123)
Nr of baddies : ( 61)
Start by typing @oops.omac in O !!!
Bad pep-flip : ( 9)
...
Bad RS R-factor (all atoms) : ( 37)
Bad RSC : ( 23)
...
O2D plot file ? (m1_badcounts.plt)
Plot file written
Writing oops_remarks.pdb ...
...
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
OOPS has created:
- macros (in sub-directory gmrp/o/oops) to take you from one residue to
the next, and to tell you what is suspicious about them;
- plot files ("m1_*.plt"); these can be converted to PostScript with
the program O2D or the script "OMAC/o2dps";
- "m1_rebuild.notes", a very useful file which you can use as your
electronic notebook file;
- "oops_remarks.pdb", containing some statistics as PDB REMARK cards;
- "oops_badcounts.o", an O datablock containing the number of violated
criteria for each residue (can be read into O and used to colour
your molecule, for instance);
- "oops.omac", the first OOPS macro you have to execute inside O; this
is the only time you need to type an O command to execute an OOPS
macro; once that is done, subsequent commands will be added to (and
later removed from) the O menu automagically.
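The idea behind the per-residue count of violated criteria can be sketched as follows. This is an illustration in Python only, not OOPS's own code: the criterion names are invented, the cut-offs are the ones entered in the OOPS session of this tutorial, and whether OOPS compares with "greater than" or "greater than or equal" is an assumption.

```python
# Cut-offs as entered in the OOPS session shown in this tutorial.
# The dictionary keys are hypothetical labels, not OOPS keywords.
CUTOFFS = {
    "pep_flip": 2.0,
    "rs_rfactor": 0.3,
    "rsc": 1.5,
    "high_b": 20.0,
    "pep_planarity": 3.0,
    "ca_chirality": 3.5,
}


def violated_criteria(values, cutoffs=CUTOFFS):
    """Return the criteria for which a residue's value exceeds the cut-off.

    Criteria missing from `values` (for example Pep_flip for a terminal
    residue) are skipped. Absolute values are compared because the
    planarity and chirality cut-offs are maximum absolute deviations.
    """
    return [name for name, cutoff in cutoffs.items()
            if name in values and abs(values[name]) > cutoff]


# Values reported for residue A2 elsewhere in this tutorial.
a2 = {"rs_rfactor": 0.350, "rsc": 0.594, "high_b": 29.11, "pep_planarity": 3.86}
bad = violated_criteria(a2)
print(len(bad), bad)  # 3 ['rs_rfactor', 'high_b', 'pep_planarity']
```

The three criteria found here are the same three complaints that the OOPS macro for residue A2 prints when you execute it in O.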
(1) Make a CA object in O which is coloured according to the number
of violations found by OOPS.
(2) For a well-refined protein model, we expect ~1-2% of the residues
to have unusual Pep_flip values (2.5 Å cutoff), and ~5-10% to have
non-rotamer sidechain conformations (1.5 Å cutoff). What are these
numbers for your current model ? What does this tell you about the
quality of your starting model?
(3) Run ProCheck on the starting model. What conclusion(s) can you
draw from this ? Contrast this with what you know about the
quality of the model. Explain why the following phrase in a
paper about a low-resolution structure is meaningless: "According
to ProCheck, the final model has a better quality than other
structures solved at similar resolution."
In this section, we will take you to a few spots in the current model which need attention. One example of each type of problem will be dealt with here; it is up to you to apply this to the entire model. For instance, we shall only discuss one example of a residue with a completely wrong sidechain conformation, but there are many more such residues, which you have to detect and rebuild yourself.
While you rebuild, edit the file "m1_rebuild.notes" to keep track of the changes you make to the model, observations regarding as-yet unmodeled entities etc.
As you already know, the current model is fairly incomplete. Below is the correct sequence for human CRABPII:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
SEQRES 1 137 PRO ASN PHE SER GLY ASN TRP LYS ILE ILE ARG SER GLU 1CBS 202
SEQRES 2 137 ASN PHE GLU GLU LEU LEU LYS VAL LEU GLY VAL ASN VAL 1CBS 203
SEQRES 3 137 MET LEU ARG LYS ILE ALA VAL ALA ALA ALA SER LYS PRO 1CBS 204
SEQRES 4 137 ALA VAL GLU ILE LYS GLN GLU GLY ASP THR PHE TYR ILE 1CBS 205
SEQRES 5 137 LYS THR SER THR THR VAL ARG THR THR GLU ILE ASN PHE 1CBS 206
SEQRES 6 137 LYS VAL GLY GLU GLU PHE GLU GLU GLN THR VAL ASP GLY 1CBS 207
SEQRES 7 137 ARG PRO CYS LYS SER LEU VAL LYS TRP GLU SER GLU ASN 1CBS 208
SEQRES 8 137 LYS MET VAL CYS GLU GLN LYS LEU LEU LYS GLY GLU GLY 1CBS 209
SEQRES 9 137 PRO LYS THR SER TRP THR ARG GLU LEU THR ASN ASP GLY 1CBS 210
SEQRES 10 137 GLU LEU ILE LEU THR MET THR ALA ASP ASP VAL VAL CYS 1CBS 211
SEQRES 11 137 THR ARG VAL TYR VAL ARG GLU 1CBS 212
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Execute the OOPS macro for the first residue, draw the maps and generate symmetry objects.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > @oops/a2
O > Macro in computer file-system.
As4> No object defined.
As4> M1 A2 A2 M1
As4> Centering on zone from A2 to A2
O > As4> Residue ASN A2
O > As4> Bad RS R-factor (all atoms) = 0.350
O > As4> Too high temperature factor = 29.11
O > As4> Non-planar peptide; improper = 3.86
O > O > Macro in database.
O > O > As4> Hit or type "@oops/a3" for next residue
O > O > O > O > @maps
O > Macro in computer file-system.
As2> Symbol inserted.
O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > @symmy
O > Macro in computer file-system.
Sym> Molecule c.g. = 17.98 21.20 27.16
Sym> Radius = 26.22
Sym> Symmop 3, Shift 0 0 0
Sym> Centre of gravity updated for 1 123
O > Macro in database.
O > O > O > O >
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that there is no clear density for the missing N-terminal proline residue, but just to demonstrate how to insert a residue at the N-terminus we will go ahead anyway.
First save your current model (and all associated data) in an O-format file (containing all the datablocks of the model). If we screw up, all we have to do is to Read_form this file again.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > write m1_* m1_save.odb ;
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Now centre on the CA atom of residue A2.
Since the Mutate_insert command can only insert residues after existing
ones, we need to use a slight detour. We shall insert a residue
after residue A2, copy the coordinates of A2 to this new residue, and
then Mutate_replace A2 to proline and renumber the residues.
(1) Insert a residue of the same type as the current N-terminal residue after the current N-terminus:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mut_ins
Mut> Mutate a molecule by inserting residues.
Mut> Molecule ([M1 ]) :
Mut> After which residue: a2
Mut> New residue name and type (<cr> to end) : a2a asn
Mut> New residue name and type (<cr> to end) :
Mut> There are 1 mutations
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(2) Copy the coordinates of A2 to the new residue A2A:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > merge_atoms
Sam> Merge from molecule name, and zone: m1 a2 a2
Sam> Merge to molecule name and start residue: m1 a2a
Sam> Datablock containing transformation [<cr> identity]:
Sam> 8 atoms
Sam> 8 updated.
Sam> Centre of gravity updated for 2 2
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(3) Replace the new N-terminus by the correct residue type:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mut_repl
Mut> Mutate a molecule by replacing one residue type
Mut> by another.
Mut> Molecule ([M1 ]) :
Mut> Residue name and new type (<cr> to end) : a2 pro
Mut> Residue name and new type (<cr> to end) :
Mut> There are 1 mutations
Mut> The Rotamer_DB is now being loaded.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(4) Redraw the object. The N-terminal proline now has its CA atom in the same position as the asparagine.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mol m1 zo ; end
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(5) There are several ways to get the proline in the correct position.
In this case, it's probably easiest to use Move_zone on the proline,
and then use Lego_side_ch (if necessary), Tor_residue and Refi_zone
(and perhaps RSR_rigid) to apply the finishing touch. Unfortunately,
the Lego_auto_mc command does not work for terminal residues, although
you could get around that by defining an extra "dummy" residue. In
that case you would have two more alternatives:
Alternative 1: use the Baton command to place the CA of the proline
and of the dummy CA; then use Lego_au_mc and Lego_au_sc, Move_zone, etc.
Alternative 2: use Move_atom to place the CA of the proline in the density
and to place the dummy CA;
then use the same commands as in alternative 1 to touch things up.
Hint: when you use Move_zone and double-click on an atom (e.g., the CA
atom), it will become the pivot for rotations of the residue.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > move_zone m1 a2 a2
Mnp> M1 A2 A2 M1
Mnp> Fragment pivot point: 22.088 15.961 43.470
O > O > Macro in database.
O > O > O > O > Macro in database.
O > O > O > O > O > O > O > Macro in database.
O > O > O > O > Mnp> Coordinates updated
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
You will need four of the menu commands during the move: Dial_next, Dial_prev, @fast_dials and @slow_dials. When you're happy, click or type Yes.
Use RSR_rigid to improve the fit to the density:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > rsr_rig m1 a2 a2
RSR> Refining zone A2 to A2 in molecule M1 , object M1
RSR> 7 atoms in zone
RSR> 36 atoms in refinement box
RSR> Old scale: 47.6289 ; new scale: 311.5877
RSR> Shifts for this group:
RSR> #       x       y       z     rotx     roty     rotz  megavalue
RSR> 1   0.000   0.000   0.900   -8.000   17.000   -8.000    5.68652
RSR> 2   0.000   0.000   0.900   -8.000   11.000   -8.000    5.68830
RSR> 3  -0.100  -0.100   0.700  -14.000   11.000   -8.000    5.70376
RSR> 4  -0.200  -0.100   0.700  -14.000   11.000   -8.000    5.70476
RSR> 5  -0.100  -0.150   0.700  -14.000   11.000   -8.000    5.70840
O > yes
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Redraw the object (zo ; end). Perhaps the proline and asparagine are too far apart for the peptide bond to be drawn. Use Refi_zone to regularise the N-terminus of the model:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > re_zo m1 a2 a5 m1 yes
Refi > M1 A2 A5 M1
Refi > Refining zone A2 to A5 in molecule M1 , object M1
Refi > 563 lines read from dictionary
Refi > Number of cycles is 10
++++++++++
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.07 3.89 7.63
Refi > Centre of gravity updated for 1 5
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that the geometry is very poor in this region. Repeat the Refi_zone command a number of times until you get reasonable stereo-chemistry.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
...
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.02 1.86 4.39
Refi > Centre of gravity updated for 1 5
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that the Refi_zone command keeps the first and last residues anchored.
In this case, that's a trifle unfortunate, since the density for the
proline is very poor and we are therefore not at all sure of its
exact location and orientation. But this will have to do for now.
Redraw the object and check if the peptide bond is drawn. If not,
regularise some more (or Move_zone the proline into a better
position/orientation and do the regularisation again).
Hint: use the Dist_define and Trig_refresh commands to monitor the
CA-CA distance of the first two residues (it should be 3.7-3.9 A).
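The distance check from the hint can also be done outside O once the model has been written to a PDB file. The following is only a minimal Python sketch: the function names are hypothetical, and the two coordinate triples are made-up placeholders (not values from the tutorial model) that you would replace with the CA positions of the first two residues.

```python
import math


def ca_ca_distance(ca1, ca2):
    """Euclidean distance (in Angstrom) between two CA positions given as (x, y, z)."""
    return math.dist(ca1, ca2)


def is_plausible_ca_ca(distance, low=3.7, high=3.9):
    """True if the distance lies in the 3.7-3.9 A range quoted in the hint."""
    return low <= distance <= high


# Placeholder coordinates (hypothetical), 3.8 A apart along x.
ca_first = (10.0, 5.0, 2.0)
ca_second = (13.8, 5.0, 2.0)

d = ca_ca_distance(ca_first, ca_second)
print(f"CA-CA distance: {d:.2f} A, plausible: {is_plausible_ca_ca(d)}")
```

A distance well outside the range suggests that the two residues are still too far apart (or too close) for a sensible peptide bond, so more regularisation or another Move_zone is needed.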
(6) Rename the residues at the N-terminus. You may also want to reset the colours of the mutated residues. Save your model and database (note that we are saving our rebuilt model; so we will call the file "m2.pdb"). You may also want to update the quality indicator values for the rebuilt N-terminus (Pep_flip, RSC_fit and RS_fit).
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > zo ; end
O > sam_rename
Sam> What molecule [M1 ]:
Sam> Residue range [all molecule]: a2 a3
Sam> NEW name of FIRST residue [a2 ]: a1
O > @omac/cnos_colours.omac
O > Macro in computer file-system.
O > Which molecule ? m1
O > O > O > zo ; end
O > s_a_out m2.pdb m1 ;;;;;
Sam> Coordinate file type assumed from file name is PDB
Sam> 914 atoms written out.
O > save
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
One of the worst pep-flip values occurs for residue A5 (a glycine). Execute the OOPS macro for this residue, draw the maps and symmetry objects.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > @oops/a5
O > Macro in computer file-system.
As4> No object defined.
As4> M1 A5 A5 M1
As4> Centering on zone from A5 to A5
O > As4> Residue GLY A5
O > As4> Bad pep-flip = 3.28
O > O > Macro in database.
O > O > As4> Hit or type "@oops/a6" for next baddy
O > @maps
...
O > @symmy
...
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Is there something wrong with the peptide orientation ? Note that the carbonyl oxygen has good 2Fo-Fc density, but that there is positive difference density to support a flipped peptide. Also note that the overall fit to the density for residues A4 to A6 is poor. This may well be caused by strain introduced by an incorrect orientation of the peptide plane of the glycine. These two observations are sufficient reason for us to see if a flipped peptide might improve the model.
Use the Flip_pep command to flip the peptide plane of glycine A5.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > flip_pep a5
Mnp> M1 A5 CA M1
Mnp> Flipping peptide of residue A5
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Use the Move_zone command to move the glycine better into the density (and do the same for the neighbouring residues A4 and A6). Then use the Refi_zone command to regularise this part of the model.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mo_zo a5
Mnp> No object defined.
Mnp> M1 A5 A5 M1
Mnp> Fragment pivot point: 26.454 22.102 39.548
...
O > O > O > Mnp> Coordinates updated
O > O > re_zo m1 a1 a10 m1 yes
Refi > M1 A1 A10 M1
Refi > Refining zone A1 to A10 in molecule M1 , object M1
Refi > Number of cycles is 10
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.02 1.68 2.93
Refi > Centre of gravity updated for 1 10
Refi > Accept new coordinates? Hit *Yes/*No
...
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.01 1.29 2.51
Refi > Centre of gravity updated for 1 10
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that the RMSD for fixed dihedrals is still fairly high. If this value remains high (> 2 degrees), even after repeated regularisation, this often indicates strain due to an as-yet unflipped peptide.
By the way, did you notice anything funny with respect to the peptide of residue A3 ? Calculate pep-flip values for residues A1 to A10.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > pep_fl m1 a1 a10
Util> M1 A1 A10 M1
Util> Calculating zone A1 to A10 in molecule M1 , object M1
...
Util> 20 fragments used for residue A3 pep_flip value= 1.02
Util> 16 fragments used for residue A4 pep_flip value= 2.77
Util> 20 fragments used for residue A5 pep_flip value= 1.38
Util> 20 fragments used for residue A6 pep_flip value= 0.67
Util> 20 fragments used for residue A7 pep_flip value= 0.48
Util> 20 fragments used for residue A8 pep_flip value= 0.88
Util> 20 fragments used for residue A9 pep_flip value= 0.98
Util> 20 fragments used for residue A10 pep_flip value= 0.94
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that residue A3 has a normal pep-flip value, but that the density seems to tell a different story. We will come back to this residue in the next section.
Residue A4 has a large pep-flip value. However, the 2Fo-Fc density is good, and there is no difference density which would indicate that the peptide might have to be flipped. Most importantly, however, the carbonyl oxygen hydrogen bonds to the sidechain of arginine C135 (their density features are connected). Sometimes, high pep-flip values are observed for residues which have an unusual orientation for a very good reason (in this case, in order to form a hydrogen bond). Such cases are NOT errors in the model; they are unusual (and sometimes crystallographically or biologically interesting) features of a model !
Execute the OOPS macro for tryptophan residue A7, etc.:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > @oops/a7
O > Macro in computer file-system.
As4> No object defined.
As4> M1 A7 A7 M1
As4> Centering on zone from A7 to A7
O > As4> Residue TRP A7
O > As4> Bad RSC = 3.02
O > O > Macro in database.
O > O > As4> Hit or type "@oops/a8" for next baddy
O > @maps
...
O > @symmy
...
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
This residue has a very high RSC value. This doesn't always indicate
problems in the case of tryptophan and methionine residues in particular,
since there were very few observations of such residue types in the
original rotamer study of Ponder and Richards from which the O rotamers
are derived. However, in this case there are more suspicious features:
- there are two big blobs of positive difference density near the sidechain;
- the carbonyl oxygen of residue A3 looks as if it ought to be rotated
by ~120 degrees so as to fit the density better. However, with the
present orientation of the tryptophan sidechain, this would lead to
a bad contact between the carbonyl oxygen and a carbon atom in the
ring.
Use the Lego_side_ch command to see if any of the tryptophan rotamers fits the density better than the present non-rotamer.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > lego_si_ch a7
Lego> M1 A7 CA M1
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
It is clear that rotamer number 1 (i.e., the most-frequently observed sidechain conformation for tryptophan residues) can easily be made to remedy both problems observed above: it explains the two blobs of difference density, and it would enable a hydrogen bond between its ring nitrogen atom and a better oriented carbonyl oxygen of A3 !
Accept the sidechain conformation by typing Yes (or clicking Yes on the menu).
Use Move_zone or RSR_rigid to move the tryptophan better into the density (in this particular case, the rotamer fits so well that we don't even have to adjust the sidechain torsions; however, often some fine-tuning of the chi torsion angles may be necessary - use the Tor_residue command to do that).
This is a very good example of the type of error that occurs very often at low resolution when databases are not used during rebuilding. Also note that an error in one place (the tryptophan) may introduce other errors (the peptide plane of A3). And the accumulation of many errors, each in itself small and local, makes the difference between a good model and a poor one.
Rebuild the peptide of residue A3 as shown earlier, Refi_zone the first ten residues, recalculate Pep_flip, RSC_fit and RS_fit for these residues, and save the improved model. Hint: in the case of A3, flipping alone is not good enough. In this case you may want to use Move_fragm to correctly orient the peptide plane (click on the carbonyl carbon or oxygen to identify the fragment).
After the rebuild, you may find that both residue A3 and A4 now have high pep-flip values; however, in both cases there is a good reason for their being unusual.
Note: residues with polar or charged end-groups also often have non-rotamer conformations. If they are at the surface they are often disordered and have poor or no sidechain density (in such cases, put in a rotamer). If they have good density, they will usually be involved in saltlinks or hydrogen bonds, and the energy gain from that will outweigh the loss due to less favourable chi-torsion angle combinations.
Residue A9 is one which is currently alanine, but ought to be something else in the correct sequence of human CRABPII (in this case, isoleucine).
Execute the relevant OOPS macro etc. Mutating a residue consists of two
steps:
- use Mutate_replace to assign the correct residue type. O will put it in
as the most common sidechain rotamer;
- use the normal rebuilding tools to fit the density (Lego_side_ch to get
the correct rotamer, Tor_residue to adjust sidechain torsions, sometimes
Move_zone or RSR_rigid, and finish with Refi_zone to regularise).
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mut_repl
Mut> Mutate a molecule by replacing one residue type
Mut> by another.
Mut> Molecule ([M1 ]) :
Mut> Residue name and new type (<cr> to end) : a9 ile
Mut> Residue name and new type (<cr> to end) :
Mut> There are 1 mutations
O > zo ; end
O > le_si_ch a9
Lego> M1 A9 CA M1
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
In this case, rotamer number 4 is the most suitable one. Select it and make it fit the density better. In this case, RSR_rigid does a good job for the sidechain, but it distorts the mainchain. However, just a few cycles of Refi_zone will remedy that.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > RSR> Refining zone A9 to A9 in molecule M1 , object M1
RSR> 8 atoms in zone
RSR> 46 atoms in refinement box
RSR> Old scale: 334.1331 ; new scale: 341.3141
RSR> Shifts for this group:
RSR>  #       x       y       z     rotx     roty     rotz  megavalue
RSR>  1  -0.300   0.600  -0.300   12.000  -17.000    6.000    6.56684
RSR>  2  -0.300   0.400  -0.100   20.000  -21.000   10.000    6.71406
RSR>  3  -0.400   0.400   0.000   26.000  -23.000   14.000    6.75660
RSR>  4  -0.400   0.400   0.000   30.000  -23.000   14.000    6.76478
RSR>  5  -0.400   0.450   0.000   30.000  -23.000   14.000    6.76499
O > re_zo m1 a5 a15 m1 yes
Refi > M1 A5 A15 M1
Refi > Refining zone A5 to A15 in molecule M1 , object M1
Refi > Unable to anchor atom CB in residue A5
Refi > Number of cycles is 10
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.02 1.55 2.17
Refi > Centre of gravity updated for 5 15
Refi > Accept new coordinates? Hit *Yes/*No
...
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.01 1.00 1.41
Refi > Centre of gravity updated for 5 15
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Our starting model contains two breaks. Residues 100-106 are missing since they fitted the density poorly in the CRABPI structure (high temperature factors, poor density), and residues 114-117 are near the only insertion site in the sequence of CRABPII compared to that of CRABPI. We shall build the latter here; you may build the former yourself.
Centre on the relevant place and draw the maps and symmetry objects (to prevent accidental use of density belonging to a symmetry-related molecule):
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > ce_zo b113 c118
As4> No object defined.
As4> M1 B113 C118 M1
As4> Centering on zone from B113 to C118
O > @maps
...
O > @symmy
...
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that there is reasonable density for the missing mainchain and a number of sidechains. The missing residues are: Thr, Asn, Asp, Gly and Glu. The sidechain density for the Asn, Asp and Glu is quite reasonable.
(1) Use Mutate_insert to insert the missing residues into the sequence:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mut_ins
Mut> Mutate a molecule by inserting residues.
Mut> Molecule ([M1 ]) :
Mut> After which residue: b113
Mut> New residue name and type (<cr> to end) : b113a thr
Mut> New residue name and type (<cr> to end) : b113b asn
Mut> New residue name and type (<cr> to end) : b113c asp
Mut> New residue name and type (<cr> to end) : b113d gly
Mut> New residue name and type (<cr> to end) : b113e glu
Mut> New residue name and type (<cr> to end) :
Mut> There are 5 mutations
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(2) Now we must get coordinates for the CA atoms. You can use Baton to do this or Lego_loop. In this case, we shall use Lego_loop (don't forget to save all the datablocks of the current model before doing this !). The O manual and tutorial explain the use of this command in more detail.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > wr m1_* m1_save.odb ;
O > select_on m1 ;
O > sel_off m1 b113a b113e
O > lego_loop m1 b110 c121
Lego> M1 B110 C121 M1
...
Lego> Number of selected atoms in zone is 8
Lego>
DGNL> Top matches
Lego> Protein  Start Res.  Score  Sequence
Lego> SGA_2       112      0.492  ATVNYGSSGIVYG
Lego> SGA_2        93      0.514  VQRSGSTTGLRSG
Lego> PTN_2       179      0.909  GPVVCSGKLQGIV
Lego> APP_2        71      0.913  WSISYGDGSSASG
Lego> OVO_1       133      0.925  RPVCGSDNKTYSN
...
Lego> PA           23      1.324  VFRKAADDTWEPF
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(3) Type or hit On_off and click off the DB_CA object. Then use the dials to
check each of the hits in turn. Make sure not to select a loop which
puts mainchain atoms inside sidechain density (or the other way around).
In this case, loop number 5 fits fairly well, so select this one.
The fit can be improved somewhat. To this end, make a CA object and use Move_atom to move the CA atoms of the new residues better into place.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > obj ca ca b110 c121 end
O > ce_at b113
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(4) Use Lego_auto_mc to generate the mainchain, and Lego_auto_sc to generate the sidechains. Regularise the region.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > le_au_mc m1 b110 c121
Lego> M1 B110 C121 CA
Lego> Centre of gravity updated for 104 106
Lego> Centre of gravity updated for 107 109
Lego> Centre of gravity updated for 110 112
Lego> Centre of gravity updated for 113 114
O > le_au_sc m1 b113 c118
Lego> M1 B113 C118 CA
Lego> SCGLY is missing.
Lego> Unable to draw the rotamers.
O > mol m1 zo ; end
O > re_zo m1 b108 c123 m1 yes
Refi > M1 B108 C123 M1
Refi > Refining zone B108 to C123 in molecule M1 , object M1
Refi > Number of cycles is 10
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.02 2.07 3.38
Refi > Centre of gravity updated for 101 117
Refi > Accept new coordinates? Hit *Yes/*No
...
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.01 1.32 2.22
Refi > Centre of gravity updated for 101 117
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(5) Calculate pep-flip values and check the mainchain if necessary.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > pep_flip m1 b108 c123
Util> M1 B108 C123 M1
Util> Calculating zone B108 to C123 in molecule M1 , object M1
Util> 20 fragments used for residue B109 pep_flip value= 0.69
Util> 20 fragments used for residue B110 pep_flip value= 0.56
...
Util> 20 fragments used for residue C122 pep_flip value= 0.39
Util> 20 fragments used for residue C123 pep_flip value= 0.86
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(6) For each residue, rebuild it so it fits the density. Remember that the Lego_auto_sc command uses the most-frequent rotamer for each residue; this is not always the correct one ! For instance, leucine 113 has the conformation of the second rotamer.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > ce_at b108
O > @maps
...
O > O > Mnp> Fragment pivot point: 21.791 12.546 30.689
O > Mnp> Coordinates updated
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(7) Regularise the model (and re-select the entire molecule).
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > re_zo m1 b108 c123 m1 yes
Refi > M1 B108 C123 M1
Refi > Refining zone B108 to C123 in molecule M1 , object M1
Refi > Number of cycles is 10
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.04 3.14 3.12
Refi > Centre of gravity updated for 101 117
Refi > Accept new coordinates? Hit *Yes/*No
...
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.01 1.19 1.46
Refi > Centre of gravity updated for 101 117
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(8) Renumber the sequence. The initial model consisted of three disconnected pieces which were therefore called chain A, B and C. Since we have bridged the gap between B and C we can call all residues in these regions "B". Since there is an insertion in the sequence of CRABPII (and the model still had the old CRABPI residue numbers), all residues from 113 to the C-terminus need to be renumbered. Don't forget to save your model and O database. You may also want to calculate pep-flip values etc. for the rebuilt region.
Note: normally, one would first rebuild using the OOPS macros and then insert missing bits. In this case we don't do that, so it's not a good idea to rename the residues now (since then the OOPS macros will fail to centre on the correct residues). So, defer the renaming for the moment. If you were to do it, it would go like this:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > sam_rename
Sam> What molecule [M1 ]:
Sam> Residue range [all molecule]: b113 c135
Sam> NEW name of FIRST residue [b113 ]: b113
O > sam_lis
Sam> Molecule name [M1 ]:
Sam> Name  Type  From   To   Centre                Radius
Sam> A1    PRO      1    7   17.00  13.93  43.82    2.49
Sam> A2    ASN      8   15   20.77  16.93  43.72    3.07
...
Sam> A99   LEU    715  722    4.50  26.58  19.70    3.41
Sam> B107  THR    723  729    9.13  20.11  21.33    2.45
Sam> B108  ALA    730  734   10.88  17.69  24.18    1.97
Sam> B109  TRP    735  748   13.15  19.25  29.37    4.03
Sam> B110  THR    749  755   14.40  14.20  29.06    2.82
Sam> B111  ARG    756  766   17.23  17.61  32.30    4.51
Sam> B112  GLU    767  775   18.62  10.88  32.67    3.50
Sam> B113  LEU    776  783   20.66  11.26  38.55    3.36
Sam> B114  THR    784  790   22.72   9.01  36.35    2.52
...
Sam> B117  GLY    807  810   26.88  10.71  38.14    1.87
Sam> B118  GLU    811  819   26.91  10.31  34.59    3.87
Sam> B119  LEU    820  827   22.59  15.18  33.22    3.21
...
Sam> B135  VAL    936  942   30.31  15.06  33.97    2.88
Sam> B136  ARG    943  953   29.25  16.91  38.58    4.66
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note: in this case we built a fairly short stretch of residues with several long sidechains which could be fitted unambiguously. This is not always the case. Sometimes long residues have poor or no sidechain density, and for long insertions you cannot even always be sure if your model's sequence is in register with the density. Since register errors are often difficult to track down, and since having mainchain in sidechain density and vice versa may be hard to correct by the refinement program, you are advised to always build the mainchain first in such cases. In other words, assign all newly built residues to be alanines; then let the refinement program find the correct fit of the mainchain to the density and only then build in the correct sidechains (provided there is reasonably convincing density for them).
One residue is missing from the C-terminus of our model, namely Glu 137. Adding it is similar to inserting a residue. Since the previous model's C-terminus was Arg 135 (actually B136), it used to have X-PLOR OT1 and OT2 oxygens. In this case, these have been altered (OT1 renamed to O and OT2 removed) already, but in general you will have to do this yourself (also at chain breaks), since O doesn't like these atoms at all.
Go to residue 135, and draw the maps and symmetry objects. Then go through the insertion and rebuilding motions:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
O > mu_ins
Mut> Mutate a molecule by inserting residues.
Mut> Molecule ([M1 ]) :
Mut> After which residue: c135
Mut> New residue name and type (<cr> to end) : c136 arg
Mut> New residue name and type (<cr> to end) :
Mut> There are 1 mutations
O > mer_at
Sam> Merge from molecule name, and zone: m1 c135 c135
Sam> Merge to molecule name and start residue: m1 c136
Sam> Datablock containing transformation [<cr> identity]:
Sam> 11 atoms
Sam> 11 updated.
Sam> Centre of gravity updated for 130 130
O > mut_repl
Mut> Mutate a molecule by replacing one residue type
Mut> by another.
Mut> Molecule ([M1 ]) :
Mut> Residue name and new type (<cr> to end) : c136 glu
Mut> Residue name and new type (<cr> to end) :
Mut> There are 1 mutations
O > zo ; end
O > mo_zo c136
Mnp> No object defined.
Mnp> M1 C136 C136 M1
Mnp> Fragment pivot point: 30.326 17.328 38.215
Mnp> Database compressed.
Mnp> Compression caused by.save_col
O > O > O > O > O > O > O > O > Macro in database.
O > O > Mnp> Coordinates updated
O > O > rsr_rigid m1 c136 c136 m1 yes
...
O > re_zo m1 c130 c136 m1 yes
...
Refi > R.m.s.d. in bond lengths, angles, fixed diherals
Refi > 0.02 1.24 1.55
Refi > Centre of gravity updated for 124 130
Refi > Accept new coordinates? Hit *Yes/*No
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Start your own rebuilding session by executing the macro "oops.omac" and apply what you have learned above. Remember to save (and backup) your O file regularly. Before making big changes, it is a good idea to write out your molecule to a temporary file ("write m1_* m1_save.odb ;") just in case you (or O) screw up. Also try to build (at least a poly-Ala) for the other missing loop (100-106; you may need to contour the maps at a lower level than usual). When you're done rebuilding also put in the correct sidechains for residues which are different in CRABPI and II (if there is sidechain density, of course).
If you have had/learned enough, you can finish with comparing your rebuilt model with the actual 1.8 Å crystal structure (PDB code 1CBS). Look at some of the places with large differences in mainchain conformation. Also check out some of the sidechains which you modeled differently; what does the 2.8 Å density look like ? Can you find clear examples of model bias (i.e., deceptively convincing density for a completely wrong sidechain conformation) ? What does this teach you about low-resolution models ? Discuss the oft-read phrase "the coordinate error is 0.2 Å (Luzzati, 1952)" when used to describe the "accuracy" of low-resolution models.
If you want to continue with the tutorial, don't forget to renumber your residues (from A1 to A137).
You may also simply skip the refinement part of the tutorial and go to the "New model" section of the Post-refinement chapter.
(1) Find out what the effect was of the pep-flip of residue A5 on the
position of A6 in the Ramachandran plot. Explain your observations.
Knowing this, if you didn't change the peptide of A47, check it again.
(2) Why don't we add any water molecules to the model at this stage ?
(3) What could you do to remedy residues with a poor peptide improper ?
And those with a poor CA-chirality virtual torsion ? Why are these
torsions called "improper" or "virtual" ?
(4) How did you build Arg A11 ?
(5) Did you change Asp A13 ? The density fit is poor and it has a
non-rotamer sidechain conformation. There is difference density nearby
which fits one of the rotamers and which would enable a saltlink to Arg
A11 NE. Asp A17 has a similar problem, but (model bias ?) the density
for the current sidechain is deceptively good.
(6) Did you change the sidechain of Leu A18 ? At low resolution, very often
"awkward" sidechain conformations of leucines can be replaced by a rotamer
which fits the density equally well or better. At high resolution,
non-rotamer leucines are rare ! Also check any other leucines which are
not in a rotamer conformation.
(7) Why do we have symmetry objects on whenever we (re)build a residue ?
(8) How and why did you rebuild Met A27 ?
(9) Find out the chemical formula for the ligand, all-trans-retinoic acid.
Have you seen any traces of density for this ligand in the maps ?
(10) Did you rebuild Ile A52 ? Check that, although this residue has a
reasonable RSC value, rotamer 1 fits the density just as well and, in
addition, explains a peak in the difference density.
(11) If you did Refi_zone on a zone which contained a cysteine (e.g., A81),
you probably got an error message. This is because the CYS entry in the
O Refi dictionary is for a disulfide. You can edit the dictionary file,
remove the O datablock (which one ?) from your database, and use the
Refi_setup command to point to the new file. Something similar happens
for proline residues. What would happen if you used Refi_zone on a
cis-proline ? How could you remedy this ?
(12) Use the Lsq commands to find the RMSD on CA atoms between your rebuilt
model and the starting model.
(13) In directory gmrp/o/gerard, you'll find the result of my own quick and
dirty rebuilding ("m2_gerard.pdb"). Calculate the RMSD between your
model and this one. Are there any major differences between the two
models ? Check the density in these places. Note that "m2_gerard.pdb"
is still incomplete and still contains several errors (both in the
mainchain and in the sidechains).
We shall prepare the model (from the file "m2.pdb") for refinement with X-PLOR.
We can do most of the work with MOLEMAN:
- read the PDB file (command: READ);
- generate chain and X-PLOR segment names (commands: AUTO or ASK_);
- correct sidechain atom naming if necessary (commands: CHECk or CORRect);
- since individual temperature factors are usually not a good idea at 2.8 Å,
average the Bs so we get two Bs per residue, one for mainchain and one
for sidechain atoms (command: AVER);
- reset very low or high Bs and set all occupancies to one (command: LIMIt);
- write the PDB file; often, you will have more than one segment, so the
SPLIt command is handy to use (this will write each segment to a separate
PDB file and will not write PDB records that X-PLOR doesn't like);
- we need coordinates for the carboxy-terminal oxygen OT2; these can be
calculated with the SUGGest command, but you MUST edit the file to add
this atom (and to rename the carbonyl oxygen of the C-terminal residue
to OT1) !!
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Input PDB file ? (in.pdb) m2.pdb
Number of lines read : ( 1061)
Number of atoms now : ( 1051)
CPU total/user/sys : 2.7 2.7 0.0
Option ? (READ_pdb_file) auto
Generating chain and segids ...
New chain A, segid AAAA @ residue 1
Nr of segments found : ( 1)
Option ? (AUTO) corr
Nr of atoms : ( 1051)
Nr of residues : ( 137)
Error in TYR 134 ... Swapped CD1/2 and CE1/2

# of PHE checked : 5 # errors : 0
# of TYR checked : 2 # errors : 1
# of ASP checked : 5 # errors : 0
# of GLU checked : 13 # errors : 0
# of ARG checked : 7 # errors : 0
WARNING - any attached hydrogens NOT renamed
Option ? (CORR) aver
Valid options are:
1. Average over all atoms (i.e., compute Boverall)
2. Average per residue over all atoms
3. Average per residue, separately for main and side-chain
4. Average corresponding atoms in different chains

Option ? ( 1) 3
Res 1 Nr_atoms 7 Bave-MC 19.49 ( 4) Bave-SC 23.19 ( 3)
Res 2 Nr_atoms 8 Bave-MC 20.00 ( 4) Bave-SC 20.00 ( 4)
...
Res 136 Nr_atoms 11 Bave-MC 9.61 ( 4) Bave-SC 17.89 ( 7)
Res 137 Nr_atoms 9 Bave-MC 20.00 ( 4) Bave-SC 20.00 ( 5)
Nr of temperature factors updated : ( 1051)
Option ? (AVER) limit
Enter MIN and MAX temperature factor : ( 2.000 99.900) 5 50
Enter MIN and MAX occupancy : ( 0.000 1.000) 1 1
Residue range to apply (0 0 = all molecule) ? ( 0 0)
Nr of atoms updated : ( 1051)
Option ? (LIMIt) split

Basename of PDB files ? (out) ../xplor/m2
New chain id : ( A)
New pdb file : (../xplor/m2a.pdb)
Nr of atoms written to it : ( 1051)
Nr of atoms written in core : ( 1051)
CPU total/user/sys : 3.0 2.9 0.1
Option ? (SPLIt) sugg
Which residue number ? ( 1) 137
... found N
...
Dihedral CA-OT1-C-OT2 = ( 180.000)

==> OT1 NOW : 35.010 19.289 39.046 SUGGESTED : 35.063 19.267 39.037
==> OT2 NOW : 0.000 0.000 0.000 SUGGESTED : 36.362 19.062 37.328

Check geometry of carboxylate group :
Dist C-OT1 = ( 1.230)
...
Dih CA-OT1-C-OT2 = ( 180.000)
==> YOU MUST ADD/EDIT OT1/OT2 YOURSELF !!!

Option ? (SUGG) quit
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
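To make the effect of the AVER (option 3) and LIMIt steps concrete, here is a minimal Python sketch of the same idea: average the temperature factors per residue, separately for mainchain and sidechain atoms, then clamp them and set all occupancies to one. This is not MOLEMAN's code. The function name and the atom-dictionary layout are hypothetical, and treating N, CA, C and O as the mainchain atoms is an assumption (it agrees with the four mainchain atoms per residue reported in the output above). The limits of 5 and 50 are the values typed in the example.

```python
from collections import defaultdict

# Assumption: these four atom names define the mainchain.
MAINCHAIN = {"N", "CA", "C", "O"}


def average_and_limit_b(atoms, b_min=5.0, b_max=50.0):
    """Return a copy of `atoms` with two Bs per residue (mainchain, sidechain),
    clamped to [b_min, b_max], and all occupancies set to 1.0.

    `atoms` is a list of dicts with keys "resnum", "name", "b" and "occ"
    (a hypothetical layout chosen for this sketch).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (resnum, is_mainchain) -> [sum of B, count]
    for atom in atoms:
        key = (atom["resnum"], atom["name"] in MAINCHAIN)
        sums[key][0] += atom["b"]
        sums[key][1] += 1

    result = []
    for atom in atoms:
        total, count = sums[(atom["resnum"], atom["name"] in MAINCHAIN)]
        mean_b = total / count
        clamped = min(max(mean_b, b_min), b_max)
        result.append({**atom, "b": clamped, "occ": 1.0})
    return result


# Made-up input, only to exercise the function.
example_atoms = [
    {"resnum": 1, "name": "N", "b": 10.0, "occ": 0.5},
    {"resnum": 1, "name": "CA", "b": 20.0, "occ": 1.0},
    {"resnum": 1, "name": "CB", "b": 80.0, "occ": 1.0},
    {"resnum": 1, "name": "CG", "b": 60.0, "occ": 1.0},
    {"resnum": 2, "name": "N", "b": 2.0, "occ": 1.0},
    {"resnum": 2, "name": "CA", "b": 4.0, "occ": 1.0},
]
processed = average_and_limit_b(example_atoms)
for atom in processed:
    print(atom["resnum"], atom["name"], atom["b"], atom["occ"])
```

In the made-up input, the sidechain average of residue 1 (70) is cut back to the upper limit and the mainchain average of residue 2 (3) is raised to the lower limit.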
Go to the directory gmrp/hkl. You will find two reflection files, "crabp2_1.8a.hkl" and "crabp2_2.8a.hkl". We will ignore the former for the time being.
Use DATAMAN to convert these reflections into an X-PLOR reflection file and to generate Rfree flags for a subset of the reflections. First read the hkl file:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
DATAMAN > re m1 crabp2_2.8a.hkl
File : (crabp2_2.8a.hkl)
Type : (HKLFS)
Format : (*)
Nr of reflections read : ( 3816)
Nr of WORK reflections : ( 3816)
Nr of TEST reflections : ( 0)
Percentage TEST data : ( 0.000)
This is NOT an Rfree dataset
WARNING - less than 500 TEST reflections !
DATAMAN > stats m1
Stats : (M1)
Item      Minimum     Maximum     Average         Sdv         Var
====      =======     =======     =======         ===         ===
H               0          16       6.816       3.793      14.384
K               0          16       6.376       4.314      18.609
L               0          27       9.457       6.528      42.619
Fobs    1.107E+02   2.608E+04   4.723E+03   3.090E+03   9.551E+06
SigFo   2.280E+01   2.106E+03   1.058E+02   7.459E+01   5.563E+03
Fo/Sig  1.381E+00   1.412E+02   5.110E+01   2.765E+01   7.644E+02

Correlation Fobs-SigFo : ( 0.302)
Correlation Fobs-Fo/Sig : ( 0.626)
Correlation SigFo-Fo/Sig : ( -0.330)

Nr of reflections : ( 3816)
Nr of WORK reflections : ( 3816)
Nr of TEST reflections : ( 0)
Percentage TEST data : ( 0.000)
This is NOT an Rfree dataset
WARNING - less than 500 TEST reflections !
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Now we can generate Rfree flags. Use the following rules-of-thumb:
- use 5-10% of the reflections, but not fewer than ~500 test reflections, and
not more than ~2000;
- if there is NCS, generate test reflections in thin shells (RFree SHell
command); otherwise use small spheres in reciprocal space (RFree SPheres
command).
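
The size rule above can be written down as a few lines of Python. This is only a sketch of the rule-of-thumb, not something DATAMAN does for you; the function name and the default of 10% are illustrative assumptions.

```python
def choose_test_set_size(n_reflections, fraction=0.10, minimum=500, maximum=2000):
    """Apply the rule-of-thumb: 5-10% of the data, but between ~500 and ~2000.

    Returns (number of test reflections, percentage of the whole data set).
    The defaults are assumptions taken from the rule-of-thumb in the text.
    """
    if n_reflections <= 0:
        raise ValueError("need at least one reflection")
    n_test = round(n_reflections * fraction)
    # Clamp to the recommended minimum and maximum.
    n_test = max(minimum, min(maximum, n_test))
    return n_test, 100.0 * n_test / n_reflections


n_test, percentage = choose_test_set_size(3816)
print(n_test, f"{percentage:.3f}")
```

For small data sets the minimum of ~500 wins, so the percentage ends up above 10%.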
In this case (~3800 reflections), we will generate a test set of 500:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
DATAMAN > cell m1 45.65 47.56 77.61 90 90 90
Cell : ( 45.650 47.560 77.610 90.000 90.000 90.000)
Volume (A3) : ( 1.685E+05)
DATAMAN > cal m1 res
Calc : (M1)
Cell volume : ( 1.685E+05)
Lowest resolution : ( 14.932)
Highest resolution : ( 2.800)
DATAMAN > rf sph
Which set ? (M1)
Percentage TEST data ? (10) 500
Converted to percentage : ( 13.103)
Reciprocal sphere radius ? (1)
Encoding reflections ...
Nr of TEST spheres : ( 82)
Nr of WORK reflections : ( 3315)
Nr of TEST reflections : ( 501)
Percentage TEST data : ( 13.129)
This is an Rfree dataset
WARNING - more than 13% TEST reflections !
DATAMAN > st m1
Stats : (M1)
Item        Minimum     Maximum     Average         Sdv         Var
====        =======     =======     =======         ===         ===
H                 0          16       6.816       3.793      14.384
K                 0          16       6.376       4.314      18.609
L                 0          27       9.457       6.528      42.619
Fobs      1.107E+02   2.608E+04   4.723E+03   3.090E+03   9.551E+06
SigFo     2.280E+01   2.106E+03   1.058E+02   7.459E+01   5.563E+03
Reso          2.800      14.932       4.072       1.615       2.607
Fo/Sig    1.381E+00   1.412E+02   5.110E+01   2.765E+01   7.644E+02

Correlation Fobs-SigFo : ( 0.302)
Correlation Fobs-Fo/Sig : ( 0.626)
Correlation SigFo-Fo/Sig : ( -0.330)

Nr of reflections : ( 3816)
Nr of WORK reflections : ( 3315)
Nr of TEST reflections : ( 501)
Percentage TEST data : ( 13.129)
This is an Rfree dataset
WARNING - more than 13% TEST reflections !
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
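
The cell volume and the resolution of a reflection reported above are easy to check by hand for an orthorhombic cell. The sketch below assumes all three cell angles are 90 degrees (true for this crystal form); a general cell needs the full metric tensor. The function names are illustrative, they are not DATAMAN commands.

```python
import math


def orthorhombic_volume(a, b, c):
    """Cell volume in A**3; valid only when alpha = beta = gamma = 90."""
    return a * b * c


def orthorhombic_resolution(h, k, l, a, b, c):
    """d-spacing (in A) of reflection (h,k,l) in an orthorhombic cell."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_squared == 0.0:
        raise ValueError("reflection (0,0,0) has no finite resolution")
    return 1.0 / math.sqrt(inv_d_squared)


cell = (45.65, 47.56, 77.61)
print(f"{orthorhombic_volume(*cell):.3E}")            # compare with Volume (A3)
print(f"{orthorhombic_resolution(16, 0, 0, *cell):.3f}")
```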
Finally, write the reflections to a file in X-PLOR format with the Rfree flags included:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
DATAMAN > wr m1 crabp2_2.8a_rfree.xplor rxplor
Nr of WORK reflections : ( 3315)
Nr of TEST reflections : ( 501)
Percentage TEST data : ( 13.129)
This is an Rfree dataset
WARNING - more than 13% TEST reflections !
File : (crabp2_2.8a_rfree.xplor)
Type : (RXPLOR)
Format : ((' INDEX=',3i6,' FOBS=',f10.3,' SIGMA=',f10.3,' TEST=',i3))
Write WORK and TEST set
Nr of reflections stored : ( 3816)
Nr of reflections written : ( 3816)
CPU total/user/sys : 1.3 1.0 0.3
DATAMAN > quit
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
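
If you want to double-check the file that DATAMAN has written, the Fortran format echoed above is simple enough to read back with a short script. The following is a sketch under the assumption that every reflection sits on one line in exactly that layout; the helper names are made up for this example and the reflection values in it are invented.

```python
import re

# One reflection per line: INDEX= h k l FOBS= f SIGMA= s TEST= flag
_REFLECTION = re.compile(
    r"INDEX=\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+"
    r"FOBS=\s*([-\d.]+)\s+SIGMA=\s*([-\d.]+)\s+TEST=\s*(\d+)"
)


def format_rxplor(h, k, l, fobs, sigma, test):
    """Mirror the format (' INDEX=',3i6,' FOBS=',f10.3,' SIGMA=',f10.3,' TEST=',i3)."""
    return f" INDEX={h:6d}{k:6d}{l:6d} FOBS={fobs:10.3f} SIGMA={sigma:10.3f} TEST={test:3d}"


def parse_rxplor(lines):
    """Return a list of (h, k, l, fobs, sigma, test) tuples."""
    reflections = []
    for line in lines:
        match = _REFLECTION.search(line)
        if match:
            h, k, l, fobs, sigma, test = match.groups()
            reflections.append((int(h), int(k), int(l), float(fobs), float(sigma), int(test)))
    return reflections


def count_work_and_test(reflections):
    """Count reflections with TEST=0 (work) and TEST=1 (test)."""
    n_test = sum(1 for r in reflections if r[5] == 1)
    return len(reflections) - n_test, n_test


# Invented reflections, only to show the round trip.
sample = [format_rxplor(1, 2, 3, 1234.5, 56.7, 1), format_rxplor(0, 4, 2, 810.0, 22.8, 0)]
print(count_work_and_test(parse_rxplor(sample)))
```

Run on the real file, the two counts should agree with the WORK and TEST numbers that DATAMAN printed.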
Go to the gmrp/xplor directory. You will find that a number of files are already there for you. A few of the problem-specific files (may) need to be edited:
(1) "crystal.xplor"
Enter the unit cell constants and spacegroup symmetry operators. Note that
anything in { curly brackets } is treated as a comment by X-PLOR. In all
example files, lines which may need to be edited by you are indicated by:
{ *** EDIT ME *** }. It should look as follows:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
{ crystal.xplor }
{ unit cell for holo-CRABP II crystal }
a=45.65 b=47.56 c=77.61 { *** EDIT ME *** }
alpha=90. beta=90.0 gamma=90. { *** EDIT ME *** }
{ symmetry operators for spacegroup P212121 } { *** EDIT ME *** }
symmetry=(x,y,z)
symmetry=(-x+1/2,-y,z+1/2)
symmetry=(-x,y+1/2,-z+1/2)
symmetry=(x+1/2,-y+1/2,-z)
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
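
A quick way to convince yourself that the operators were typed in correctly is to apply them to a test point in fractional coordinates. The sketch below parses operator strings of the form used in "crystal.xplor" and handles only terms like x, -y and 1/2 (no coefficients such as 2x); it is an illustration, not a replacement for X-PLOR's own symmetry handling.

```python
import re
from fractions import Fraction

AXES = "xyz"


def parse_operator(text):
    """Parse e.g. '(-x+1/2,-y,z+1/2)' into three (coefficients, shift) rows."""
    rows = []
    for component in text.strip().strip("()").split(","):
        coeffs = [0, 0, 0]
        shift = Fraction(0)
        for term in re.findall(r"[+-]?[^+-]+", component.replace(" ", "").lower()):
            if term[-1] in AXES:
                coeffs[AXES.index(term[-1])] = -1 if term[0] == "-" else 1
            else:
                shift += Fraction(term)
        rows.append((coeffs, shift))
    return rows


def apply_operator(operator, xyz):
    """Apply one operator to fractional coordinates; result is mapped into [0,1)."""
    result = []
    for coeffs, shift in operator:
        value = sum(c * Fraction(v) for c, v in zip(coeffs, xyz)) + shift
        result.append(value % 1)
    return tuple(result)


P212121 = ["(x,y,z)", "(-x+1/2,-y,z+1/2)", "(-x,y+1/2,-z+1/2)", "(x+1/2,-y+1/2,-z)"]
point = (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10))
for text in P212121:
    print(text, apply_operator(parse_operator(text), point))
```

For a general position the four operators should give four different points in the unit cell.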
(2) "charges.xplor"
Does not need to be edited normally.
(3) "reflxns.xplor"
Edit the name of your reflection file. Note that REMARK lines will be
echoed to your output PDB files. It should look as follows:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
{ reflxns.xplor }
{ read reflections }
nreflections=100000
reflection @../hkl/crabp2_2.8a_rfree.xplor end { *** EDIT ME *** }
REMARK Uses 13% Rfree reflection file in P212121 holo-CRABPII
{ resolution range }
resolution $lo_res $hi_res
{ two-sigma and F-magnitude cutoff }
reduce
{ do amplitude ( fobs = fobs * heavy(fobs - 2.0*sigma)) }
{ fwindow 0.001 1000000 }
REMARK Uses *NO* sigma or amplitude cut-off
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(4) "parameters.xplor"
Normally doesn't need to be edited, unless you start introducing non-protein
entities (waters, ligand, carbohydrates, metal ions, etc.). It may look
as follows:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
{ parameters.xplor }
parameter
@parhcsdx.pro
{ @param19.sol }
nbonds
atom cdie shift eps=8.0 e14fac=0.4
cutnb=7.5 ctonnb=6.0 ctofnb=6.5
nbxmod=5 vswitch wmin=0.5
end
remark dielectric constant set to 8.0 (EPS)
remark using UPDATED Engh & Huber parameters parhcsdx.pro
remark close contacts printed only if dist < 0.5 A (WMIN)
end
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Other files include:
- topology files ("top*");
- parameter files ("par*");
- X-PLOR input files ("*.inp");
- "printr.xmac", an X-PLOR "macro" which updates Fcalc and calculates
and prints the R-factors;
- "rfree.csh", a command script that generates plots of R and Rfree versus
progress of refinement (requires the utility programs ODBM and O2D).
The topology specifies for each residue which atoms it contains, their charges, bonds etc. The standard file for proteins is "tophcsdx.pro".
The parameters specify target values for bond lengths etc. and energy penalties associated with deviations from the ideal values. The standard file for proteins is "parhcsdx.pro".
If you have other entities, you may need to create new topology and parameter files. XPLO2D can do a major part of this job automatically when you feed it a PDB file of a small molecule (option AUTODICT). This will be used in the chapter "Another cycle".
(1) Why is it not necessarily a good idea to use individual temperature
factors with 2.8 Å data ? How could you test if they are appropriate
(i.e., better than grouped Bs) ? How many B-parameters are refined for
your current model if you refine individual Bs ? And how many if you
refine two Bs per residue (note glycines) ?
(2) What has O done to the temperature factors of mutated and newly inserted
residues ?
(3) How many reflections will be used for the actual refinement ? How many
parameters (coordinates and Bs) are there in your present model ? What
does this mean ?
(4) Why do we need a minimum number of test reflections (i.e., ~500) ? And
why is there a recommended maximum ?
(5) Why is random selection of test reflections the worst possible choice ?
(6) Why didn't we write out the reflections from DATAMAN using the XPLOR
format specifier (we used RXPLOR instead) ?
(7) How does X-PLOR handle cis- and trans-prolines ?
In this chapter, we shall work through one round of X-PLOR refinement.
The first step in a refinement is to generate a so-called PSF file, as well as a PDB file which contains all atoms (including polar hydrogens). To do this, you need to edit the file "generate.inp". When you're done, submit the job, e.g.:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
unix> /public/bin/xplor_16000 < generate.inp |& tee generate.out
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The next step is to determine the relative weight of the X-ray pseudo-energy term compared to the combined geometric and other energetic terms.
Edit the file "check.inp" to do this and submit the job. Near the end of the output you will find something like "ideal WA= 0.123456E+06". This means that the estimated ideal weight is 123456. In practice, however, this weight often turns out to be too high, so use 1/2 or 1/3 of this value in subsequent refinement jobs.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
X-PLOR> xrefine
XREFINE> resolution $lo_res $hi_res gradient
XREFIN: selected reflections will be sorted by index.
XRTEST: number of selected reflections 3678
XRFILL: #scatt.= 1052 #anomalous= 0 #special pos.= 0 occupancies=1
XFFT: using grid [ 48, 50, 90] and sublattice [ 48( 49), 50( 51), 90]
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.222 R= 0.340
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 16.979 R= 0.389
XRGRAD: r.m.s. gradients: empirical energy function= 76.649
"amplitude" target= 0.55947E-03
"phase" target= 0.00000E+00
XRGRAD: ideal WA= 0.13700E+06
XREFINE> end
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that this job also prints the initial R-factors and the completeness of the test and work reflections in the selected resolution range.
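
Picking the weight out of the output and scaling it down can be scripted. The sketch below assumes the output contains a line of the form shown in the example above ("XRGRAD: ideal WA= ..."); the function names are illustrative only.

```python
import re


def ideal_wa(output_text):
    """Return the last 'ideal WA' value found in X-PLOR output text."""
    values = re.findall(r"ideal\s+WA=\s*([0-9.]+E[+-]?\d+)", output_text, re.IGNORECASE)
    if not values:
        raise ValueError("no 'ideal WA=' line found")
    return float(values[-1])


def suggested_weights(wa):
    """Half and one third of the ideal weight, as recommended in the text."""
    return wa / 2.0, wa / 3.0


wa = ideal_wa(" XRGRAD: ideal WA= 0.13700E+06")
print(wa, [round(w) for w in suggested_weights(wa)])
```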
In the first refinement cycle, you may want to optimise the overall orientation of your molecule(s) with rigid-body refinement.
Edit and submit the "rigid.inp" job to do this (in this case it's not necessary).
Edit and submit the "powell.inp" job. At the start:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.515 R= 0.323
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 17.323 R= 0.354
--------------- cycle= 1 ------ stepsize= 0.0000 -----------------------
| Etotal =33953.486 grad(E)=5481.256 E(BOND)=113.216 E(ANGL)=608.934 |
| E(DIHE)=644.982 E(IMPR)=78.656 E(VDW )=1490.546 E(ELEC)=-363.305 |
| E(XREF)=6632.591 E(PVDW)=24753.167 E(PELE)=-5.300 |
-------------------------------------------------------------------------------
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that Rfree is lower than R. This is because the starting model was previously refined against these data, but using a different partitioning of test and work reflections.
At the end:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.604 R= 0.315
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 17.883 R= 0.286
--------------- cycle= 100 ------ stepsize= 0.0001 -----------------------
| Etotal =4465.241 grad(E)=20.967 E(BOND)=92.023 E(ANGL)=354.931 |
| E(DIHE)=480.623 E(IMPR)=97.398 E(VDW )=-449.613 E(ELEC)=-373.626 |
| E(XREF)=4317.535 E(PVDW)=-47.850 E(PELE)=-6.179 |
-------------------------------------------------------------------------------
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Edit and submit the "anneal.inp" job. While this job runs, you can edit the "rfree.csh" file and execute it. This will produce a PostScript plot of the behaviour of R and Rfree as a function of the progress of refinement. You can view the plot with a program like GhostScript or GhostView, or print it on a PostScript printer.
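
If you just want the numbers rather than a plot, the R and Rfree values can be pulled straight out of the X-PLOR output. The sketch below is not what "rfree.csh" does; it merely assumes that the output contains TRRESI lines of the form shown in the examples in this chapter.

```python
import re

_TRRESI = re.compile(
    r"TRRESI:\s*->\[(TEST|WORKING) SET.*?\]\s*Fobs/Fcalc scale=\s*([\d.]+)\s+R=\s*([\d.]+)"
)


def r_factor_history(output_text):
    """Return (list of R values, list of Rfree values) in order of appearance."""
    history = {"WORKING": [], "TEST": []}
    for kind, _scale, r_value in _TRRESI.findall(output_text):
        history[kind].append(float(r_value))
    return history["WORKING"], history["TEST"]


# The start and end of the Powell job, copied from the examples in this chapter.
log = """
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.515 R= 0.323
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 17.323 R= 0.354
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.604 R= 0.315
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 17.883 R= 0.286
"""
r_work, r_free = r_factor_history(log)
for r, rf in zip(r_work, r_free):
    print(f"R = {r:.3f}  Rfree = {rf:.3f}  Rfree - R = {rf - r:+.3f}")
```

Watching the difference Rfree - R grow during a job is often more informative than either number alone.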
The final R-factors etc. are:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.672 R= 0.318
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 18.301 R= 0.245
--------------- cycle= 50 ------ stepsize= 0.0000 -----------------------
| Etotal =3185.799 grad(E)=5.270 E(BOND)=63.926 E(ANGL)=326.876 |
| E(DIHE)=488.317 E(IMPR)=70.329 E(VDW )=-510.423 E(ELEC)=-376.621 |
| E(XREF)=3188.080 E(PVDW)=-59.017 E(PELE)=-5.670 |
-------------------------------------------------------------------------------
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that Rfree has increased slightly. In this particular case it need not worry us, since this is our first refinement in which we had to uncouple R and Rfree.
To refine the temperature factors, you can use any of three input files: "bindiv.inp" to refine restrained individual isotropic temperature factors, "bgroup2.inp" to refine two Bs per residue, or "bgroup1.inp" to refine only one B per residue. Select the most appropriate B-factor model, edit the corresponding file and submit the job. You could try all three to find out which method yields the most appropriate B-factor model, but in this case we will only use "bgroup2.inp". Note that R and Rfree both drop by ~1%, which is a good sign.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
TRRESI: ->[TEST SET (TEST=1)] Fobs/Fcalc scale= 17.422 R= 0.309
TRRESI: ->[WORKING SET (TEST=0)] Fobs/Fcalc scale= 17.986 R= 0.235
--------------- cycle= 25 --------------------------------------------------
| E(XREF)= 0.293E+04 grad(E)= 0.509E-02 |
-------------------------------------------------------------------------------
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
(1) Explain why Rfree can be lower than R at the start of a refinement,
as well as after previous refinement using a different partitioning
of work and test reflections.
(2) If you run rigid-body refinement, what resolution limits would you
use (and why) ? If they are different from those used in the other
refinement jobs, would you need to run "check.inp" again to find the
most appropriate value of WA for that resolution range ? Why (not) ?
(3) How could you decide what the best B-factor model is for your model
and dataset ?
To analyse the geometry of your model and to find any bad (symmetry) contacts, edit and submit the job "geom.inp".
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
X-PLOR> print threshold=0.05 bonds
...
Number of violations greater 0.050: 1
RMS deviation= 0.008
X-PLOR>
X-PLOR> print threshold=10.0 angles
...
Number of violations greater 10.000: 2
RMS deviation= 1.460
X-PLOR>
X-PLOR> print threshold=60.0 dihedrals
...
Number of violations greater 60.000: 0
RMS deviation= 27.774
X-PLOR>
X-PLOR> print threshold=5.0 impropers
...
Number of violations greater 5.000: 2
RMS deviation= 1.267
...
X-PLOR> distance from=( not hydrogen ) to=( not hydrogen ) cutoff=2.5 end
SELRPN: 1052 atoms have been selected out of 1290
SELRPN: 1052 atoms have been selected out of 1290
DISTAN: nonbonded distances printed
atoms "AAAA-75 -THR -OG1 " and "AAAA-77 -ASP -OD1 " 2.4944 A apart
atoms "AAAA-89 -SER -N " and "AAAA-89 -SER -O " 2.4665 A apart
X-PLOR>
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
So, our model is tightly restrained, as it should be at low resolution.
With MOLEMAN we can prepare a suitable PDB file for O and CCP4 from the
final X-PLOR model:
- read the file and strip hydrogens (command: NO_H)
- list some statistics (commands: STAT and B_Q_)
- assign chain names (commands: AUTO or ASK_)
- add cell and spacegroup information (command: CRYS)
- get correct sidechain atom names (commands: CHECk and CORRect)
- optionally, you can produce all sorts of plots (commands: PLOT, RAMA,
CA_Rama, RADIal, CA_D, BALA)
- write the new PDB file (command: WRITe)
Copy the new model to your O directory.
If you have skipped the refinement part of the tutorial, copy the file "m3_gerard.pdb" from the directory gmrp/o/gerard and continue from here as if nothing had happened.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Option ? (READ_pdb_file) no_h
Input PDB file ? (in.pdb) m1_final.pdb
Number of lines read : ( 1294)
Hydrogens skipped : ( 238)
Number of atoms now : ( 1052)
Option ? (NO_H) stat
Nr of atom numbers in memory : ( 1052)

Item        Average    St.Dev       Min       Max
----        -------    ------       ---       ---
X-coord      18.017     8.058     0.641    35.795
Y-coord      20.647     6.665     3.481    34.742
Z-coord      27.566     9.843     2.677    46.340
B-factor     11.675     9.850     5.000    61.360
Occpncy       1.000     0.000     1.000     1.000

Radius of gyration (A) : 14.36
Sum of masses : 13930.688
Centre-of-mass : 18.00 20.65 27.54
Option ? (STAT) b_q_
Amino acid residue names ? (ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL CPR ASX GLX UNK CYH CSS PCA)
Names of ligands/substrates ? (???)
Which chain (** = all) ? (**)
Include HYDROGEN atoms (Y/N) ? (N)

B & Q statistics for chain : (**)

Atom type             Number  Average B  Maximum B  Average Q
Protein main chain       549     10.385     61.360      1.000
Protein side chain       503     13.083     57.730      1.000
Protein all atoms       1052     11.675     61.360      1.000
Ligand/substrate           0      0.000      0.000      0.000
Water molecules            0      0.000      0.000      0.000
Other entities             0      0.000      0.000      0.000
All atoms               1052     11.675     61.360      1.000
Generating REMARK records ...

Option ? (B_Q_) auto
Generating chain and segids ...
New chain A, segid AAAA @ residue 1
Nr of segments found : ( 1)
Option ? (AUTO) crys
Unit-cell constants ? ( 1.000 1.000 1.000 90.000 90.000 90.000) 45.65 47.56 77.61 90 90 90
Unit-cell volume (A3) : ( 1.685E+05)
Spacegroup ? (P 1) P 21 21 21
Value of Z ? ( 1) 4
Option ? (CRYS) corr
Nr of atoms : ( 1052)
Nr of residues : ( 137)

Error in GLU 16 ... Swapped OE1/2
...
# of PHE checked : 5 # errors : 1
# of TYR checked : 2 # errors : 1
# of ASP checked : 5 # errors : 0
# of GLU checked : 13 # errors : 5
# of ARG checked : 7 # errors : 0
WARNING - any attached hydrogens NOT renamed
Option ? (CORR) rama

In the following, hit RETURN if you do NOT want to produce the file the programs asks for

Text file with PHI-PSIs ? ( )
O2D Ramachandran plot file ? ( )
O PHI-PSI datablock file ? ( )
HPGL Ramachandran plot file ? ( )
PostScript Ramachandran plot file ? ( ) m3_rama.ps
=> XPS_GRAF - GJK (2.2 @ 950530)
Opened PostScript file : (m3_rama.ps)
Date : (Thu Nov 16 19:38:46 1995)
User : (gerard)
Program : (MOLEMAN)
PostScript POLAR Ramachandran plot file ? ( )
Option ? (RAMA) plot

Make plot file for Bs or Qs ? (B)
Filename for per_atom plot ? (atom_b.plt) q
Filename for per_residue plot ? (resi_b.plt) m3_aveb.plt

You may plot the following for each residue:
R = RMS B/Q over all atoms / average over molecule
A = average B/Q for all atoms
M = average B/Q for main-chain atoms
S = average B/Q for side-chain atoms
Option (R/A/M/S) ? (A)
Write atom/residue labels to file (Y/N) ? (N)
WARNING - if there are hydrogen atoms they will be included !
Option ? (PLOT) radi
Plot file ? (b_radial.plt) m3_radb.plt
Which chain (2 characters !) ? ( A)
Nr of atoms selected (no Hs) : ( 1052)
Shell  2.0 -  4.0 A -   6 atoms; <B> =  6.16 A**2
Shell  4.0 -  6.0 A -  22 atoms; <B> =  8.12 A**2
Shell  6.0 -  8.0 A -  45 atoms; <B> =  6.51 A**2
Shell  8.0 - 10.0 A - 103 atoms; <B> =  7.07 A**2
Shell 10.0 - 12.0 A - 164 atoms; <B> =  7.87 A**2
Shell 12.0 - 14.0 A - 191 atoms; <B> =  8.15 A**2
Shell 14.0 - 16.0 A - 209 atoms; <B> = 10.90 A**2
Shell 16.0 - 18.0 A - 171 atoms; <B> = 15.97 A**2
Shell 18.0 - 20.0 A -  85 atoms; <B> = 18.29 A**2
Shell 20.0 - 22.0 A -  40 atoms; <B> = 27.43 A**2
Shell 22.0 - 24.0 A -  11 atoms; <B> = 35.55 A**2
Shell 24.0 - 26.0 A -   3 atoms; <B> = 27.91 A**2
Shell 26.0 - 28.0 A -   2 atoms; <B> = 31.29 A**2
Plot file written
Option ? (RADI) write

Output PDB file ? (out.pdb) m3.pdb
REMARK at start of file ? (MoleMan PDB file) M3 X-PLOR R 0.235 Rfree 0.309 951116
Copy all REMARK, HEADER etc. cards from input ? (Y)
Which chain to write (** = any and all) ? (**)
Residue range to write (0 0 = all molecule) ? ( 0 0)
You may output All atoms, only Main-chain atoms,
a Poly-alanine (Gly intact), a poly-Serine,
(Gly and Ala intact) or a poly-Glycine
Which option do you want (All/M/P/S/G) ? (A)
Write HYDROGEN atoms (Y/N) ? (N)
Force consecutive atom numbering (Y/N) ? (Y)
X-PLOR needs OT1 and OT2, but O hates them
If your file contains OT1/2 you may either keep
them, or replace them by O/OXT
Write X-PLOR OT1/2 ? (Y/N) ? (N) y
Cell : ( 45.650 47.560 77.610 90.000 90.000 90.000)
CCP4 requires CRYST, SCALE and ORIGX cards
X-PLOR does not like them at all
Therefore: reply Y for CCP4 and N for X-PLOR :
Write CRYST, SCALE, ORIGX cards (Y/N) ? (Y)
Nr of atoms written : ( 1052)
Option ? (WRITe) quit
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The plot files can be converted into PostScript format with the program O2D or the script "OMAC/o2dps".
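
To see what goes into a radial B-factor listing like the one above, here is a small sketch that bins atoms in shells around the centre of the molecule and averages their Bs. It assumes standard fixed-column PDB records and uses the unweighted centroid instead of the centre-of-mass, so the numbers need not agree exactly with those from MOLEMAN; the function names are illustrative.

```python
import math


def read_atoms(pdb_lines):
    """Return (x, y, z, B) tuples from ATOM/HETATM records (fixed PDB columns)."""
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            x, y, z = float(line[30:38]), float(line[38:46]), float(line[46:54])
            atoms.append((x, y, z, float(line[60:66])))
    return atoms


def radial_b(atoms, width=2.0):
    """Average B in shells of the given width (A) around the unweighted centroid.

    Returns a list of (shell start, shell end, number of atoms, average B).
    """
    if not atoms:
        raise ValueError("no atoms")
    n = len(atoms)
    centre = [sum(a[i] for a in atoms) / n for i in range(3)]
    shells = {}
    for x, y, z, b in atoms:
        distance = math.sqrt((x - centre[0]) ** 2 + (y - centre[1]) ** 2 + (z - centre[2]) ** 2)
        shells.setdefault(int(distance // width), []).append(b)
    return [
        (index * width, (index + 1) * width, len(bs), sum(bs) / len(bs))
        for index, bs in sorted(shells.items())
    ]


# Four made-up atoms: two near the centre with low Bs, two further out with high Bs.
demo = [(1.0, 0.0, 0.0, 10.0), (-1.0, 0.0, 0.0, 20.0), (5.0, 0.0, 0.0, 30.0), (-5.0, 0.0, 0.0, 50.0)]
for start, end, count, mean_b in radial_b(demo):
    print(f"Shell {start:4.1f} - {end:4.1f} A - {count:3d} atoms; <B> = {mean_b:6.2f} A**2")
```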
If you want, you can also run ProCheck again at this stage. It will probably tell you once again that this is a fantastic model. However, it also said this about the initial model, which you know was rather poor. The most useful output from ProCheck is the Ramachandran plot and the distribution of chi1/chi2 angles (similar to the information you get from the O RSC_fit command).
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
R A M A C H A N D R A N P L O T S T A T I S T I C S
Residues in most favoured regions [A,B,L] 105 84.0%
Residues in additional allowed regions [a,b,l,p] 19 15.2%
Residues in generously allowed regions [~a,~b,~l,~p] 1 0.8%
Residues in disallowed regions [XX] 0 0.0%
---- ------
Number of non-glycine and non-proline residues 125 100.0%
Number of end-residues (excl. Gly and Pro) 1
Number of glycine residues 7
Number of proline residues 4
----
Total number of residues 137
...
S T E R E O C H E M I S T R Y O F M A I N - C H A I N
Comparison values No. of
No. of Parameter Typical Band band widths
Stereochemical parameter data pts value value width from mean
------------------------ -------- ----- ----- ----- ---------
a. %-tage residues in A, B, L 125 84.0 70.9 10.0 1.3 BETTER
b. Omega angle st dev 136 1.3 6.0 3.0 -1.6 BETTER
c. Bad contacts / 100 residues 1 0.7 15.8 10.0 -1.5 BETTER
d. Zeta angle st dev 130 1.3 3.1 1.6 -1.1 BETTER
e. H-bond energy st dev 90 0.8 1.0 0.2 -1.1 BETTER
f. Overall G-factor 137 0.2 -0.7 0.3 2.9 BETTER
...
S T E R E O C H E M I S T R Y O F S I D E - C H A I N
Comparison values No. of
No. of Parameter Typical Band band widths
Stereochemical parameter data pts value value width from mean
------------------------ -------- ----- ----- ----- ---------
a. Chi-1 gauche minus st dev 28 11.5 25.4 6.5 -2.1 BETTER
b. Chi-1 trans st dev 37 12.6 24.9 5.3 -2.3 BETTER
c. Chi-1 gauche plus st dev 43 14.3 23.5 4.9 -1.9 BETTER
d. Chi-1 pooled st dev 108 13.1 24.3 4.8 -2.3 BETTER
e. Chi-2 trans st dev 37 12.9 24.7 5.0 -2.4 BETTER
...
G - F A C T O R S
Average
Parameter Score Score
--------- ----- -----
Dihedral angles:-
Phi-psi distribution -0.43
Chi1-chi2 distribution -0.18
Chi1 only -0.30
Chi3 & chi4 -0.35
Omega 0.57
------ -0.05
=====
Main-chain covalent forces:-
Main-chain bond lengths 0.64
Main-chain bond angles 0.37
------ 0.49
=====
OVERALL AVERAGE 0.17
=====
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
To find out how much and where X-PLOR refinement has changed the model, you can use some of the tools in LSQMAN. For example, the RMSD on CA atoms:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
LSQMAN > re m2 m2a.pdb
Old chain |A| becomes chain A
Nr of lines read from file : ( 1053)
Nr of atoms in molecule : ( 1052)
Nr of chains or models : ( 1)
Stripped hydrogen atoms : ( 0)
LSQMAN > re m3 m3.pdb
Cell : ( 45.650 47.560 77.610 90.000 90.000 90.000)
Old chain |A| becomes chain A
Nr of lines read from file : ( 1080)
Nr of atoms in molecule : ( 1052)
Nr of chains or models : ( 1)
Stripped hydrogen atoms : ( 0)
LSQMAN > ex m2 a1-199 m3 a1
Explicit fit of M2 A1-199
And M3 A1
Atom types | CA |
B-factor range used: -1000.00 - 10000.00 A2
Nr of atoms to match : ( 137)
Nr skipped (B limits) : ( 0)
The 137 atoms have an RMS distance of 0.413 A
RMS delta B = 5.909 A2
Corr. coeff. = 0.6545
Rotation : 0.999993 -0.003368 -0.001592
           0.003365 0.999993 -0.001814
           0.001598 0.001808 0.999997
Translation : -0.117 0.098 0.001
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The RMSD on all atoms:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
LSQMAN > at all
Nr of atom types : ( 1)
Type : (ALL)
LSQMAN > ex m2 a1-199 m3 a1
Explicit fit of M2 A1-199
And M3 A1
Atom types |ALL |
B-factor range used: -1000.00 - 10000.00 A2
Nr of atoms to match : ( 851)
Nr skipped (B limits) : ( 0)
The 851 atoms have an RMS distance of 0.693 A
RMS delta B = 6.952 A2
Corr. coeff. = 0.6932
Rotation : 0.999998 -0.002084 0.000292
           0.002084 0.999997 -0.000809
           -0.000290 0.000809 1.000000
Translation : -0.047 0.077 -0.020
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
A plot of changes in the phi and psi dihedral angles:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
LSQMAN > phipsi m2 a1-199 m3 a1 m2_m3_phipsi.plt
Delta-Phi/Delta-Psi plot
Plot of M2 A1-199
And M3 A1
Nr of residues matched : ( 137)
RMS delta PHI : ( 24.307)
Average |delta PHI| : ( 16.926)
Nr |delta PHI| > 10 : ( 78)
Percentage : ( 56.934)
RMS delta PSI : ( 24.066)
Average |delta PSI| : ( 15.955)
Nr |delta PSI| > 10 : ( 78)
Percentage : ( 56.934)
Plot file written
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
A plot of the distances between equivalent CA atoms before and after refinement:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
LSQMAN > at ca
Nr of atom types : ( 1)
LSQMAN > di m2 a1-199 m3 a1 m2_m3_ca_dist.plt
Central-atom distance plot
Central atom type : ( CA)
Plot of M2 A1-199
And M3 A1
Nr of residues matched : ( 137)
Average distance : ( 0.346)
Minimum distance : ( 0.015)
Maximum distance : ( 1.542)
Plot file written
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
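
The quantities reported above (distances between equivalent atoms and their RMS value) are simple to compute once the atoms have been paired. The sketch below does NOT superimpose the models first, whereas the LSQMAN fit commands do; it therefore assumes both models are already in the same frame, which is the case before and after refinement in the same cell. The function names are illustrative.

```python
import math


def pair_distances(coords_a, coords_b):
    """Distances (A) between equivalent atoms of two models, in the given order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both models must have the same number of atoms")
    return [math.dist(p, q) for p, q in zip(coords_a, coords_b)]


def rms(values):
    """Root-mean-square of a list of distances."""
    if not values:
        raise ValueError("no distances")
    return math.sqrt(sum(v * v for v in values) / len(values))


# Two made-up CA traces of two atoms each; only the first atom has moved (by 1 A).
before = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
after = [(0.0, 0.0, 1.0), (3.8, 0.0, 0.0)]
distances = pair_distances(before, after)
print(f"average {sum(distances) / len(distances):.3f}  max {max(distances):.3f}  RMS {rms(distances):.3f}")
```

Because the distances are squared before averaging, the RMS value is dominated by the few atoms that moved most and is never smaller than the average distance.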
If you want to see a list of shifts for each CA atom, use the IMprove command:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
LSQMAN > impr m2 * m3 *
Improve fit of M2 *
And M3 *
Atom type | CA |
Nr of atoms in mol1 : ( 137)
Nr of atoms in mol2 : ( 137)
...
Fragment PRO-A 1 <===> PRO-A 1 @ 0.35 A *
ASN-A 2 <===> ASN-A 2 @ 0.39 A *
...
LEU-A 99 <===> LEU-A 99 @ 0.29 A *
LEU-A 100 <===> LEU-A 100 @ 0.11 A *
ALA-A 101 <===> ALA-A 101 @ 0.35 A *
ALA-A 102 <===> ALA-A 102 @ 0.56 A *
ALA-A 103 <===> ALA-A 103 @ 0.80 A *
ALA-A 104 <===> ALA-A 104 @ 0.25 A *
PRO-A 105 <===> PRO-A 105 @ 0.40 A *
ALA-A 106 <===> ALA-A 106 @ 0.04 A *
THR-A 107 <===> THR-A 107 @ 0.11 A *
...
LEU-A 113 <===> LEU-A 113 @ 0.71 A *
THR-A 114 <===> THR-A 114 @ 0.61 A *
ASN-A 115 <===> ASN-A 115 @ 1.52 A *
ASP-A 116 <===> ASP-A 116 @ 0.86 A *
GLY-A 117 <===> GLY-A 117 @ 0.36 A *
GLU-A 118 <===> GLU-A 118 @ 0.36 A *
LEU-A 119 <===> LEU-A 119 @ 0.21 A *
...
ARG-A 136 <===> ARG-A 136 @ 0.62 A *
GLU-A 137 <===> GLU-A 137 @ 1.38 A *
Nr of residues in mol1 : ( 137)
Nr of residues in mol2 : ( 137)
Nr of matched residues : ( 137)
Nr of identical residues : ( 137)
% identical of matched : ( 100.000)
% matched of mol1 : ( 100.000)
% identical of mol1 : ( 100.000)
% matched of mol2 : ( 100.000)
% identical of mol2 : ( 100.000)
LSQMAN > sh m2 m3
Operator bringing : (M3)
on top of : (M2)
Last command was : (IMPR M2 * M3 *)
The 137 atoms have an RMS distance of 0.413 A
SI = RMS * Nmin / Nmatch = 0.41300
MI = (1+Nmatch)/{(1+W*RMS)*(1+Nmin)} = 0.70772
MC = Maiorov-Crippen RHO (0-2) = 0.02895
RMS delta B for matched atoms = 5.909 A2
Corr. coefficient matched atom Bs = 0.654
Rotation : 0.99999309 -0.00336774 -0.00159196
0.00336485 0.99999267 -0.00181351
0.00159806 0.00180814 0.99999708
Translation : -0.1170 0.0981 0.0010
Nr of NCS operators : 1
NCSOP 1 = 0.9999931 0.0033648 0.0015981 -0.117
-0.0033677 0.9999927 0.0018081 0.098
-0.0015920 -0.0018135 0.9999971 0.001
Determinant of rotation matrix 1.000000
Column-vector products (12,13,23) 0.000000 0.000000 0.000000
Crowther Alpha Beta Gamma 0.00000 0.00000 -0.19296
Spherical polars Omega Phi Chi 0.06853 -999.90002 -0.19296
Direction cosines of rotation axis 1.00000 1.00000 1.00000
Dave Smith -0.10360 90.09122 -0.19279
Rotation angle 0.237388
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
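
The rotation angle in the last line of this listing follows directly from the trace of the rotation matrix, cos(angle) = (trace - 1) / 2, and a proper rotation has determinant +1. The sketch below checks both for the operator printed above; since the matrix elements are rounded in the listing, the angle agrees with LSQMAN's value only to about three decimals.

```python
import math


def rotation_angle_degrees(m):
    """Rotation angle of a 3x3 rotation matrix from its trace."""
    cos_angle = (m[0][0] + m[1][1] + m[2][2] - 1.0) / 2.0
    # Guard against rounding pushing the cosine just outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def determinant(m):
    """Determinant of a 3x3 matrix; +1 for a proper rotation."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


# Rotation matrix copied from the SHow output above.
rotation = [
    [0.99999309, -0.00336774, -0.00159196],
    [0.00336485, 0.99999267, -0.00181351],
    [0.00159806, 0.00180814, 0.99999708],
]
print(f"angle = {rotation_angle_degrees(rotation):.3f} degrees, det = {determinant(rotation):.6f}")
```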
Maps can be calculated with many programs (including X-PLOR). We will use the CCP4 package here, but you can use any program you like.
Go to the gmrp/ccp4 directory. There is a command file for calculating 2Fo-Fc, Fo-Fc and 3Fo-2Fc maps. Before you can calculate the maps, you have to generate a file which contains the reflections in CCP4 format (MTZ file). Edit and execute the command file "mkmtz.com" to do just that. Subsequently, edit the command file "makemap.com" and execute it to calculate the maps.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
<19 capo.bmc.uu.se gmrp/ccp4> makemap.com
SFALL - calculate structure factors
Overall Reliability index is 0.2648
RSTATS - scale Fobs and Fcalc
Overall Totals: 3816 0.269 0.306 331.818 322.953 121.424 0.027 121.424 0.833
FFT 1 - calculate 2Fo-Fc map
Rms deviation from mean density ................. 18.60459
FFT 2 - calculate Fo-Fc map
Rms deviation from mean density ................. 15.99642
FFT 3 - calculate 3Fo-2Fc map
Rms deviation from mean density ................. 19.70206
EXTEND - cut out 2Fo-Fc map around molecule
EXTEND - cut out Fo-Fc map around molecule
EXTEND - cut out 3Fo-2Fc map around molecule
MAPMAN - mappage 2Fo-Fc, Fo-Fc and 3Fo-2Fc maps around A molecule
... Toodle pip ...
real 6.2
user 2.0
sys 0.7
 120 -rw-r--r-- 1 gerard  108528 Nov 16 20:27 /nfs/scr_uu1/gerard/scratch/m3.R
1912 -rw-r--r-- 1 gerard 1946144 Nov 16 20:27 /nfs/scr_uu1/gerard/scratch/m3_2fofc.E
1912 -rw-r--r-- 1 gerard 1946144 Nov 16 20:27 /nfs/scr_uu1/gerard/scratch/m3_fofc.E
1912 -rw-r--r-- 1 gerard 1946144 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_3fo2fc.E
2088 -rw-r--r-- 1 gerard 2129288 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_2fofc.xE
2088 -rw-r--r-- 1 gerard 2129288 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_fofc.xE
2088 -rw-r--r-- 1 gerard 2129288 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_3fo2fc.xE
 616 -rw-r--r-- 1 gerard  614912 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_2fofc.map
 616 -rw-r--r-- 1 gerard  614912 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_fofc.map
 616 -rw-r--r-- 1 gerard  614912 Nov 16 20:28 /nfs/scr_uu1/gerard/scratch/m3_3fo2fc.map
18.579u 3.475s 0:38.57 57.1% 1+16k 283+2066io 5379pf+0w
<20 capo.bmc.uu.se gmrp/ccp4> cp /nfs/scr_uu1/gerard/scratch/m3_2fofc.map ../o
<20 capo.bmc.uu.se gmrp/ccp4> cp /nfs/scr_uu1/gerard/scratch/m3_fofc.map ../o
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Create a new subdirectory gmrp/o/m1m2, move or copy all maps and models etc. of the first two models to this directory and compress the files (include the "maps" and "symmy" macros). Update the "maps" and "symmy" macros in your O directory. If you didn't calculate the maps, move or copy them from gmrp/o/gerard ("m3*.map").
(1) Why are the Ramachandran and chi1/chi2 plots in ProCheck useful, but
not so much the pure geometrical information (bond lengths etc.) ?
(2) Compare the Ramachandran plot of the initial model with that of the
rebuilt model and that of the refined model. Discuss the differences.
Relate your observations to the delta-phi, delta-psi plot of models
M2 and M3.
(3) At what level are you going to contour the Fo-Fc map ? Why is the
"sigma-level" of this map meaningless, unless the map is calculated
on an absolute scale ? What is the unit of electron density on an
absolute scale ?
(4) Why have we still not picked any peaks that might belong to solvent
molecules ?
(5) What is a radial B-factor plot ? What should it look like ?
(6) Why is it a good idea to restrain geometry tightly at worse than
atomic resolution ?
Read your new model M3 into O and apply the usual quality checks to it. Then run OOPS again. Since the model is still crude and incomplete, tell OOPS to generate macros for all residues again. Compare the quality of model M3 to that of the starting model, M1. Set up the symmetry for model M3 inside O.
Check out the density in some of the places where you built new residues or did a substantial rebuild in the first cycle. Note how much the map has improved. Also, there is fairly good density for the ligand now (with one small break). Of course, this dataset is not a "typical" 2.8 Å dataset, since it was derived from a 1.8 Å (synchrotron) dataset by applying a resolution cut-off.
Now rebuild your model using the OOPS macros. Try to put in as many correct sidechains as you can (if they have reasonable density). If you didn't build the entire missing loop, try to do it in this round. Don't forget to save your model regularly (as "m4.pdb"), and to make a backup prior to any major (re)building.
Now it's also time to start paying attention to more detailed issues. For instance, the sidechain of Asn A2 forms a hydrogen bond to the OG of Ser A4. This makes it most likely that the involved atom in the Asn sidechain is OD1. Also check Gln A45 with an eye to hydrogen bonding potential.
Did you notice a negative difference density peak for part of the sidechain of Lys A38 ? If so, rebuild the sidechain of this residue (or cut it back to an alanine).
What do you make of the peptide of Gly A47 ?
How did you (re)build Lys A101 ?
Did you change Thr A110 in any way ? Why ? And Leu A113 ?
When you have finished your rebuild, compare your model M4 to the one in file "gmrp/o/gerard/m4_gerard.pdb". Check the differences using the maps. Save your model and your O database.
Normally, I would wait for the next cycle before putting in the ligand, but since this is a tutorial we shall put it in now. Locate the density (two big blobs) for the retinoic acid and jot down the approximate coordinates of the centre-of-gravity of the ligand (a rough estimate suffices; this merely saves you a lot of Move_zone-ing later on). Hint: to find coordinates, use Move_atom to place an atom in the spot of interest, then click on the atom to get the coordinates displayed, and hit No to cancel the move. Now exit from O (or use a separate Unix window).
It is most convenient if you can get a set of starting coordinates for a hetero-entity from elsewhere. If you have access to the Cambridge database of small-molecule structures, that is the first place to look. If not, check out the collection of hetero-entities (extracted from PDB entries) in file "OMAC/hetero.pdb".
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
% 127 gerard rigel 23:01:57 gerard/omac > grep retinoic omac/hetero.pdb
COMPND RETINOIC ACID
COMPND RETINOIC ACID (ALL-TRANS)
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Extract the relevant portion of the file, e.g. using an editor. Store it in a file called "ret.pdb" (use the first occurrence, i.e. NOT the all-trans entry since the latter was taken from the CRABPII structure). The file may look as follows:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
COMPND RETINOIC ACID
REMARK Extracted from PDB file 1tyr.pdb
REMARK Formula C20 H28 O2
REMARK Nr of non-hydrogen atoms 22
REMARK Residue type REA
REMARK Residue name 898
REMARK 2 RESOLUTION. 1.8 ANGSTROMS. 1TYR
REMARK Compound also present in : 1FEM 1EPB 1CBR
HETATM    1  C1  REA   898      -1.756   2.109  -3.389  1.00 20.00      1TYR
HETATM    2  C2  REA   898      -1.817   2.094  -4.924  1.00 20.00      1TYR
HETATM    3  C3  REA   898      -1.052   1.062  -5.564  1.00 20.00      1TYR
HETATM    4  C4  REA   898      -1.516  -0.335  -5.191  1.00 20.00      1TYR
HETATM    5  C5  REA   898      -1.616  -0.496  -3.690  1.00 20.00      1TYR
HETATM    6  C6  REA   898      -1.733   0.597  -2.837  1.00 20.00      1TYR
HETATM    7  C7  REA   898      -1.998   0.479  -1.370  1.00 20.00      1TYR
HETATM    8  C8  REA   898      -1.062   0.135  -0.267  1.00 20.00      1TYR
HETATM    9  C9  REA   898      -1.268  -0.041   1.143  1.00 20.00      1TYR
HETATM   10  C10 REA   898      -0.170  -0.394   1.867  1.00 20.00      1TYR
HETATM   11  C11 REA   898      -0.103  -0.669   3.288  1.00 20.00      1TYR
HETATM   12  C12 REA   898       1.022  -1.145   3.887  1.00 20.00      1TYR
HETATM   13  C13 REA   898       2.245  -1.499   3.180  1.00 20.00      1TYR
HETATM   14  C14 REA   898       3.305  -1.965   3.889  1.00 20.00      1TYR
HETATM   15  C15 REA   898       4.061  -1.105   4.787  1.00 20.00      1TYR
HETATM   16  C16 REA   898      -2.991   2.890  -2.877  1.00 20.00      1TYR
HETATM   17  C17 REA   898      -0.489   2.849  -2.945  1.00 20.00      1TYR
HETATM   18  C18 REA   898      -1.448  -1.963  -3.286  1.00 20.00      1TYR
HETATM   19  C19 REA   898      -2.704   0.107   1.687  1.00 20.00      1TYR
HETATM   20  C20 REA   898       2.202  -1.542   1.606  1.00 20.00      1TYR
HETATM   21  O1  REA   898       3.735  -0.926   6.159  1.00 20.00      1TYR
HETATM   22  O2  REA   898       5.145  -0.238   4.839  1.00 20.00      1TYR
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
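If you prefer not to cut the entry out by hand in an editor, the extraction can be scripted. The following is only a minimal Python sketch. It assumes that every entry in "OMAC/hetero.pdb" starts with a COMPND record and runs until the next COMPND record (as the example above suggests, but check your own copy of the file); the function name extract_first_entry is made up for this illustration.

```python
def extract_first_entry(lines, keyword):
    """Return the lines of the first entry whose COMPND record contains
    `keyword` (case-insensitive).

    ASSUMPTION: each entry starts with a COMPND record and ends just
    before the next COMPND record.
    """
    entry = []
    capturing = False
    for line in lines:
        if line.startswith("COMPND"):
            if capturing:
                break  # the next entry starts here, so we are done
            capturing = keyword.lower() in line.lower()
        if capturing:
            entry.append(line)
    return entry


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real file; the second entry is a placeholder.
    sample = [
        "COMPND RETINOIC ACID",
        "REMARK Residue type REA",
        "COMPND RETINOIC ACID (ALL-TRANS)",
        "REMARK (second entry, not wanted)",
    ]
    for line in extract_first_entry(sample, "retinoic"):
        print(line)
```

Because the search stops at the first match, this picks up the first retinoic acid entry and not the all-trans one, which is what we want here.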
Now we shall use MOLEMAN to do a few things. First we will translate the molecule such that its centre-of-gravity is approximately in the right position:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Option ? (READ_pdb_file)
 Input PDB file ? (in.pdb) ret.pdb
 Number of lines read : ( 29)
 Number of atoms now : ( 22)
 Option ? (READ_pdb_file) trans
 1 = Cartesian, 2 = Fractional.
 Option ? ( 1)
 Translation vector ? ( 0.000 0.000 0.000) 22 25 21
 Nr of atoms translated : ( 22)
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
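To see what the translation does to the file, here is a small Python sketch of the same operation. It is not MOLEMAN's code. It assumes the standard fixed-column PDB layout (x, y and z in columns 31-54) and an unweighted centre-of-gravity; the function names are invented for this illustration.

```python
def translate_pdb_line(line, shift):
    """Add `shift` (dx, dy, dz) to the coordinates of an ATOM/HETATM record.

    ASSUMPTION: standard PDB columns, x/y/z in columns 31-38, 39-46, 47-54.
    All other records are returned unchanged.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return line
    xyz = [float(line[30 + 8 * i : 38 + 8 * i]) + shift[i] for i in range(3)]
    return line[:30] + "".join(f"{v:8.3f}" for v in xyz) + line[54:]


def centroid(lines):
    """Unweighted mean position of all ATOM/HETATM records."""
    points = [
        [float(line[30 + 8 * i : 38 + 8 * i]) for i in range(3)]
        for line in lines
        if line.startswith(("ATOM", "HETATM"))
    ]
    return [sum(p[i] for p in points) / len(points) for i in range(3)]


if __name__ == "__main__":
    c1 = "HETATM    1  C1  REA   898      -1.756   2.109  -3.389  1.00 20.00      1TYR"
    # Same vector as typed into MOLEMAN above.
    print(translate_pdb_line(c1, (22.0, 25.0, 21.0)))
```

For atom C1 this gives 20.244, 27.109, 17.611. If your starting coordinates are not centred near the origin, subtract the result of centroid() from the target position to get the translation vector.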
Now rename the residue and write the new PDB file:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Option ? (TRANs) resi
 First NEW residue number ? ( 1) 200
 Last residue number : ( 200)
 Option ? (RESI) chain
 Chain label (2 characters) ? ( ) B
 Residue range to apply (0 0 = all molecule) ? ( 0 0)
 Nr of chain labels updated : ( 22)
 Option ? (CHAIn) writ
 Output PDB file ? (out.pdb) ret.pdb
 ERROR --- XOPXNA - error # 126 while opening NEW file : ret.pdb
 OPEN : (UNIT= 12 STATUS=NEW CAR_CONTROL=LIST FORM=FORMATTED ACCESS=SEQUENTIAL)
 Error : (Connection timed out)
 Open file as OLD (Y/N) ? (N) y
 REMARK at start of file ? (MoleMan PDB file)
 Copy all REMARK, HEADER etc. cards from input ? (Y)
 Which chain to write (** = any and all) ? (**)
 Residue range to write (0 0 = all molecule) ? ( 0 0)
 You may output All atoms, only Main-chain atoms,
 a Poly-alanine (Gly intact), a poly-Serine,
 (Gly and Ala intact) or a poly-Glycine
 Which option do you want (All/M/P/S/G) ? (A)
 Write HYDROGEN atoms (Y/N) ? (N)
 Force consecutive atom numbering (Y/N) ? (Y)
 X-PLOR needs OT1 and OT2, but O hates them
 If your file contains OT1/2 you may either keep
 them, or replace them by O/OXT
 Write X-PLOR OT1/2 ? (Y/N) ? (N)
 Cell : ( 1.000 1.000 1.000 90.000 90.000 90.000)
 CCP4 requires CRYST, SCALE and ORIGX cards
 X-PLOR does not like them at all
 Therefore: reply Y for CCP4 and N for X-PLOR :
 Write CRYST, SCALE, ORIGX cards (Y/N) ? (Y) n
 Nr of atoms written : ( 22)
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
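The renumbering and chain labelling amount to overwriting two fixed fields in each atom record. As a rough Python sketch (again not MOLEMAN's own code): it assumes the standard PDB layout with a one-character chain identifier in column 22 and the residue number in columns 23-26, whereas MOLEMAN itself asks for a two-character label. The function name relabel_pdb_line is hypothetical.

```python
def relabel_pdb_line(line, resseq, chain):
    """Set the residue number and chain identifier of an ATOM/HETATM record.

    ASSUMPTION: standard PDB columns, chain ID in column 22 and residue
    number right-justified in columns 23-26. Other records pass through.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return line
    line = line.ljust(26)  # guard against truncated records
    return line[:21] + chain[:1] + f"{resseq:4d}" + line[26:]


if __name__ == "__main__":
    c1 = "HETATM    1  C1  REA   898      20.244  27.109  17.611  1.00 20.00      1TYR"
    # Residue 898 becomes residue 200 of chain B, as in the MOLEMAN session.
    print(relabel_pdb_line(c1, 200, "B"))
```

Only the two label fields change; the coordinates and everything after them are copied through untouched.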
Next, generate some datablocks for use with O:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Option ? (?) torsi
 Which residue number ? ( 1) 200
 Cut-off distance for bonded atoms ? ( 2.000) 1.8
  1 C1 REA   200 20.244 27.109 17.611 1.00 20.00 1TYR
  2 C2 REA B 200 20.183 27.094 16.076 1.00 20.00 1TYR
 ...
 22 O2 REA B 200 27.145 24.762 25.839 1.00 20.00 1TYR

 Nr of atoms found : ( 22)
 Residue type : (REA)
 Atom types : ( C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16 C17 C18 C19 C20 O1 O2)
 Datablock file ? (torsion_rea.dat)
 Nr of bonds : ( 22)

 DIHEDRAL C6 200 C1 200 C2 200 C3 200 -37.55
 Skip -> ring torsion
 ...
 DIHEDRAL O2 200 C15 200 C14 200 C13 200 -89.55
 Affected atoms : ( C13 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C16 C17 C18 C19 C20)
 Skip -> too many affected atoms

 Nr of unique rotatable torsions : ( 9)
 Nr of lines written : ( 14)
 Torsion file written (append to torsion.o)
 Option ? (TORSi) rsfit
 Which residue number ? ( 200)

  1 C1 REA   200 20.244 27.109 17.611 1.00 20.00 1TYR
 ...
 22 O2 REA B 200 27.145 24.762 25.839 1.00 20.00 1TYR

 Nr of atoms found : ( 22)
 Residue type : (REA)
 Atom types : ( C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16 C17 C18 C19 C20 O1 O2)
 Datablock file ? (rsfit_rea.odb)
 Datablock name : (rsfit_REA)
 Datablock written
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
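The DIHEDRAL values in this output are ordinary torsion angles computed from the coordinates. If you want to check one by hand, the following Python sketch evaluates the torsion C6-C1-C2-C3 from the "ret.pdb" coordinates listed earlier (a translation does not change torsion angles, so the untranslated values will do). The sign convention used is the usual IUPAC one; that MOLEMAN uses the same convention is an assumption, although the value does come out the same for this torsion.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dihedral(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range -180 to 180."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Coordinates copied from the ret.pdb listing (residue 898, before translation).
    c6 = (-1.733, 0.597, -2.837)
    c1 = (-1.756, 2.109, -3.389)
    c2 = (-1.817, 2.094, -4.924)
    c3 = (-1.052, 1.062, -5.564)
    # Compare with the first DIHEDRAL line in the transcript (-37.55).
    print(f"C6-C1-C2-C3 = {dihedral(c6, c1, c2, c3):.2f}")
```

This torsion lies inside the six-membered ring, which is why MOLEMAN skips it when collecting rotatable torsions.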
The torsio