PHASES
PHASES is a package of computer programs designed to compute phase
angles for diffraction data from macromolecular crystals. The package
is complete in that it contains programs for the following: merging
and scaling of native and derivative data sets; analyzing difference
statistics; computing Patterson and electron density maps; searching
for peaks; refining heavy atoms (or protein domains as rigid groups);
computing phases by MIR (multiple isomorphous replacement), SIR
(single isomorphous replacement), SAS (single wavelength anomalous
scattering), SIRAS (single isomorphous replacement supplemented with
anomalous scattering), MIRAS (multiple isomorphous replacement
supplemented with anomalous scattering) or from atomic coordinates for
an input model; noncrystallographic symmetry averaging; combining
phases from a partial structure with MIR etc phases; computation and
analysis of cross difference or Bijvoet difference Fourier maps; and
for phase extension and refinement.
Once an initial set of phases is generated, programs are included
to improve them by carrying out solvent levelling with negative
density truncation and/or combination with model based phase
information and/or averaging over noncrystallographic symmetry.
Solvent levelling is facilitated by the automatic protein-solvent
boundary determination method (Wang, in Methods in Enzymology 115,
1985) which is implemented here entirely in reciprocal space in a much
more efficient manner than in previous programs. If applied to SIR or
SAS starting phases, the programs can also carry out the ISIR or ISAS
phasing procedures described by Wang. The package
consists of 5 major programs and many utility programs as follows:
PROGRAM FUNCTION
PHASIT.F Computes MIR, SIR, MODEL etc phases from
input atomic parameters and diffraction
data. Can refine heavy atoms or derivative
scaling parameters in "phase refinement"
mode.
BNDRY.F Computes coefficients for automatic
boundary determination. Determines
protein-solvent boundary mask, flattens
solvent and applies negative density
truncation, combines phases from external
sources (map inversion or from partial
structures) with original phase information,
extends phases to higher resolution.
FSFOUR.F Space group general 3D FFT program for
electron density calculations.
MAPINV.F Space group general 3D FFT program for
structure factor calculations.
*
MAPVIEW.F Interactive contouring/map viewing program.
Allows user to view maps and masks, and
trace/edit solvent or averaging masks.
CTOUR.F Creates contoured plots from FSFOUR maps,
either as individual sections, mono or
stereo projections. Plots can be viewed
directly or converted to PostScript.
GMAP.F Extracts region from a FSFOUR map and
creates corresponding maps for the graphics
programs TOM, O or CHAIN. Also can create
skeleton files for TOM or O.
MISSNG.F Selects reflections for phase extension.
MRGDF.F Generates coefficients for isomorphous
difference Fourier or cross Fourier.
MRGBDF.F Generates coefficients for Bijvoet
difference Fourier or cross Fourier.
RD31.F Converts internal binary file to ASCII
for examination and/or editing.
MK31B.F Restores ASCII version of file to binary.
PSRCH.F Searches Fourier map and lists unique peaks.
CMBISO.F Combines native and derivative isomorphous
replacement data into one file and scales
the derivative data to the native.
CMBANO.F Combines native data and derivative
anomalous scattering data into one file
and scales the derivative data to the native.
TOPDEL.F Examines isomorphous/anomalous scattering
differences, identifies and rejects outliers,
prepares file for difference Pattersons.
GREF.F Refines heavy atom parameters against
isomorphous or anomalous scattering
differences; refines protein domains,
substructures etc as rigid groups against
native data.
RMHEAVY.F Temporarily removes density in map from heavy
atoms, to aid in accurate solvent mask
generation.
IMPORT.F Allows user to introduce his own phases and
Hendrickson-Lattman coefficients (computed
by external programs) into the PHASES package
for subsequent calculations. This allows one
to bypass the PHASIT program.
XPL_PHI.F Creates input reflection file for XPLORE
from a PHASES style phased file.
*
PRECESS.F Lets one construct and interactively examine
"pseudo" precession or "pseudo" difference
precession photos made from reflection files.
*
VIEWPLT.F Displays up to 10 plots created by CTOUR
on workstation or X-Window capable monitor.
PLTTEK.F Displays plots created by CTOUR on terminals
capable of using TEKTRONIX 4010 emulation.
MKPOST Converts plots created by CTOUR to PostScript.
PDB_CDS.F Converts coordinate files between PDB and
PHASES formats, in either direction.
EXTRMAP.F Extracts a region (submap) from the standard
FSFOUR map for use in averaging, skewing etc.
EXTRMSK.F Extracts a region (submap) from the standard
solvent mask for possible editing, skewing
etc.
MAPAVG.F Averages one or more maps to impose non-
crystallographic symmetry.
MAPORTH.F Orthogonalizes non-orthogonal map (and
optionally mask) for use in refinement of
noncrystallographic symmetry operator.
LSQROT.F Refines purely rotational noncrystallographic
symmetry operator against electron density.
LSQROTGEN.F Refines general noncrystallographic symmetry
operator (arbitrary rotational angle, with
translation) against electron density.
SKEW.F Skews a map (and optionally a mask) to
a new, and arbitrarily oriented cell.
BLDCEL.F Rebuilds a complete unit cell map (and
optionally, mask) from an input asymmetric
unit submap (and optionally, mask).
MDLMSK.F Creates a mask from coordinates in an input
atomic model, for use in averaging, NC
symmetry operator refinement or use in
solvent flattening.
MRGMSK.F Merges multiple masks created by MDLMSK
into a single mask.
TRNMSK.F Transforms mask created in a "skewed"
cell back to the normal cell.
HNDCHK.F Interpolates density from a map at
specified sites, usually for the purpose
of determining the proper hand.
SLOEXT.F Controls number of iterations and rate of
phase extension to higher resolution.
RDHEAD.F Dumps header from averaging map (submap) or
mask files for examination.
O_TO_SP.F Extracts spherical polar angles and
axis location for use in PHASES from
rotation matrix/translation vector
produced by program "O".
PSTATS.F Tabulates mean phase difference between
two phase sets as a function of d spacing.
TABLE OF CONTENTS
for
GENERAL PROCEDURES
REFERENCING THE PHASES PACKAGE .......... 0.00
GETTING STARTED ......................... 1.00
Accessing on line documentation ....... 1.01
Template scripts and files ............ 1.02
Flow Charts ........................... 1.03
File Formats .......................... 1.04
EXAMPLES ................................ 3.00
Pamfile ............................... 3.01
Initial phasing ....................... 3.02
Solvent levelling ..................... 3.03
Doall scripts ......................... 3.04
Expected output ....................... 3.05
NATIVE, DIFFERENCE AND "CALCULATED"
PATTERSON MAPS .......................... 4.00
REFINING HEAVY ATOM PARAMETERS .......... 5.00
HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE
AND CROSS DIFFERENCE FOURIER MAPS ....... 6.00
CREATING/EDITING SOLVENT MASKS .......... 7.00
INCORPORATION OF PARTIAL STRUCTURES ..... 8.00
REDUCED BIAS NATIVE, COMBINED AND
DIFFERENCE FOURIER MAPS ................. 9.00
INCORPORATION OF NONCRYSTALLOGRAPHIC
SYMMETRY AVERAGING ..................... 10.00
Averaging with Multiple Crystals ..... 10.01
Averaging Difference or 2FO-FC Maps .. 10.02
Sample Input Files for Averaging ..... 10.03
DENSITY MODIFICATION WITH MOLECULAR
REPLACEMENT DERIVED PHASE INFORMATION .. 11.00
PHASE EXTENSION ........................ 12.00
MAD PHASING ............................ 13.00
UNIX SHELL SCRIPTS ..................... 15.00
0.00 REFERENCING THE PHASES PACKAGE
When publishing results obtained from use of the software, a
statement should be included like "all heavy atom refinement, phasing,
solvent flattening, noncrystallographic symmetry averaging, map
calculations etc. (or whatever is appropriate) were carried out with
the PHASES package (Furey & Swaminathan, 1995)." This refers to the
following paper:
"PHASES-95: A Program Package for the Processing and Analysis of
Diffraction Data from Macromolecules", W. Furey & S. Swaminathan,
in MACROMOLECULAR CRYSTALLOGRAPHY, a volume of Methods in Enzymology,
eds. C. Carter & R. Sweet, Academic Press, Orlando, Fl. (1996), in
press.
1.00 GETTING STARTED
The first thing to do is to prepare an input parameter file
specifying the cell constants, symmetry information etc. This file is
referred to as the "standard parameter file" throughout the PHASES
package, and is often called "PAMFIL" generically in specific program
writeups. One should select a name for it which is indicative of the
particular structure being worked on, and rapidly communicates to the
user that it is a parameter file. For example, PDC.PAM might be a good
choice for phasing pyruvate decarboxylase. The main purpose of this
file is to ensure consistency in cell constants, symmetry, lattice
type etc throughout all programs, and to eliminate redundant input of
these parameters by the user. In addition one can optionally specify
the name of a "running log file." If this is done then in addition to
normal output to either the screen or individual log files for each
program, a copy of all printed output is also appended to a single
file, preceded by a time stamp indicating what program was run and
when. Thus one can maintain a complete history of all computations
and results in a single log file.
Each standard parameter file should contain the following
information in the indicated sequence.
LOGFILE=FILENAME              Where FILENAME is the name of the
                              desired "running" log file. If no
                              cumulative log is desired, enter
                              LOGFILE=NULL
                              There must be no spaces immediately
                              preceding or following the "=". Upper
                              or lower case is permitted.

LATTICE=X                     Where "X" is either P,A,B,C,I,F or R.
                              There must be no spaces immediately
                              preceding or following the "=". Upper
                              or lower case is permitted for the word
                              LATTICE, but only UPPER case for the
                              single character symbol.

A, B, C, ALPHA, BETA, GAMMA   Unit cell constants, in angstroms and
                              degrees. Readable in free format, i.e.
                              at least one blank or comma separating
                              entries.

NSYM                          Number of equivalent positions in
                              the space group. Do NOT include
                              additional translations associated
                              with centering conditions for
                              non-primitive lattices, i.e. for
                              space group C2 NSYM=2. (This entry
                              is read in free format.)
The NSYM symmetry operators follow, one operator per line EXACTLY
as indicated in the International Tables for X-Ray Crystallography.
The first operator should ALWAYS be X,Y,Z. Note that for rhombohedral
lattices the HEXAGONAL CELL AND SYMMETRY OPERATORS SHOULD BE USED,
along with the lattice type R.
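
As an illustration only, the sketch below shows how an operator written in the International Tables style (for example 1/2-X,-Y,1/2+Z) can be decomposed into a rotation part and a translation part and applied to a fractional coordinate. It is not part of PHASES; the function names parse_operator and apply_operator are hypothetical, and the sketch assumes each term is either a plain fraction or a single X, Y or Z with an optional sign.

```python
import re
from fractions import Fraction


def parse_operator(text):
    """Split an operator such as '1/2-X,-Y,1/2+Z' into (rotation, translation).

    Hypothetical helper, not a PHASES routine. Assumes every term is a
    signed fraction or a signed X, Y or Z.
    """
    rotation, translation = [], []
    for component in text.upper().replace(" ", "").split(","):
        row = [0, 0, 0]
        shift = Fraction(0)
        # Break the component into signed terms, e.g. '1/2-X' -> ['1/2', '-X'].
        for term in re.findall(r"[+-]?[^+-]+", component):
            sign = -1 if term.startswith("-") else 1
            body = term.lstrip("+-")
            if body in ("X", "Y", "Z"):
                row["XYZ".index(body)] += sign
            else:
                shift += sign * Fraction(body)
        rotation.append(row)
        translation.append(shift)
    return rotation, translation


def apply_operator(rotation, translation, xyz):
    """Apply the operator to a fractional coordinate (x, y, z)."""
    return tuple(
        sum(r * c for r, c in zip(row, xyz)) + float(t)
        for row, t in zip(rotation, translation)
    )
```

For instance, parse_operator("1/2-X,-Y,1/2+Z") yields a rotation with diagonal (-1, -1, 1) and a translation of (1/2, 0, 1/2).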
The following sample serves as a complete template for a parameter
file, for space group P2(1)2(1)2(1)
LOGFILE=seb.rlog
LATTICE=P
45.331 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
Once a suitable parameter file is created, the phasing process can
begin.
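
Before starting, it can be helpful to check a new parameter file against the rules listed above. The following is a minimal sketch of such a check, not a PHASES program: the function name read_pamfile and the returned dictionary keys are hypothetical, and ignoring blank lines is an assumption made here for convenience.

```python
def read_pamfile(lines):
    """Parse the records of a standard parameter file into a dict.

    Hypothetical checker written for illustration; blank lines are skipped
    (an assumption, the PHASES programs may not tolerate them).
    """
    rows = [line.strip() for line in lines if line.strip()]

    def keyword(row, name):
        # The keyword may be upper or lower case, with no spaces around "=".
        key, sep, value = row.partition("=")
        if (sep != "=" or key.upper() != name or not value
                or key != key.rstrip() or value != value.lstrip()):
            raise ValueError("expected %s=value with no spaces around '='" % name)
        return value

    logfile = keyword(rows[0], "LOGFILE")
    lattice = keyword(rows[1], "LATTICE")
    if lattice not in ("P", "A", "B", "C", "I", "F", "R"):
        raise ValueError("lattice symbol must be upper case P,A,B,C,I,F or R")

    # Cell constants are free format: blanks or commas separate the entries.
    cell = [float(v) for v in rows[2].replace(",", " ").split()]
    if len(cell) != 6:
        raise ValueError("expected A, B, C, ALPHA, BETA, GAMMA")

    nsym = int(rows[3].replace(",", " ").split()[0])
    operators = [row.replace(" ", "").upper() for row in rows[4:4 + nsym]]
    if len(operators) != nsym:
        raise ValueError("fewer symmetry operators than NSYM")
    if operators[0] != "X,Y,Z":
        raise ValueError("the first operator should always be X,Y,Z")

    return {
        "logfile": None if logfile.upper() == "NULL" else logfile,
        "lattice": lattice,
        "cell": cell,
        "operators": operators,
    }
```

Called on the lines of the P2(1)2(1)2(1) template above, this returns the lattice type P, six cell constants and four operators.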
One starts phasing by preparing one or more "scaled" or "merged"
files containing x-ray diffraction data. The files will vary
depending on whether isomorphous replacement or anomalous scattering
data is to be used for phasing. Each file should be ASCII (read in
free format) with all records containing the same type of information.
Each record should contain
H, K, L, FP, Sig(FP), FPH, Sig(FPH) for isomorphous replacement
or
H, K, L, F+, Sig(F+), F-, Sig(F-) for native anomalous scattering
or
H, K, L, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
for derivative anomalous scattering
where
H, K, L = Miller indices (integers).
FP, FPH = Native and Derivative structure factor amplitudes
F+, F- = Structure factor amplitudes for reflection. F+
corresponds to indices H, K, L, F- to -H,-K,-L.
FPH+, FPH- = Derivative structure factor amplitudes. FPH+ corresponds
to indices H, K, L, FPH- to -H, -K, -L.
Sig(X) = Estimated standard deviation for quantity X.
A separate file should be prepared for each derivative/anomalous
scattering data set. For isomorphous replacement and derivative
anomalous scattering data the FPH values should have already been
properly scaled to the FP values. If more than one data set is to be
used for phasing, then ALL F VALUES SHOULD BE ON THE SAME SCALE.
Indeed, for MIR phasing it is best to keep corresponding FP values
IDENTICAL in each data set. The "scaled/merged" files are usually
prepared by the programs CMBISO or CMBANO and are generally given
filenames ending in ".scl", but they can also be generated externally
by the user. It is always desirable, however, to use the ".scl"
ending as some of the programs in PHASES will deduce the file format
from the ending of the filename. Once these files are prepared, they
can be used to create difference Pattersons to identify heavy atom
sites. A control file containing heavy atom parameters (either for
the derivative, or anomalous scatterers) must then be prepared, and
GREF or PHASIT can be run. If PHASIT is simply used to compute
structure factors from a model, then the ".scl" files are not needed,
but reflection and coordinate files must still be supplied. One
can use the phase file output from PHASIT directly to compute an
electron density map with FSFOUR, or the file can be used with
programs BNDRY, FSFOUR and MAPINV to carry out solvent levelling,
negative density truncation, phase extension and phase combination
analogous to Wang's ISIR procedure. If the latter is selected, then
the file output from PHASIT should be named "phasit.31".
In general, programs CMBISO and/or CMBANO are used to prepare all
reflection data files. Then TOPDEL is run to reject outliers and
select data for difference Patterson calculations to be performed by
FSFOUR. The Patterson map can be interactively contoured and examined
in MAPVIEW, searched for peaks in PSRCH, or contoured to generate
hard copies as PostScript files with CTOUR and MKPOST. Once heavy atom
locations are identified, they can be refined by GREF or PHASIT. The
heavy atom parameters and data are then used in PHASIT to compute SIR,
MIR phases etc. Next MISSNG is run followed by the solvent
flattening/negative density truncation/phase extension iterations
carried out by BNDRY and invoked by the procedure DOALL. If more than
one derivative is needed, or one wants to search for additional heavy
atom sites, programs MRGDF and/or MRGBDF can then be used to create
difference or cross difference coefficients, FSFOUR computes the map,
and MAPVIEW, PSRCH or CTOUR are used to identify peaks again. One can
also use the difference coefficients files produced by PHASIT to
compute "double difference" type maps to search for minor sites. The
new heavy atom parameters are then included in PHASIT, and the process
is repeated. This procedure can be cycled over as many derivatives
or data sets as needed. As a final step, it is often useful to hold
the "solvent flattened" phases fixed in PHASIT, and refine the heavy
atom parameters again. This final set of heavy atom parameters is then
used to compute final MIR etc phases in PHASIT, which are then used to
start a final round of solvent flattening. The final map resulting
from these phases can be interactively contoured and examined in
MAPVIEW, converted to graphics map format (e.g. for TOM, O or CHAIN)
and skeletonized by GMAP, or hard copies can be prepared by CTOUR and
MKPOST.
1.01 ACCESSING ON LINE DOCUMENTATION
The complete PHASES manual (what you are reading now) is maintained
online in the file PHASES.WUP. This file generally resides in the top
level of the PHASES directory, which initially is a subdirectory under
"export" (on UNIX systems), but its location may vary depending on how
one installs the software. On OpenVMS systems it can be accessed by
referring to PHASES_DOC:PHASES.WUP (if one installs the software as
described later). It is recommended that each user make a copy of the
manual in his own working directory so it can be examined without
fear of destroying the original. The manual is a simple ASCII text
file and can be examined in the editor of your choice. All program
write-ups begin with the program name followed by a single space and
then by the word "WRITE-UP" (all in uppercase), so that, for example,
to get to the write-up for program FSFOUR one can simply enter an
editor and search for "FSFOUR WRITE-UP" or just "FSFOUR W". This will
position the editor at the appropriate place in the manual. Just be
sure to exit the editor without making any changes. Indeed, it may be
desirable to set the file protection so that it can be read but not
written.
1.02 TEMPLATE SCRIPTS AND FILES
Included with the PHASES distribution are a series of sample control
files (*.sh or *.com files) as well as sample input data (*.d or *.dat
files). As initially distributed, these files reside in the top level
of the PHASES directory (itself a subdirectory under "export" on UNIX
systems, or in PHASES_TEMPL, if installed as suggested in the "VMS
USER INFORMATION" section). The "*.sh" files are UNIX shell scripts
to invoke one or more programs, while the "*.com" files accomplish the
same tasks under OpenVMS. Similarly, the "*.d" and "*.dat" files are
sample data inputs for programs under the UNIX and OpenVMS operating
systems, respectively. Generally the "*.d" and "*.dat" files are
identical. It is suggested that each user copy these files to his
working directory to serve as templates for new applications. This
will minimize the possibility of typing errors, and also serve as an
example for a particular calculation. Indeed, it may be desirable to
open two windows, one editing the template file and the other
positioned to examine the appropriate write-up as described in the
preceding section.
1.03 FLOW CHARTS
               Native file           Derivative file
                    .                       .
                    .                       .
                    v                       v
              ************************************
              *         CMBISO or CMBANO         *
              ************************************
                                .
                                .
                                .
                      "Scaled/Merged" file
                                .
                                .
                                v
                        *****************
                        *    TOPDEL     *
                        *****************
                                .
                                .
                           "Patt" file
                                .
                                .
                                v
                        *****************
                        *    FSFOUR     *
                        *****************
                                .
                                .
                           "Map" file
                                .
                                .
                                v
       ....................................................
       .                        .                         .
       .                        .                         .
       v                        v                         v
****************        ******************          *************
*    PSRCH     *        *    MAPVIEW     *          *   CTOUR   *
****************        ******************          *************

Path for initial processing of a derivative data set, includes
merging and scaling native and derivative data, rejecting outliers,
computing difference Patterson maps and examination.

  "Scaled/Merged" file(s)
             .
             .
             v
      ***************                        "Phased" file,
      *   PHASIT    *                       from BNDRY after
      ***************                      solvent flattening
        .         .                                 .
        .         .                                 .
        .   "Phased" file                           .
        .         .                                 .
        .         .                                 .
        .         v                                 v
        .         ...................................
        .                          .
        .                          .
        .                          v
        .                *********************
        .                *  MRGDF or MRGBDF  *X-- "Scaled/Merged
        .                *********************           file"
"difference file"                  .
        .                          .
        .                 "Cross phase" file
        .                          .
        .                          .
        .                          v
        .                *********************
        ................X*       FSFOUR      *
                         *********************
                                   .
                                   .
                              "Map" file
                                   .
                                   .
                                   v
          ....................................................
          .                        .                         .
          .                        .                         .
          v                        v                         v
   ****************        ******************          *************
   *    PSRCH     *        *    MAPVIEW     *          *   CTOUR   *
   ****************        ******************          *************

Paths for generating and examining "cross difference" Fourier,
"cross Bijvoet difference" Fourier, "double difference" Fourier
or "double Bijvoet difference" Fourier, started either by
generating SIR, MIR etc phases, or using "solvent flattened"
phases.
                                               Native file
                                                    .
                                                    .
                                                    v
          *****************   "Phased"       ***************
  --------*    PHASIT     * ----------------X*   MISSNG    *
  .       *****************  file .          ***************
  .                               .                 .
  .                               .                 .
  .                               .         "Extension" file
  .                               .                 .
  .                               .                 .
Partial                           .                 .
structure *****************       .                 .
file      *    FSFOUR     *X------.                 .
  .       *****************       .                 .
  .           .       ^           .                 .
  .           .       .           .                 .
  .           .       .           .                 .
  .           v       .           v                 .
  .       *******************************           .
  -------X*            BNDRY            *X-----------
          *******************************
                     .       ^
                     .       .
                     .       .
                     v       .
                 *****************
                 *    MAPINV     *
                 *****************

Path for solvent flattening process, as implemented in the "doall"
procedure. Starts with SIR, MIR etc phases and includes mask
generation, solvent flattening and phase combination iterations. The
leftmost and rightmost branches are optional, for inclusion of partial
structure information and phase extension, respectively. The FSFOUR-
BNDRY-MAPINV loop performs the iterations. The PHASIT output
is fed directly to FSFOUR only during the initial pass, to generate
the first map. In all passes it is fed to BNDRY to serve as the
"anchor" phases in the phase combination step.
                          ************             *************
                          *  FSFOUR  *------------X*  EXTRMAP  *
                          ************             *************
                              ^   ^                      .
                              .   .                      .
                              .   .                      v
            **************    .   .                *************
            *   PHASIT   *-----   .                *  MAPAVG   *X-----
            **************        .                *************     .
                    .             .                      .           .
                    .             .                      .      "Envelope"
                    .             .                      .         Mask
                    v             .                      v           .
                 ********************              *************     .
"Extension"-----X*      BNDRY       *X-------------*  BLDCEL   *X-----
   file          ********************              *************
                        ^    .
                        .    .
                        .    v
                     ************
                     *  MAPINV  *
                     ************

Path followed during solvent flattening iterations modified to
include noncrystallographic symmetry averaging. The PHASIT output
is fed directly to FSFOUR only during the initial pass, to generate
the first map. In all passes it is fed to BNDRY to serve as the
"anchor" phases in the phase combination step. The "extension" file
is optional, and is used for phase extension only.
1.04 FILE FORMATS
Most of the programs in the PHASES package utilize the same internal
file formats, chosen for combinations of simplicity and efficiency.
The major files used are now described.
1) "Input" files. Entering data initially into the package assumes one
can prepare reflection files either in free format, as XENGEN-like
"MULISTS", or as "SCALEPACK" style files. Thus input structure factor
files can have any of the following record formats.
FREE FORMAT i.e.
h, k, l, F, sig(F) (ASCII, read in free format)
The free format input file is generally assumed in the programs if
the filename ends in ".DAT" or ".dat", and sometimes will be assumed
if no other file type is deduced from the ending of the filename. The
"free format" implies that the values in each record are separated by
at least one blank space or a comma.
or
XENGEN like "MULIST" i.e.
h, k, l, res, F, sig(F), F+, sig(F+), F-, sig(F-), iflag
in format ( 3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2 )
The "iflag" status flag is optional. If present, it will be
used to screen for viable anomalous scattering data. If absent,
only values with F+ and F- greater than zero will be used when
anomalous data is needed. MULIST format is generally assumed
within the programs if the filename ends in ".MU" or ".mu".
or
SCALEPACK style files i.e. the file starts with a variable
number of header records, the total number of which is given by
one plus 2 times the number given in the first header record
(format I5). After the header records (usually 3) the data
follows as individual records containing
h, k, l, I+, sig(I+), I-, sig(I-)
in format ( 3I4, 4F8.0 )
Note that unlike the other formats, for SCALEPACK files
INTENSITIES and their standard deviations are given instead
of AMPLITUDES. Also, the files need not contain Bijvoet pair
data as the last two items may be missing, as would be the
case if the data were reduced treating Friedel mates as
equivalent. SCALEPACK format is generally assumed within the
programs if the filename ends in ".SCA" or ".sca".
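
The header rule and the fixed-width record layout just described can be sketched as follows. This is an illustrative reader only, not the code used inside PHASES; the function names are hypothetical, and returning None for a missing I-, sig(I-) pair is a choice made for this sketch.

```python
def scalepack_header_count(first_line):
    """Total number of header records: one plus 2 times the I5 value."""
    return 1 + 2 * int(first_line[:5])


def parse_scalepack_record(line):
    """Read one data record laid out as ( 3I4, 4F8.0 ).

    Returns ((h, k, l), [I+, sig(I+), I-, sig(I-)]); the last two entries
    are None when the Bijvoet mate is absent from the record.
    """
    line = line.rstrip("\n")
    hkl = tuple(int(line[start:start + 4]) for start in (0, 4, 8))
    values = []
    for start in range(12, 44, 8):
        field = line[start:start + 8].strip()
        values.append(float(field) if field else None)
    return hkl, values
```

With the usual value of 1 in the first header record, scalepack_header_count gives 3, matching the typical three header records mentioned above.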
2) "Scaled" (and merged) structure factor files.
These files are produced by CMBISO or CMBANO, starting with input
files of type (1). The files are ASCII, with each record containing
h, k, l, FP, sig(FP), FPH, sig(FPH)
or
h, k, l, FP+, sig(FP+), FP-, sig(FP-)
or
h, k, l, FP, sig(FP), FPH+, sig(FPH+), FPH-, sig(FPH-)
in format ( 3I4, 6F10.2)
for either isomorphous, native anomalous or derivative anomalous
data sets, respectively. SCALED files are generally assumed
within the programs if the filename ends in ".SCL" or ".scl".
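
For users who generate ".scl" files externally, the sketch below writes and reads one record in the ( 3I4, 6F10.2 ) layout given above. It is a hedged illustration rather than PHASES code: the function names are hypothetical, and the reader slices fixed-width fields instead of splitting on blanks because adjacent FORTRAN fields may touch.

```python
def format_scl_record(h, k, l, values):
    """Write h, k, l and 4 or 6 amplitude/sigma values as ( 3I4, 6F10.2 )."""
    if len(values) not in (4, 6):
        raise ValueError("expected 4 or 6 values after the indices")
    indices = "".join("%4d" % index for index in (h, k, l))
    return indices + "".join("%10.2f" % value for value in values)


def parse_scl_record(line):
    """Read a record written in ( 3I4, 6F10.2 ) back into numbers."""
    line = line.rstrip("\n")
    hkl = tuple(int(line[start:start + 4]) for start in (0, 4, 8))
    body = line[12:]
    values = [
        float(body[start:start + 10])
        for start in range(0, len(body), 10)
        if body[start:start + 10].strip()
    ]
    return hkl, values
```

An isomorphous record carries four values (FP, sig(FP), FPH, sig(FPH)), while a derivative anomalous record carries six.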
3) "Phased" structure factor files. These files are produced mainly
by PHASIT and BNDRY, but can also be generated by other programs.
There are two types of "phased" files, depending on whether or not
probability distributions are available. Both types of files are
BINARY, but can be converted to ASCII by the utility program RD31.
The first type, the normal or "long" format has records containing
h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM
where h, k, l, A_B, C_D and MK are INTEGERS, and the others REALS.
The Hendrickson-Lattman probability distribution coefficients are
packed two per word, in the A_B and C_D entries according to
A_B = ( IFIX(A*100) + 16384 )*32768 + IFIX(B*100) + 16384
C_D = ( IFIX(C*100) + 16384 )*32768 + IFIX(D*100) + 16384
FO is the observed protein structure factor amplitude, PHIbest
the "best" (centroid) phase in degrees, and FOM the associated figure
of merit. MK is the restricted phase indicator, such that if MK=1 there
are no restrictions. If MK > 1, then the reflection is centric, with
one of the allowed phases given by 15*(MK-1), and the other 180
degrees away from it.
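
The packing formula and the MK convention above can be worked through with a short sketch. The function names are hypothetical, int() is used to mimic the truncation performed by IFIX, and the range check is an assumption added here so that each half fits in 15 bits.

```python
def pack_pair(first, second):
    """Pack two coefficients as ( IFIX(x*100) + 16384 )*32768 + IFIX(y*100) + 16384."""
    high = int(first * 100) + 16384   # int() truncates toward zero, like IFIX
    low = int(second * 100) + 16384
    if not (0 <= high < 32768 and 0 <= low < 32768):
        # Assumed limit: values outside roughly +/-163.8 cannot be packed.
        raise ValueError("coefficient out of range for packing")
    return high * 32768 + low


def unpack_pair(packed):
    """Recover the two coefficients (to 0.01) from a packed word."""
    high, low = divmod(packed, 32768)
    return (high - 16384) / 100.0, (low - 16384) / 100.0


def centric_phases(mk):
    """Return None if MK=1 (unrestricted), else the two allowed phases in degrees."""
    if mk < 1:
        raise ValueError("MK must be 1 or greater")
    if mk == 1:
        return None
    first = 15 * (mk - 1)
    return first % 360, (first + 180) % 360
```

For example, MK=7 corresponds to allowed phases of 90 and 270 degrees, and packing A=1.5, B=-2.25 followed by unpacking returns the same two values.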
There is also an alternate version of the "long format" phase file,
obtained only from running option 3 of BNDRY with IOTYP=1, which
has FO and FC replacing FOM*FO and FO in the records. This file type
is used ONLY if one wants to do solvent flattening and/or NC symmetry
averaging iterations on DIFFERENCE or 2FO-FC MAPS. Its usage is
explained elsewhere in the documentation.
The second type, or "short" format has records containing
h, k, l, FO, FC, PHI
where FC is the "calculated" structure factor amplitude, as typically
computed from input coordinates for a model in PHASIT, GREF, or output
from the map inversion program MAPINV.
Note that the Fourier program FSFOUR only reads the first six entries
in a record, so that in general, EITHER type of "phased" file can be
used for map calculations. However, some map types might be accessible
with only one of the formats (e.g. difference maps). Both long and
short format PHASED files are generally recognized within the programs
if the filename ends in ".31".
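
Since several programs deduce the file type from the filename ending, the endings quoted in this section can be summarized in a small lookup. This is only a sketch of the convention as documented here, with hypothetical names; it treats any mixture of upper and lower case as equivalent, which is an assumption beyond the all-upper and all-lower forms listed above.

```python
import os

# Endings and file types as listed in this section of the manual.
FORMAT_BY_ENDING = {
    ".dat": "free format",
    ".mu": "MULIST",
    ".sca": "SCALEPACK",
    ".scl": "scaled/merged",
    ".31": "phased",
}


def guess_format(filename):
    """Return the file type implied by the filename ending, or None."""
    ending = os.path.splitext(filename)[1].lower()
    return FORMAT_BY_ENDING.get(ending)
```

For example, guess_format("monopt.scl") gives "scaled/merged" and guess_format("phasit.31") gives "phased".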
4) "Mask" files. These files are binary, with the same record format
applying both to "solvent masks" and "averaging masks." The file
starts with a header record containing
A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with the first 6 values REAL*4, the next 9 INTEGER*4, the lengths in
Angstroms and the angles in degrees.
NX, NY, NZ   = Number of grid points defining one "cell length" along
               the respective axis. Implicitly defines grid spacing as
               del x = A/NX, del y = B/NY and del z = C/NZ.
IXMN, IXMX,
IYMN, IYMX,
IZMN, IZMX   = Minimum and maximum grid indices defining the map region,
               such that x (fractional) = IX * (del x) / A etc.
               There are no restrictions on magnitudes or signs.
The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 BYTE values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(MSK(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
Note that the mask entries are FORTRAN type BYTE (INTEGER*1).
For solvent masks, the entries will either be 0 (protein) or 2
(solvent). For averaging masks only the values 0, 10, 20, 30, 40
etc are meaningful as they indicate the grid point is inside the
primary envelope for molecules 1, 2, etc. The masks can be displayed
with program MAPVIEW, and program RDHEAD can be used to list the
header record.
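
The bookkeeping implied by the header values can be illustrated with a few lines of arithmetic. The sketch below does not read the binary file itself (the on-disk record framing of FORTRAN unformatted files is compiler dependent and is not covered here); the function names are hypothetical.

```python
def mask_layout(ixmn, iymn, izmn, ixmx, iymx, izmx):
    """Return (number of records, BYTE values per record) for a mask."""
    records = (iymx - iymn + 1) * (izmx - izmn + 1)
    values_per_record = ixmx - ixmn + 1
    return records, values_per_record


def record_index(iy, iz, iymn, izmn, izmx):
    """Zero-based position of the row (IY, IZ); Y varies slowest, then Z."""
    return (iy - iymn) * (izmx - izmn + 1) + (iz - izmn)


def grid_to_fractional(index, n_grid):
    """Fractional coordinate of a grid index.

    x = IX * (del x) / A with del x = A/NX reduces to IX/NX.
    """
    return index / n_grid
```

For a mask running from grid index 0 to 9 along X, 0 to 4 along Y and 0 to 2 along Z, this gives 15 records of 10 values each.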
5) "FSFOUR" maps. These maps are produced by FSFOUR (and BLDCEL).
They are binary, and contain a variable number of header records
followed by the map. The map ALWAYS covers one full cell. See FSFOUR
write-up (and possibly examine the program source) for further
details.
6) "Submaps" Also referred to as "averaging" maps. These map files
are binary, with the same header and record structure as "mask" files,
except that the density values are written as FORTRAN type REAL instead
of mask values. They are usually prepared by MAPVIEW or EXTRMAP, but
can be generated by MAPORTH, SKEW, TRNMSK etc. Note that RDHEAD can
also be used to list the header record.
7) "Extension" file. Used for phase extension, and created by program
MISSNG. This file is ASCII, and contains a list of reflection indices,
Fobs and phase probability distribution coefficients, for reflections
absent on the main "phased" file, but for which native amplitudes and
possibly phase probability distribution coefficients are available.
It is used only for phase extension. The records simply contain
h, k, l, Fobs, A_B, C_D
in format ( 3I4, F10.2, 2I12 ) where the distribution coefficients
are packed as in a normal phased file. If no distribution coefficients
are available the A_B and C_D values are zero.
3.00 EXAMPLES
This section contains samples of input for various programs and
procedures. In general, template files containing these examples
are also provided along with the programs on the distribution media.
In some cases (for example, solvent levelling), some practical
considerations are also discussed.
3.01 ***** SAMPLE INPUT PARAMETER FILE *****
LOGFILE=seb.log
LATTICE=P
45.33 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
3.02 ***** SAMPLE INPUT DECKS FOR PHASIT *****
EXAMPLE I
The deck below will compute SIR phases from a single isomorphous
replacement derivative data set. The resulting phase file can
then be used in the procedure DOALL to carry out Wang's ISIR
process, or in MRGDF or MRGBDF to solve new derivatives or look
for additional sites. The "difference coefficients" file can
be used to compute "observed" and "calculated" difference
Pattersons, heavy atom difference maps or heavy atom "double
difference" maps to find new sites.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
1 0 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
EXAMPLE II
The deck below can be used to compute SIRAS phases from isomorphous and
anomalous scattering data from a single derivative. The resulting file
can then be used directly for map computation; used in the procedure
DOALL to carry out solvent flattening/negative density truncation,
phase extension etc, starting with (and tying to) the SIRAS phases; or
in MRGDF or MRGBDF to solve new derivatives or look for additional
sites. The "difference coefficients" files can be used to compute
"observed" and "calculated" difference Pattersons, heavy atom
difference maps or heavy atom "double difference" maps to find new
sites.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
2 0 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
DIAMINO DICHLORO PT (DERIVATIVE ANOMALOUS DISPERSION DATA )
monoptano.scl
pt_ano_diff.31
4. 6. 2 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
EXAMPLE III
The deck below assumes isomorphous replacement data is available for
two derivatives, and 5 passes of phase refinement, each consisting
of 3 cycles for each derivative will be done to refine nearly all
possible derivative parameters (except B's), i.e. MIR phases will be
computed and refined. The resulting file can then be used directly for
map computation; used in the procedure DOALL to carry out solvent
flattening/negative density truncation, phase extension etc, starting
with (and tying to) the MIR phases; or in MRGDF or MRGBDF to solve new
derivatives or look for additional sites. The "difference
coefficients" file can be used to compute "observed" and "calculated"
difference Pattersons, heavy atom difference maps or heavy atom
"double difference" maps to find new sites.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
2 1 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
HGCL2 ( ISOMORPHOUS REPLACEMENT DATA )
monohg.scl
hg_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
2
HG1 0.3639 0.2218 0.1776 20. 1.000 7
HG2 0.4454 0.0939 0.2878 20. 0.800 7
5 0.2 6 2 1 0 1
1 SET 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 0
1 1 1 1 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 0
1 1 1 1 0
1 1 1
2 SET 2
0 0 0 0 0
0 0 0 0 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
EXAMPLE IV
Similar to example III, except that one of the temperature factors is
converted to anisotropic and is also refined, with the isotropic
equivalent restrained to its original value.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
2 1 1
phasit.311
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 -20. 0.664 6
0. 0. 0. 0. 0. 0. 20.
0.5
PT3 0.5474 0.0523 0.6964 50. 0.450 6
HGCL2 ( ISOMORPHOUS REPLACEMENT DATA )
monohg.scl
hg_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
2
HG1 0.3639 0.2218 0.1776 20. 1.000 7
HG2 0.4454 0.0939 0.2878 20. 0.800 7
5 0.2 6 2 1 0 1
1 SET 1
0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 1 1 1 1 1 1
1 1 1 1 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 1 1 1 1 1 1
1 1 1 1 0
1 1 1
2 SET 2
0 0 0 0 0
0 0 0 0 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
For all examples, PHASIT can be run with the following control
information.
For UNIX, use the following in a shell script
phasit < phasit.d > phasit.l
3.03 ***** SAMPLE INPUTS FOR PHASING BY SOLVENT LEVELING *****
A complete solvent flattening run can be executed by creating a few
small data files, and running the procedure DOALL. This will carry out
the complete sequence of protein-solvent boundary determination,
solvent flattening, and phase combination steps, in a manner
equivalent to that suggested by Wang in his ISIR process, although the
initial phases can be SIR, SAS, MIR, MIRAS or any combination
generated by PHASIT. It will generate an initial solvent mask, use it
for 4 cycles of solvent flattening/ phase combination, create a new
mask, use it for 4 cycles, create a third mask, use it for 8 cycles,
and, if desired, do additional phase extension cycles, and then
possibly phase AND AMPLITUDE extension cycles.
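The cumulative cycle counts explain the names of the cycling procedures
(cycle4, cycle8, cycle16) and of the phase files listed under the expected
output files. A minimal sketch in Python, used here only for illustration
since the package itself is driven by command procedures, assuming the
schedule of 4, 4 and 8 cycles described above:

```python
from itertools import accumulate

# Cycles run with the first, second and third solvent mask, as described above.
cycles_per_mask = [4, 4, 8]

# Running totals after each block of cycles.
totals = list(accumulate(cycles_per_mask))

# The running total, not the per-mask count, appears in each file name.
phase_files = [f"phi{total}cy.31" for total in totals]

for mask_number, (cycles, name) in enumerate(zip(cycles_per_mask, phase_files), start=1):
    print(f"mask {mask_number}: {cycles} cycles -> {name}")
```

Thus cycle8 runs 4 cycles, not 8, and cycle16 runs 8 cycles, not 16.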
A series of files, all given .d extensions (UNIX) or .dat
extensions (VMS) should be created containing control information for
the forward and inverse Fourier transforms, for each option of the
BNDRY program and for RMHEAVY. In general, these are the only files
which will have to be changed for a new application, PROVIDED THE FILE
NAME CONVENTION IN THE CONTROL FILES IS ADHERED TO. The output from
PHASIT should be called phasit.31 and if phase extension is to be done,
then the output from MISSNG should be called extrfl.d and the file
sloext.d should also be prepared. If phase extension is not desired,
then one does not have to run MISSNG and create sloext.d, but
the line invoking the "extnd" procedure (@EXTND.COM in DOALL.COM for
VMS systems or sh extnd.sh in doall.sh for UNIX systems) should be
commented out. The individual program writeups should be consulted
for the meaning of the parameters. It is important that the grid
spacing selected in the input to FSFOUR be appropriate for the highest
resolution data to be used anywhere in the process, including phase
extended reflections. A grid spacing of about 1/3 of the smallest d
spacing is recommended. It is also VERY important that the index range
requested in the inputs to MAPINV cover at least a complete asymmetric
unit out to the maximum resolution to be used anywhere in the process,
including phase extended reflections. The particular asymmetric unit
covered need not be identical to that originally input implicitly to
PHASIT via the reflection files, but all reflections in the input
files should at least have symmetry related counterparts in the MAPINV
asymmetric unit. Since index limits in MAPINV are restricted to
minimum and maximum values along each reciprocal axis, in high
symmetry systems it may be necessary to cover more than an asymmetric
unit (this causes no problem).
Note also that MAPINV can compute structure factors only in the
hemisphere with L non-negative, thus one MUST request an asymmetric
unit in this hemisphere. This also creates no problem SINCE ANY
REFLECTION CAN ALWAYS BE RELATED TO ONE IN THIS HEMISPHERE BY
application of the Friedel symmetry operator, and this is
automatically done in the programs. THUS WHEN IN DOUBT, ONE CAN
ALWAYS SPECIFY A FULL HEMISPHERE, I.E. A RANGE OF -HMAX,HMAX,
-KMAX,KMAX AND 0,LMAX WHICH WILL WORK, but may not be the most
efficient way of doing things. For this reason one will NEVER have
to reindex the input data, as an appropriate range in MAPINV can
ALWAYS BE GIVEN!
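The grid spacing and index range rules above can be sketched as a small
calculation. This is an illustration only: the function names and the cell
edges are hypothetical, and any restrictions FSFOUR places on allowed
periods are not considered (see the FSFOUR writeup). The index limit
uses the fact that |h| can not exceed a/d for any reflection of spacing d,
whatever the cell angles.

```python
import math


def grid_and_index_limits(cell_edges, d_min, points_per_d=3.0):
    """Return (periods, index_max) for cell edges and d_min in angstroms."""
    periods = []
    index_max = []
    for edge in cell_edges:
        # Grid spacing edge/period should be no coarser than d_min / points_per_d.
        periods.append(math.ceil(points_per_d * edge / d_min))
        # |h| <= edge / d_min for every reflection with d >= d_min.
        index_max.append(math.floor(edge / d_min))
    return periods, index_max


def full_hemisphere_range(index_max):
    """Index limits -HMAX,HMAX, -KMAX,KMAX, 0,LMAX (L non-negative)."""
    h, k, l = index_max
    return (-h, h, -k, k, 0, l)


# Hypothetical cell edges (not taken from this manual) and a 3.5 angstrom limit.
periods, index_max = grid_and_index_limits((56.0, 84.0, 94.5), 3.5)
print(periods, index_max, full_hemisphere_range(index_max))
```

The value of d_min passed in should be the highest resolution used anywhere
in the process, including phase extended reflections. The hemisphere range
is the "when in doubt" choice; a smaller asymmetric unit will be more
efficient.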
Example inputs are now given. If the supplied doall and related
scripts are to be used without modification, then the filenames in
these samples should NOT be changed (except for the parameter file, of
course). One need change only the parameter file, solvent content and
resolution related parameters, the map periods and index range, and
the heavy atom coordinate file.
---- file fft.d (input to FSFOUR, for map calculation)----------------
seb.pam
COMPUTE ELECTRON DENSITY MAP
0 48 72 80 1 0 20 0 0 0 0.
four.ref
four.map
--- file minv1.d (input to MAPINV, for solvent boundary determination)
seb.pam
INVERT ELECTRON DENSITY MAP AFTER TRUNCATING NEGATIVES
four.map
minv.ref
0 0 0 16 0 24 27
0. 0. 1 0
--- file minv2.d (input to MAPINV, for normal map inversion) -----
seb.pam
INVERT ELECTRON DENSITY MAP AFTER SOLVENT FLATTENING
mod.map
minv.ref
0 0 0 16 0 24 27
0. 0. 0 0
--- file rmhv.d (input to RMHEAVY, for removal of heavy atoms ) ----
seb.pam
four.map
nohv.map
2 2.5
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
--- file bnd0.d (input to BNDRY, option 0, prepare SF for protein-
solvent boundary determination )------------------
seb.pam
0
9.
minv.ref
four.ref
--- file bnd1.d (input to BNDRY, option 1, create solvent mask)----
seb.pam
1
four.map
mask.map
.4
--- file bnd2.d (input to BNDRY, option 2, do solvent flattening and
negative density truncation) ----------------
seb.pam
2
four.map
mask.map
mod.map
.086
--- file bnd3.d (input to BNDRY, option 3, combine new phases with
original) --------
seb.pam
3
0 0. 1. 0 0
phasit.31
minv.ref
newphi.ref
--- file extnd.d (input to BNDRY, combine new phases with original,
including phase extension ) ------------
seb.pam
3
1 3.5 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref
--- file extnda.d (input to BNDRY, combine new phases with original,
including phase AND AMPLITUDE extension) --------
seb.pam
3
2 3.5 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref
--- file sloext.d (controls range and rate of phase extension) --
seb.pam
4. 3.5 8
extnd.d
--- file sloext2.d (controls range and rate of phase AND AMPLITUDE
extension) --------
seb.pam
4. 3.5 8
extnda.d
Once the input is prepared, the phasing process can be carried out
either by running a series of command procedures as individual steps,
or by running a single command procedure which invokes all others.
The single procedure, called doall.sh or doall.com, follows. In the
procedures that follow, it is assumed that phase extension will be
carried out, and that the additional files "extrfl.d" (prepared by
MISSNG) and "sloext.d" (see SLOEXT write-up) are available.
3.04 For UNIX, use the following commands in a shell script,
called doall.sh
# COMMAND PROCEDURE TO CARRY OUT THE ENTIRE CYCLING PROCESS FOR PHASING
# DATA BY SOLVENT LEVELLING
#
# COMPUTE THE FIRST SOLVENT MASK
sh mask1.sh
#
# COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING FIRST MASK)
sh cycle4.sh
#
# COMPUTE THE SECOND SOLVENT MASK
sh mask2.sh
#
# COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING SECOND MASK)
sh cycle8.sh
#
# COMPUTE THE THIRD SOLVENT MASK
sh mask3.sh
#
# COMPUTE 8 CYCLES OF SOLVENT LEVELLING (USING THE THIRD MASK)
sh cycle16.sh
#
# DO ADDITIONAL CYCLES OF SLOW PHASE EXTENSION (TO REFLECTIONS WITH
# NATIVE AMPLITUDES SUPPLIED), EITHER TO HIGHER RESOLUTION OR TO
# INITIAL RESOLUTION
sh extnd.sh
#
# IF DESIRED, DO ADDITIONAL CYCLES OF PHASE EXTENSION (INCLUDING
# DATA FOR WHICH THERE IS NO AMPLITUDE INFORMATION). THIS OPTION IS
# NOT ALWAYS DESIRABLE, THUS IT IS COMMENTED OUT. TO INVOKE IT, SIMPLY
# REMOVE THE # FROM THE FOLLOWING LINE
#sh extnda.sh
#
# THATS ALL
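The same sequence can be driven from Python when one wants the run to stop
at the first step that fails. The sketch below is not part of the PHASES
distribution: run_script and run_steps are hypothetical names, and it is
ASSUMED that each step script returns a nonzero exit status on failure,
which should be checked for the local installation.

```python
import subprocess

# Step scripts in the order used by doall.sh (extnda.sh is optional there).
STEPS = ["mask1.sh", "cycle4.sh", "mask2.sh", "cycle8.sh",
         "mask3.sh", "cycle16.sh", "extnd.sh"]


def run_script(script):
    """Run one step with sh and return its exit status."""
    return subprocess.run(["sh", script]).returncode


def run_steps(steps, runner=run_script):
    """Run the steps in order; return the first failing step, or None."""
    for script in steps:
        if runner(script) != 0:
            return script
    return None
```

Calling run_steps(STEPS) from the directory holding the scripts and the .d
files runs the complete sequence. If phase extension is not desired,
remove extnd.sh from the list, just as the corresponding line would be
commented out in doall.sh.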
3.05 EXPECTED OUTPUT FILES
Execution of the "doall" procedure will result in the following
files being present. (phasit.31 and phasit.l should be present prior
to running "doall.")
phasit.31 contains original MIR, SIR etc phases from PHASIT
phasit.l contains phasit printed output
mask1.14 contains first solvent mask
mask1.l contains mask1 printed output
phi4cy.31 contains phases after 4 cycles using first mask
cycle4.l contains printed output from first 4 cycles
mask2.14 contains second solvent mask
mask2.l contains mask2 printed output
phi8cy.31 contains phases after 4 cycles using second mask
cycle8.l contains printed output from next 4 cycles
mask3.14 contains third solvent mask
mask3.l contains mask3 printed output
phi16cy.31 contains phases after 8 cycles using third mask
cycle16.l contains printed output from next 8 cycles
phiextnd.31 (if generated) contains phases after 8 cycles using
third mask, plus additional cycles of phase extension
to known amplitudes.
extnd.l (if generated) contains printed output from next 12
cycles
phiextnda.31 (if generated), contains phases after 8 cycles using
third mask, plus additional cycles of phase extension
to known amplitudes, plus additional cycles of phase
and amplitude extension.
extnda.l (if generated), contains printed output from next 12
cycles.
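After a run it can be convenient to confirm that the phase and mask files
listed above were actually written. A minimal sketch, assuming the UNIX
file names given in the list; the function name missing_outputs is
hypothetical:

```python
from pathlib import Path

# Phase and mask files always expected after the "doall" procedure.
REQUIRED = ["phasit.31", "mask1.14", "phi4cy.31", "mask2.14",
            "phi8cy.31", "mask3.14", "phi16cy.31"]

# Written only if the phase extension steps were run.
OPTIONAL = ["phiextnd.31", "phiextnda.31"]


def missing_outputs(directory=".", names=REQUIRED):
    """Return the names from the list that are not present as files."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]
```

An empty list from missing_outputs() means every required file is present;
missing_outputs(names=OPTIONAL) reports on the extension files in the same
way.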
4.00 NATIVE, DIFFERENCE AND "CALCULATED" PATTERSON MAPS
In protein crystallography one is generally interested in difference
Patterson maps to locate heavy atoms, in which the Fourier coefficients
are the squares of the DIFFERENCE in AMPLITUDES between native and
derivative data, or between members of a Bijvoet pair. Sometimes
however, it is useful to compute native Patterson maps, or to compute
"calculated" Patterson maps (generated from intensities computed
explicitly from an input atomic model). The native maps may provide
information about non-crystallographic symmetry, while the "calculated"
maps obtained from a tentative heavy atom structure can be compared
with the observed difference Pattersons to see how well the major
features are being explained. The latter method is particularly
useful in high symmetry systems, where even a small number of heavy
atom sites gives rise to many Patterson peaks. Examining the observed
and calculated Pattersons side by side (perhaps in VIEWPLT) can then
provide confidence in the heavy atom interpretation.
DIFFERENCE PATTERSONS - Difference Pattersons (either isomorphous or
anomalous) can be computed by two different routes in PHASES. The
first approach is to generate a standard "phased" file containing
h,k,l,Fo,Fc,Phi, and use it in FSFOUR with the MAPTYP=5 option.
Generally programs CMBISO or CMBANO do the initial data preparation,
and their output files are then fed to TOPDEL to select the data
according to various criteria, screen for and reject outliers,
and write the appropriate information to the output file for FSFOUR.
The output file will then contain either FPH and FP, or F+ and F- in
the amplitude slots, depending on whether isomorphous or anomalous
data were input. The second approach, which is useful only after
at least one site is found in the derivative, is to use the
"difference coefficient" file output from PHASIT in FSFOUR with
MAPTYP=6. In the isomorphous case if the input site(s) are correct
this should lead to a cleaner map, since the FPH to FP scale factor
has been refined, and also because the angular difference between
the FP and FPH vectors is compensated for. The FO and FC slots in
the file then contain (FPH-FP)obs,corrected and FHcal, respectively.
For anomalous data these slots contain (FPH+ - FPH-)obs and
(FPH+ - FPH-)calc or their counterparts for native anomalous data.
NATIVE PATTERSONS - Native Patterson maps can be generated in several
ways, depending on what information is currently available. In all
cases one must prepare a standard input "phased" file containing
h,k,l,Fo,Fc,Phi, and request the appropriate option in FSFOUR to create
the desired coefficients from the input data. One way to do this is
to run CMBISO inputting the native file twice (as both the native and
derivative data sets), and then run TOPDEL selecting ALL coefficients
to be output (you can still use d and F/sigma cutoffs, but output
100% of the data!). The R factor and all differences will, of course,
be zero, but the output file will contain native amplitudes in both
the Fo and Fc slots, and thus the native Patterson can be generated
by requesting MAPTYP=6 in FSFOUR. Another approach would be to run
PHASIT, SF mode with IHLCF=0 and ISIGA=0, using a single "dummy" atom
arbitrarily positioned as the model. The output file will then give a
bad R factor, but it will contain Fo and Fc in the amplitude slots,
and selecting MAPTYP=6 in FSFOUR will again give the desired native
Patterson. The first method allows one to use d spacing and F/sigma
cutoffs, while the second always uses all of the data. In either case
the native Pattersons can be searched for peaks, contoured, displayed,
printed etc. with PSRCH, MAPVIEW, CTOUR, MKPOST etc.
"CALCULATED PATTERSONS" - Patterson maps corresponding to an input
atomic model are also generated by preparing the normal "phased" file
containing h,k,l,Fo,Fc,Phi, and by selecting the appropriate
coefficient option (MAPTYP=7) in FSFOUR. In this case it is important
that the second amplitude slot truly contains Fc. One way to do this
is to run PHASIT, SF mode with IHLCF=0 and ISIGA=0, and to include
all of the desired atoms in the model. If a heavy atom model is
used as the input, the R factor will be meaningless (since scaling
is to the NATIVE amplitudes rather than differences), but the output
file would still be appropriate for the "calculated" difference
Patterson as the map scale is arbitrary anyway. Another way is to use
GREF to prepare the file, by requesting that an output Fourier file
be written. In that case the file created can contain the proper
DIFFERENCE amplitude in the Fo slot, and the model based Fc in the
Fc slot. Then the SAME file could be used in FSFOUR to create both
the observed difference Patterson (MAPTYP=6) and the modeled version
of it based on the heavy atoms (MAPTYP=7). Once again, the FSFOUR map
can be searched for peaks, contoured etc. as any normal map. Finally,
the "difference coefficients" file written by PHASIT can be used in
FSFOUR with MAPTYP=7 to compute the "calculated" difference Patterson
based on the input heavy atom model. The advantage of doing it this
way is that the model, and hence the FC's, then would reflect all
refined scaling parameters (possibly including anisotropic B's), and
also models based solely on anomalous scatterers could be used.
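The three Patterson options used in this section differ only in which
amplitude slots are squared. The sketch below is INFERRED from the
descriptions above and is not taken from the FSFOUR source; any scaling or
weighting applied by FSFOUR is not shown, and the function name is
hypothetical.

```python
def patterson_coefficient(maptyp, fo, fc):
    """Patterson coefficient for one reflection, as inferred from the text.

    fo, fc: contents of the first and second amplitude slots of the file.
    """
    if maptyp == 5:
        # Difference Patterson: slots hold FPH and FP (or F+ and F-).
        return (fo - fc) ** 2
    if maptyp == 6:
        # Patterson on the first slot: native Fo, or (FPH-FP)obs,corrected.
        return fo ** 2
    if maptyp == 7:
        # "Calculated" Patterson on the second slot (Fc or FHcal).
        return fc ** 2
    raise ValueError(f"MAPTYP={maptyp} is not a Patterson option covered here")


# Native file input twice to CMBISO: both slots hold the native amplitude.
print(patterson_coefficient(5, 250.0, 250.0), patterson_coefficient(6, 250.0, 250.0))
```

This shows why the native-file-twice route needs MAPTYP=6: with identical
slots the MAPTYP=5 coefficients all vanish, while MAPTYP=6 gives the native
intensities.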
5.00 REFINING HEAVY ATOM PARAMETERS
There are two general ways to refine heavy atom parameters within
the PHASES package: refinement against isomorphous or anomalous
amplitude differences; or "phase refinement", i.e. by minimizing lack
of closure. Isomorphous/anomalous difference refinement is carried
out with the program GREF, and has the advantage that only data from
the crystal being refined is used (and the native, in the isomorphous
case). It is therefore independent of all other derivatives, and is
particularly useful in the case of common sites between multiple
derivatives since there can be no "cross talk" or bias. Also, if this
refinement is carried out against centric data only, there are few
assumptions made about the protein phase and the refinement is
usually very reliable. It is nearly always used for the first
derivative as no reliable protein phase estimates are available at
that time, but it's not a bad idea to do this initially for each
derivative. The disadvantage is that refinement of all parameters with
centric data may not be possible in some space groups. For example,
in P2 the only centric data available are of the type h0l, thus one
can not refine ANY y coordinates. In GREF one can have the program
automatically include the 25% strongest differences for acentric data
along with the centric data to enable refinement of SOME y's, but then
one is introducing assumptions about the protein phase which are only
approximately valid, thus weakening the refinement. Also, in a space
group like P2 the origin is not fixed in the y direction, so even the
RELATIVE y coordinates BETWEEN DERIVATIVES can not be refined, even
when acentric data ARE included. For refinement in GREF one would
generally start by assigning the major site an occupancy of 1.0, other
sites appropriate occupancies and all heavy atoms B values of 15 or
20. Then do a single cycle refining only the scale factor (which you
can initially assign any positive value, usually 0.1). After an
estimate of the scale factor is obtained, one can refine coordinates
as appropriate, along with the scale factor. One can then refine
coordinates, scale factor and occupancies simultaneously, but the
occupancy of the MAJOR SITE should ALWAYS be held fixed at 1.0. Also,
when polar axes are present even with sufficient data available for
refinement (e.g. acentrics included) the coordinates of the MAJOR SITE
along POLAR DIRECTIONS should still NOT be refined, as they are needed
to fix the origin in the polar directions. Finally, if the resolution
is sufficiently high one can then include B values in the refinement,
but if there are indications of instability the B's can usually be
held at their initial values without introducing much error. Attempts
to simultaneously refine parameters which are not independent (i.e.
coordinates for ALL atoms along polar directions, both scale factor
AND occupancy when only one atom is input, ALL coordinates of an atom
on a special position in the space group) or to refine parameters when
there is no data determining that parameter (coordinates of ANY atom
along polar direction when using ONLY centric data) will result in
a singular matrix being obtained, and an aborted refinement.
Parameters obtained from refinement against amplitude differences are
generally well suited to initiate subsequent phasing calculations or
further "phase refinement" in program PHASIT.
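The combinations that lead to a singular matrix can be checked before a
refinement is started. The sketch below is a hypothetical helper, not GREF
input or GREF code; it ASSUMES the first atom listed is the major site, and
it does not cover atoms on special positions.

```python
def check_gref_flags(atoms, refine_scale, polar_axes=(), centric_only=False):
    """Return a list of problems for a proposed set of refinement flags.

    atoms: list of dicts with keys "name", "refine_xyz" (three booleans)
           and "refine_occ" (boolean); the first entry is the major site.
    polar_axes: indices (0=x, 1=y, 2=z) of polar directions, e.g. (1,) for P2.
    """
    problems = []
    for axis in polar_axes:
        label = "xyz"[axis]
        refined = [atom["refine_xyz"][axis] for atom in atoms]
        if centric_only and any(refined):
            # Centric data alone carry no information along a polar direction.
            problems.append(f"no centric data determine {label}")
        elif refined and all(refined):
            # At least one atom must stay fixed to define the origin.
            problems.append(f"origin along {label} is not fixed")
    if len(atoms) == 1 and refine_scale and atoms[0]["refine_occ"]:
        problems.append("scale factor and occupancy are not independent")
    if atoms and atoms[0]["refine_occ"]:
        problems.append("major site occupancy should stay at 1.0")
    return problems


# P2 example: y is polar, and the major site y is held fixed.
pt1 = {"name": "PT1", "refine_xyz": (True, False, True), "refine_occ": False}
pt2 = {"name": "PT2", "refine_xyz": (True, True, True), "refine_occ": True}
print(check_gref_flags([pt1, pt2], refine_scale=True, polar_axes=(1,)))
```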
"Phase refinement" is carried out with the program PHASIT, and is
ideally suited for refinement with multiple derivatives/data sets
although it can also be used with a single derivative. In PHASIT
either conventional refinement, or "maximum likelihood" options may
be selected, and if only one derivative is used the program will
automatically switch to maximum likelihood mode. Phase refinement
requires an estimate of the protein phase, which is why it's better
suited for the multiple derivative case, since SIR or SAS estimates
alone are usually very poor. The advantages of phase refinement are
that in general, all parameters may be refined including native to
derivative scaling parameters, and the corresponding weights (expected
lack of closure estimates) are also implicitly "refined". Since the
origin is fixed by the protein phase estimates, refinement is possible
for coordinates along polar directions, and the origin can thus be
properly established between derivatives. Phase refinement is, however,
sensitive to the hand of the heavy atom sets, and it is assumed that
all input sets correspond to the SAME origin and hand. A useful
procedure is to initially start with all parameters as described
above, (after correlating origins and hand between derivatives with
cross difference Fouriers) and then refine only the FH scale factor
for one cycle. Then refine the FH scale factor along with coordinates,
then along with coordinates and occupancies (again always holding the
MAJOR site occupancy to 1.0 in each derivative). Then one can include
the FPH scale factor along with the other parameters, and finally
include the B factors. When refining anomalous scattering data sets
one would generally do the same, except that both the FPH and FH scale
factors should NOT be refined simultaneously (they can be alternated),
and for NATIVE anomalous scattering the FPH scale factor should NEVER
be refined!
A useful option is to use protein phase estimates obtained from an
external source during phase refinement rather than calculated from
the current heavy atom parameters and data. Thus one could refine
initially as described, then modify the phases by solvent flattening
and/or NC symmetry averaging, and then refine the parameters again
this time against the modified phases. The new parameters are then
used to compute phases to start another round of density modification.
This procedure has been helpful in several cases, and usually is
particularly good for refining the FPH scale factor. For conventional
phase refinement in PHASIT one would select a figure of merit cutoff
in the range 0.4 to 0.6 and use weights of 1/E**2. For maximum
likelihood mode one would select a figure of merit cutoff in the range
0.1 to 0.2 and use unit weights. Most successful protein structure
determinations have utilized phase refinement to obtain the final
MIR type phases, although refinement against differences is often
done first to obtain starting values for the parameters.
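The staged procedure described above for PHASIT can be written down as a
list of parameter groups, one per stage. This is a sketch only: the names
are hypothetical and do not correspond to PHASIT input flags, and the
particular way the FH and FPH scale factors are alternated for anomalous
data is an ASSUMPTION, since the text says only that they can be
alternated. "occupancy" always excludes the major site, which stays at 1.0.

```python
def phase_refinement_stages(data_kind="isomorphous"):
    """Return the parameter groups refined at each successive stage.

    data_kind: "isomorphous", "derivative_anomalous" or "native_anomalous".
    """
    stages = [
        {"fh_scale"},
        {"fh_scale", "xyz"},
        {"fh_scale", "xyz", "occupancy"},
    ]
    if data_kind == "isomorphous":
        # The FPH scale joins the other parameters, then the B factors.
        stages.append({"fh_scale", "fph_scale", "xyz", "occupancy"})
        stages.append({"fh_scale", "fph_scale", "xyz", "occupancy", "b"})
    elif data_kind == "derivative_anomalous":
        # The two scales are alternated, never refined together.
        stages.append({"fph_scale", "xyz", "occupancy"})
        stages.append({"fh_scale", "xyz", "occupancy", "b"})
    elif data_kind == "native_anomalous":
        # The FPH scale is never refined for native anomalous data.
        stages.append({"fh_scale", "xyz", "occupancy", "b"})
    else:
        raise ValueError(f"unknown data kind: {data_kind}")
    return stages


# Settings quoted in the text for the two refinement modes.
RECOMMENDED = {
    "conventional": {"fom_cutoff": (0.4, 0.6), "weights": "1/E**2"},
    "maximum_likelihood": {"fom_cutoff": (0.1, 0.2), "weights": "unit"},
}
```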
6.00 HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE AND
CROSS DIFFERENCE FOURIER MAPS
There are several ways to compute heavy atom based difference
or cross difference type Fourier maps within the PHASES package.
1) HEAVY ATOM DIFFERENCE or DOUBLE DIFFERENCE FOURIERS.
The first approach is to refine heavy atom parameters against
isomorphous or anomalous difference AMPLITUDES in program GREF,
and request that a Fourier file be written. If this file is
used in FSFOUR with MAPTYP=1, then the observed difference
Fourier, i.e. that which should reveal all heavy atom sites
can be obtained. If the file is used with the MAPTYP=3 option,
then a "double difference" map is computed, i.e. the heavy
atoms included in the structure factor calculation are subtracted
out, so that the map should show only additional sites. The
limitations with this approach are that the "observed" amplitudes
ABS(FPH-FP) or ABS(F+ - F-) are approximations, since vector
differences rather than amplitude differences should be used,
and that the heavy atom model may be crude since the FPH to FP
scale factor has not been refined, and anisotropic thermal
parameters for the heavy atoms can not be used. Also, if used
with anomalous data the absolute configuration can not be
obtained since absolute values of delta F were used.
The second approach is to compute phases and/or refine heavy
atom parameters in program PHASIT, and use the "difference
coefficients" files it produces in FSFOUR with the MAPTYP=1
or MAPTYP=3 options. If MAPTYP=1 the "observed" difference
Fourier showing all heavy atoms will be obtained, however the
results should be improvements over those obtained with the
previous method. This results from the fact that the FPH to
FP scaling parameters can be refined, the heavy atom thermal
factors may be refined anisotropically, and phase difference
information is used to correct the "observed" amplitudes to
account for the fact that the two vectors are not colinear.
In this case for isomorphous data sets the corrected
"observed" differences, calculated heavy atom amplitudes and
calculated heavy atom phases are used to compute the map.
For anomalous data the "observed" and "calculated" Bijvoet
differences are used along with the protein phases shifted
by 90 degrees, to give true "Bijvoet difference" or "Bijvoet
double difference" maps, so that the absolute configuration
is preserved. Again, MAPTYP=1 should show all anomalous
scatterers while MAPTYP=3 should have those included in the
model subtracted out. These methods use phase information
computed only from the heavy atoms or anomalous scatterers,
although in the anomalous case all such information is
combined to estimate protein phases.
A third approach is to combine observed AMPLITUDE
differences [ i.e. (FPH-FP) or (F+ - F-) ] directly with
estimates of the protein phases to compute difference
or Bijvoet difference Fouriers. One would then generate
protein phases either by MIR, SIR, BNDRY, or from a model in
PHASIT, and combine the phases with observed amplitude
differences in programs MRGDF or MRGBDF for isomorphous or
anomalous data, respectively. The maps would then be computed
in FSFOUR using the MAPTYP=3 or MAPTYP=8 options for
difference or Bijvoet difference maps, respectively. The
advantage of this approach is that the protein phases
themselves may be better, since one can use solvent flattened
and/or NC symmetry averaged phases in the synthesis. For
the isomorphous case the output coefficients file would then
contain indices, FPH, FP, PHI_pro, and for the anomalous
case the file would contain indices, F+, F-, PHI_pro. A
disadvantage is that one can not "subtract out" the heavy
atoms used in the phasing, so that they will also appear
in the maps possibly making it more difficult to detect
minor sites.
2) CROSS DIFFERENCE FOURIERS.
This is accomplished similarly to the third option above,
except that in MRGDF or MRGBDF a data file corresponding to
a new derivative, i.e. one which was never used in phasing,
is merged with an existing protein phase file. The cross
difference Fourier (or cross Bijvoet difference Fourier)
is then obtained in FSFOUR with MAPTYP=3 or MAPTYP=8,
respectively. These maps should show all heavy atom or
anomalous scatterer sites in the new derivative, which can
then be checked against the appropriate difference
Patterson. The advantage of doing this, in addition to
helping solve the new derivative, is to assure that heavy
atom sites in the new derivative correspond to the same
origin and hand as those used in the original phasing.
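For the third approach and for cross difference Fouriers, each map
coefficient is an observed amplitude difference combined with a protein
phase. The sketch below illustrates this for a single reflection. It is not
taken from MRGDF, MRGBDF or FSFOUR; any weighting those programs apply is
not shown, and the SIGN of the 90 degree shift for Bijvoet differences is
an ASSUMPTION (minus 90 degrees is the common convention), since the text
does not state it.

```python
import cmath
import math


def difference_fourier_coefficient(f_first, f_second, phi_pro_deg, bijvoet=False):
    """Complex map coefficient from an amplitude difference and a protein phase.

    Isomorphous: (FPH - FP) with phase PHI_pro.
    Bijvoet:     (F+ - F-) with phase PHI_pro - 90 degrees (assumed sign).
    """
    shift = -90.0 if bijvoet else 0.0
    phase = math.radians(phi_pro_deg + shift)
    return (f_first - f_second) * cmath.exp(1j * phase)


# One isomorphous and one Bijvoet example with made-up amplitudes.
print(difference_fourier_coefficient(120.0, 100.0, 45.0))
print(difference_fourier_coefficient(105.0, 95.0, 45.0, bijvoet=True))
```

Because the signed difference is kept, rather than its absolute value, the
Bijvoet case retains the information on absolute configuration.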
7.00 CREATING/EDITING SOLVENT MASKS
In most cases adequate solvent masks are prepared as part of the
"doall" procedure, which carries out a reciprocal space equivalent
of the automated protein-solvent boundary determination method
described by Wang with the added modification that density in the
immediate vicinity of heavy atoms is ignored during mask construction.
Solvent masks, however, can also be created by hand, from coordinates
for an input model or by starting with any of these masks and editing
them. Solvent masks MUST have a one-to-one correspondence with FSFOUR
maps, and thus they also MUST cover one full cell on the same grid
used for the map, and be oriented as xz sections. They also must have
the structure as described in the "file formats" section. This happens
automatically if the masks are constructed by the "doall" procedure,
but care must be taken to ensure these features if the masks are
created by other means. Pre-existing solvent masks can be examined
and/or edited in MAPVIEW, or MAPVIEW can be used to create the masks
"from scratch" by hand tracing boundaries in contoured maps. Several
options are now described.
*** Examining/editing "normal" (i.e. full cell) solvent masks ***
These masks (named mask1.14, mask2.14 and mask3.14, if created by
the "doall" procedure) can be examined in MAPVIEW or MAPVIEW_X by
inputting any FSFOUR map with the same grid, specifying that masks
will be used, selecting 0 to 0.999 for each of the x, y and z ranges,
specifying the xz section orientation and "recovering" the
pre-existing mask file. From the menu contoured sections can then be
selected and displayed. Clicking the mouse with the cursor in the
"show mask" menu area will then display the solvent mask as blue dots
on the solvent grid points. One can then use the menu options to
scroll through the sections, displaying both contoured density and
the solvent mask. One could also use the "trace mask" menu option
as described in the MAPVIEW writeup to edit the mask with the mouse,
but at this point it is not desirable to do this as the full cell
map is displayed, and one may have to make identical edits in each
symmetry related envelope. If this is not done very carefully one
could easily destroy the space group symmetry in the mask. A better
approach, if editing is to be done, is simply to examine the map and
mask to determine the coordinate range which would carve out only one
contiguous molecule (asymmetric unit) by following the fractional
coordinates as the cursor moves across the screen (displayed in the
lower right hand corner). Note that when determining the range one can
cross into neighboring cells, although only the one-cell-translated
map region is displayed. Once an appropriate range is deduced, write
it down and exit MAPVIEW without saving any files. Then run EXTRMAP
and EXTRMSK to extract that same range from the FSFOUR map and solvent
mask, respectively, to create the corresponding "submaps". This
allows one to deal only with a contiguous asymmetric unit, and to
select regions spanning cell edges. Now run MAPVIEW again, this time
inputting the non-FSFOUR map (i.e. the submap) and its corresponding mask
file. Editing can then be done on the submask. After editing all
appropriate sections, use the "MAKEASU" menu option to symmetrize the
submask, and scroll through the masks again to confirm that everything
is as desired. Once you are happy with it, exit MAPVIEW and when
prompted, request that the entire submask region be saved to a file.
At this point you have the edited mask covering an asymmetric unit.
Run BLDCEL inputting the submap, edited submask and original FSFOUR
map to expand the submask (and submap) to a full cell. You can delete
the output map file, but the output mask file now corresponds to the
edited solvent mask, expanded to a full cell obeying space group
symmetry. It can now be used for solvent flattening (or examined
again in MAPVIEW just as the original mask was to confirm the
expansion).
***** Creating solvent masks from a model *****
If atomic coordinates are available from a tentative model, these
coordinates can be used to create a solvent mask. To do this one
should first prepare a PHASES style file containing the atomic
coordinates (possibly from a PDB file via PDB_CDS), and determine the
range (in fractional coordinates) which encompasses the model atoms.
Then enlarge the range (on each end) slightly to account for the
radius to be assigned to each atom. MDLMSK can then be run to create
a mask file just encompassing the molecule. When prompted in MDLMSK,
the periods (number of grid points along each axis) should be
specified EXACTLY as in the input to FSFOUR, to ensure that the maps
to be computed later will have the same grid as the mask. The adjusted
fractional coordinate range for the model should then be specified
along with a mask number (use 1 for pure solvent masks), and a radius
of about 1.8 angstroms. In the mask the outer boundary will be
appropriate, but there will typically be many small holes in the
interior caused by use of a van der Waals size radius. Use of a
larger radius could avoid these holes, but would artificially extend
the outer boundary. To avoid this one generally uses the smaller
radius, and then edits the masks to preserve the outer boundary but
fill in the interior holes. This can be done very quickly in MAPVIEW.
To do this run MAPVIEW inputting a FSFOUR map (any one will do, as
long as the periods are the same as those used in MDLMSK) and specify
that masks will be used. Then input the same coordinate range as in
MDLMSK, request the xz section orientation and "recover" the mask
file from MDLMSK. You can effectively turn off the density display
by selecting a high contour level, and scroll through the sections
editing each via the "show mask" and "trace mask" options described in
the MAPVIEW writeup. Just quickly trace around the already displayed
outer boundary to preserve it, and the interior holes will be filled
automatically when you are done with each section. When finished, use
the "MAKASU" option to symmetrize the mask region. Then exit MAPVIEW
and request that both the entire map and mask regions be written to
files. You then will have an edited mask file encompassing the model,
and the corresponding submap file. The last step is to convert the
edited mask to a full cell mask. To do this, run BLDCEL inputting the
submap, corresponding edited mask and original FSFOUR map. The output
map file can be deleted, but the output mask file will be a full cell
version of the edited, model based mask which now also obeys space
group symmetry. It can then be used for solvent flattening (for
example, replacing the mask3.14 file in the cycle16.sh, extnd.sh or
extndavg.sh procedures), and also can be examined in MAPVIEW as
described earlier.
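As an illustration of the range adjustment described above, the
following minimal Python sketch finds the fractional range spanned by a
set of model atoms and enlarges it on each end by the atomic radius. It
is not part of PHASES. The function name, the atom list and the cell
edges are hypothetical, and an orthogonal cell is assumed so that a
distance r along an axis of length L corresponds to r/L in fractional
units.

```python
def padded_fractional_range(frac_coords, cell_lengths, radius=1.8):
    """Return [(min, max), ...] per axis, enlarged by the atomic radius.

    Assumption: orthogonal cell, so a distance r along an axis of
    length L is r / L in fractional units.
    """
    ranges = []
    for axis, length in enumerate(cell_lengths):
        values = [xyz[axis] for xyz in frac_coords]
        pad = radius / length
        ranges.append((min(values) - pad, max(values) + pad))
    return ranges


# Hypothetical model atoms (fractional) and cell edges (angstroms).
atoms = [(0.10, 0.20, 0.05), (0.35, 0.42, 0.30), (0.22, 0.15, 0.48)]
cell = (60.0, 72.0, 90.0)

for name, (low, high) in zip("xyz", padded_fractional_range(atoms, cell)):
    print(f"{name}: {low:.3f} to {high:.3f}")
```

The resulting limits are the kind of values one would then supply to
MDLMSK (and later to MAPVIEW) as the coordinate range.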
8.00 INCORPORATION OF PARTIAL STRUCTURE INFORMATION
In many cases a significant fraction of the structure can be
reliably determined from an electron density map, but some regions in
the map are less well defined. In that case it is often useful to
incorporate phase information obtained from the partial structure into
the phasing process. This can be done in several ways, all of which
require running the PHASIT program once (in SF calculation mode,
IHLCF=0, ISIGA=0) to generate partial structure phases and
amplitudes, and running the BNDRY program once (option 3, with
ICMB=0 or 1) to combine the partial structure phase information with
prior phase probability distributions cast in terms of Hendrickson-
Lattman coefficients. Different strategies can be employed depending
on which prior distributions are used, weighting during the phase
combination and what is done AFTER the phase combination step. The
most common procedures are now described.
In all procedures, first run PHASIT in SF calculation mode using
IHLCF=0 and ISIGA=0, and call the output phase file MODEL.31.
This file contains the partial structure phase and amplitude
information.
Now you have some choices.
1) Combine the partial structure information with the original (MIR,
SIR etc) probability distributions (usually in file called PHASIT.31
generated by PHASIT, but possibly introduced via the IMPORT program).
This can be done with a small control file to run BNDRY, option 3,
using ICMB=0 or 1 for either Sim or Sigma_A weighting, respectively,
(see BNDRY write-up) during phase combination. Call the output file
PHICOMBINED.ORG. This file will contain phase, figure of merit and
HL coefficients for the COMBINED data. If the partial structure was
large enough, you may be able to use these phases directly to get a
good map.
2) If you want to proceed with solvent flattening cycles, just copy
the file PHICOMBINED.ORG to PHASIT.31 (first saving the ORIGINAL
PHASIT.31 i.e. no partial structure contributions, in another file).
Now you can invoke the default procedure DOALL.COM without changing
anything, and at each phase combination step the MODEL+SIR etc
distributions will serve as the "anchored" phases with which those
newly obtained from solvent flattening will be combined.
3) If you wish instead to combine the partial structure information
with distributions obtained AFTER solvent flattening, do the same as
in 1), but use the best phases available (usually in file obtained
from a previous run called phi16cy.31, phiextnd.31 etc) instead of the
original PHASIT.31 file. Call the output file PHICOMBINED.FIN. One
could then proceed with solvent flattening cycles as in 2), but
usually this is not necessary and the phases in file PHICOMBINED.FIN
are used for the final map.
These 3 options (partial structure + MIR etc with no flattening,
[partial structure + MIR etc] followed by flattening, and partial
structure + flattened MIR) seem to be the most useful, and can all be
carried out without tampering with the default control files. One only
has to create additional small control files for single runs of PHASIT
(SF mode) and BNDRY (option 3). Other options making use of partial
structure information are described in the section on "REDUCED BIAS
NATIVE, COMBINED AND DIFFERENCE FOURIERS."
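The reason Hendrickson-Lattman (HL) coefficients are convenient for the
combinations described above is that multiplying two phase probability
distributions is the same as adding their A, B, C, D coefficients. The
Python sketch below illustrates this for a single reflection. It is
not the BNDRY code: the function names, the bimodal "prior" and the
weight X of the partial structure term are all hypothetical values
chosen for illustration (in practice X comes from the Sim or Sigma_A
weighting).

```python
import cmath
import math


def combine_hl(*coefficient_sets):
    """Add (A, B, C, D) tuples; equivalent to multiplying distributions."""
    return tuple(sum(values) for values in zip(*coefficient_sets))


def unimodal_hl(x, phase_deg):
    """HL coefficients of a unimodal distribution of weight X at one phase."""
    p = math.radians(phase_deg)
    return (x * math.cos(p), x * math.sin(p), 0.0, 0.0)


def centroid_phase_and_fom(hl, steps=360):
    """Integrate P(phi) numerically; return centroid phase (deg) and FOM."""
    a, b, c, d = hl
    angles = [2.0 * math.pi * k / steps for k in range(steps)]
    exponents = [a * math.cos(p) + b * math.sin(p)
                 + c * math.cos(2 * p) + d * math.sin(2 * p) for p in angles]
    shift = max(exponents)  # keeps exp() from overflowing
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    centroid = sum(w * cmath.exp(1j * p)
                   for w, p in zip(weights, angles)) / total
    return math.degrees(cmath.phase(centroid)) % 360.0, abs(centroid)


prior = (0.0, 0.0, -2.0, 0.0)    # hypothetical bimodal prior (90 or 270 deg)
model = unimodal_hl(1.0, 80.0)   # hypothetical partial structure term
combined = combine_hl(prior, model)

prior_phase, prior_fom = centroid_phase_and_fom(prior)
comb_phase, comb_fom = centroid_phase_and_fom(combined)
print("prior FOM    %.3f" % prior_fom)
print("combined FOM %.3f at phase %.1f" % (comb_fom, comb_phase))
```

In this toy case the prior alone cannot choose between its two peaks,
while the added partial structure term breaks the ambiguity.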
Note that in the case where a molecular replacement solution was
obtained, then one has no MIR like phase probability distributions to
combine solvent flattened (e.g. map inverted) phases with. In that
case (or if one has MIR phase information, and simply wants to abandon
it), one can use PHASIT in SF calculation mode but with IHLCF=1 and
ISIGA=0 or 1. That will create Hendrickson-Lattman coefficients for
the partial structure, and these distributions can then be used as the
"anchored" phases with which those newly obtained from solvent
flattening will be combined. This may also be useful if one wants to
tie noncrystallographic symmetry averaged phases to model phases.
Another option is available involving phase extension. Suppose one has
MIR, SIR etc data to only 4.0 angstrom resolution, native data to 3.0
angstrom resolution, and a partial structure available. One could
first compute the partial structure phases out to 3.0 angstrom
resolution (PHASIT SF mode, IHLCF=0, ISIGA=0) and MIR etc phases out
to 4.0 angstrom resolution (PHASIT, phasing mode). Then run MISSNG to
get the file "extrfl.d" containing reflections between 4 and 3
angstroms. The three output files could then be combined in a single
run of BNDRY (option 3, ICMB=0 or 1, with phase extension requested)
to get a hybrid file. The output file would then contain MIR combined
with partial structure phases to 4 angstroms, and partial structure
phases between 4 and 3 angstroms. This file could then be used for
direct calculation of a map or to initiate solvent flattening cycles
as described earlier. Yet another variation would be to do the same
thing but requesting IHLCF=1 and ISIGA=0 or 1. In that case during
solvent flattening iterations the map inverted phases would also be
tethered to the partial structure phases for the high resolution
data.
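The content of the hybrid file in the example above can be pictured
with the following Python sketch, which sorts reflections by resolution
into the zones described. It only illustrates the bookkeeping and is
not how BNDRY or MISSNG are implemented. The function names and the
cell are hypothetical, and an orthogonal cell is assumed for the
d-spacing formula. The origin reflection (0,0,0) must not be passed in.

```python
import math


def d_spacing(hkl, cell_lengths):
    """Resolution in angstroms of reflection hkl (orthogonal cell assumed)."""
    inv_d_sq = sum((index / length) ** 2
                   for index, length in zip(hkl, cell_lengths))
    return 1.0 / math.sqrt(inv_d_sq)


def phase_source(hkl, cell_lengths, mir_limit=4.0, native_limit=3.0):
    """Say which phase information a reflection would carry."""
    d = d_spacing(hkl, cell_lengths)
    if d >= mir_limit:
        return "MIR + partial structure"
    if d >= native_limit:
        return "partial structure only"
    return "beyond native data"


cell = (60.0, 72.0, 90.0)  # hypothetical cell edges in angstroms
for hkl in [(10, 0, 0), (0, 20, 0), (0, 0, 40)]:
    print(hkl, "%.2f A" % d_spacing(hkl, cell), phase_source(hkl, cell))
```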
Clearly many other options or sequences are available. The key to
successful use of the programs in this fashion is understanding that
the phase combination program (BNDRY, option 3) merges at least two
files, one of which must contain phase information cast in
Hendrickson-Lattman coefficients, and the other containing only
calculated phases and amplitudes (possibly to higher resolution). If
phase extension is also desired, a third file with the additional
reflections may contain only indices and amplitudes, but it may also
contain phase probability distribution coefficients for some or all
of the reflections. The output file always contains the COMBINED
information cast in the HL coefficient form. It is thus suitable for
use either for direct map calculations or as an input file for the
BNDRY, MISSNG, MRGDF, MRGBDF, RD31 etc programs.
9.00 REDUCED BIAS NATIVE, COMBINED AND DIFFERENCE FOURIER MAPS
When making use of partial structure information, either obtained
from a model via structure factor calculations or from inversion of
a density map, the resulting phases are always biased towards the
partial structure. Read (Acta Cryst. A42, 140-149, 1986) has shown
how this bias can be reduced significantly when using the partial
structure phases directly for map calculations, and how to properly
weight partial structure derived phase information when combining
it with other (e.g. MIR, SIR etc.) phase information. Both procedures
require first determining "Sigma_A", which is related to the
contributions from "missing" or "incorrect" parts of the structure
and varies with resolution, to compute the proper weight (and thus
FOM) for the partial structure phases. The procedures described by
Read have been implemented as options in both PHASIT and BNDRY,
and can be invoked as follows:
(1) COMBINED PHASE MAPS
For simple phase combination the Sigma_A procedure can be invoked in
the BNDRY program (option 3), by setting ICMB=1. Sigma_A weighting is
then used instead of Bricogne's modification of Sim's weighting
scheme during phase combination. This can be used either with model
phases or map inverted phases, and thus can be done automatically
in the "doall" or "extndavg" procedures.
(2) REDUCED BIAS DIFFERENCE MAPS
These maps are similar to conventional Fo-Fc maps phased with the
partial structure, but the coefficients are
(FOM * FOBS - D * FCALC) * exp(i * phicalc)
where D is derived from the Sigma_A values and phicalc is the phase
from the partial structure. The appropriate map can be produced by
running PHASIT, SF mode with ISIGA=2, and then requesting a Fo-Fc map
in FSFOUR. Note however, that if chosen, FOM*FOBS and D*FCALC will
occupy the Fo and Fc slots in the output file, thus other map types
requiring pure Fo and/or pure Fc values will be inaccessible.
(3) REDUCED BIAS NATIVE MAPS
These maps are similar to conventional 2Fo-Fc maps phased with the
partial structure, but the coefficients are
(2 * FOM * FOBS - D * FCALC) * exp(i * phicalc)
for acentric reflections where D is derived from Sigma_A and
FOM * FOBS * exp(i * phicalc)
for centric reflections.
The appropriate map can be produced by running PHASIT, SF mode setting
ISIGA=3, and then requesting a 2Fo-Fc map in FSFOUR. Note however,
that if chosen, FOM*FOBS and D*FCALC (acentric) or FOM*FOBS/2 and 0
(centric) will occupy the Fo and Fc slots in the output file, thus
other map types requiring pure Fo and/or pure Fc values will be
inaccessible.
(4) SIGMA_A WEIGHTED PROBABILITY DISTRIBUTION COEFFICIENTS
Phase probability distribution coefficients and corresponding FOM
based on Sigma_A weighting for structure factors computed entirely
from an atomic model can be obtained by running PHASIT, SF mode with
ISIGA=1. The file may then be used as the "anchor" phases to which map
inverted phases are tethered. It is particularly useful for merging
information during phase extension when high resolution phases come
from a partial structure and lower resolution phases come from MIR
type calculations.
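The map coefficients given in (2) and (3) can be written out as a short
Python sketch. It assumes that FOM and D have already been determined
(the Sigma_A estimation itself is not shown), and the function names
and numerical values are hypothetical. The second function reproduces
the Fo and Fc slot contents quoted in (3), so that a conventional
2Fo-Fc combination of the slots returns the intended amplitude.

```python
import cmath
import math


def reduced_bias_coefficient(fobs, fcalc, phicalc_deg, fom, d,
                             centric=False, native=False):
    """Complex coefficient for a reduced bias difference or native map."""
    if not native:
        amplitude = fom * fobs - d * fcalc          # item (2)
    elif centric:
        amplitude = fom * fobs                      # item (3), centric
    else:
        amplitude = 2.0 * fom * fobs - d * fcalc    # item (3), acentric
    return amplitude * cmath.exp(1j * math.radians(phicalc_deg))


def native_slots(fobs, fcalc, fom, d, centric=False):
    """Values placed in the Fo and Fc slots for the native map case."""
    if centric:
        return fom * fobs / 2.0, 0.0
    return fom * fobs, d * fcalc


# Hypothetical reflection: Fo=100, Fc=80, phase 0, FOM=0.8, D=0.9
diff = reduced_bias_coefficient(100.0, 80.0, 0.0, 0.8, 0.9)
nat = reduced_bias_coefficient(100.0, 80.0, 0.0, 0.8, 0.9, native=True)
fo_slot, fc_slot = native_slots(100.0, 80.0, 0.8, 0.9)
print(abs(diff), abs(nat), 2.0 * fo_slot - fc_slot)
```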
10.00 INCORPORATION OF NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING
Whenever there are multiple copies of identical molecules present
in the crystallographic asymmetric unit and/or the same molecule is
present in multiple crystal forms, one has the opportunity to
improve the phases by averaging the corresponding electron density in
the related molecules, replacing the density for each molecule with
the average, and inverting the "averaged" density map(s) to obtain new
structure factor amplitudes and phases. These new amplitudes and
phases can then be accepted immediately, but are more frequently
combined with the original MIR, SIR etc phase information in a
probabilistic manner, just like those obtained from solvent flattening
or from a partial structure. Indeed, solvent flattening and imposition
of non-negativity of electron density can be applied in addition to
the noncrystallographic symmetry averaging, leading to powerful
phasing algorithms. The resulting phases (either alone or combined
with MIR, SIR etc), are typically combined with the observed
amplitudes, and the process is cycled until convergence is obtained.
The power of the method increases as the number of molecules averaged
increases, but averaging over even a dimer is still extremely useful
when combined with MIR, SIR data, etc. Programs required to carry out
the steps needed for successful noncrystallographic symmetry averaging
are currently included in the PHASES package, and sample control
scripts are given (called "extndavg.sh" and "extndavg_mc.sh", for the
single and multiple crystal averaging cases, respectively) which
replace the "extnd.sh" script in a normal solvent flattening run. The
scripts insert the averaging related steps into the normal solvent
flattening process, thus the complete multi-cycle task can be carried
out by executing them. Prior to running the scripts however, there are
several related tasks to be performed, which include determination of
the location, direction and nature (rotational order) of the
noncrystallographic symmetry operator(s), and construction of one or
more "averaging envelopes" or "averaging masks" delineating the
volume(s) occupied by the molecules to be averaged. Initial estimates
for the noncrystallographic symmetry operator(s) are usually obtained
from rotation/translation functions which are not included in the
PHASES package as they are readily available elsewhere, however if the
operators are specified by 3x3 rotation matrices and 3 element
translation vectors (as for example, in the program "O"), then the
PHASES program O_TO_SP can be used to convert them to PHASES format.
Everything else, including refinement of the operator(s) and
construction of the envelope mask(s) is part of PHASES. All map
interpolation programs (MAPAVG, SKEW, MAPORTH etc) utilize powerful 64
point spline algorithms, thus the map grids for averaging need not be
any finer than for normal calculations. Many of the noncrystallographic
symmetry averaging routines in PHASES were derived from programs
originally written by W. Hendrickson & J. Smith. In most instances they
have been heavily modified for use in PHASES, mostly to generalize the
algorithms, to optimize the code, and to provide compatibility with the
rest of the package. The general averaging process as implemented in
PHASES is described below.
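The central idea (replace the density at each point inside the envelope
by the average over its symmetry mates) can be shown with a toy Python
sketch. This is a deliberately simplified picture and not the MAPAVG
algorithm: it uses a small two dimensional grid and a twofold axis
through the grid center, so every mate falls exactly on a grid point.
In real maps the mates generally fall between grid points, which is why
the interpolation mentioned above is needed. All names and values are
hypothetical.

```python
def average_twofold(density, mask):
    """Average density over a twofold through the grid center.

    Point (i, j) is related to (rows-1-i, cols-1-j). Only points whose
    mate is also inside the mask are averaged; others are left alone.
    """
    rows, cols = len(density), len(density[0])
    averaged = [row[:] for row in density]
    for i in range(rows):
        for j in range(cols):
            mi, mj = rows - 1 - i, cols - 1 - j
            if mask[i][j] and mask[mi][mj]:
                averaged[i][j] = 0.5 * (density[i][j] + density[mi][mj])
    return averaged


density = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0],
           [7.0, 8.0, 10.0]]
mask = [[0, 1, 1],
        [1, 1, 1],
        [1, 1, 0]]   # the two masked-out corners are treated as solvent

for row in average_twofold(density, mask):
    print(row)
```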
For both simplicity and reasons related to computational
efficiency, all of the averaging related calculations are best
performed on electron density "submaps," which cover only the map
region encompassing an asymmetric unit containing the molecules to be
averaged. This "asymmetric unit" need not be complete in the
crystallographic sense (that is, it may differ from a true asymmetric
unit in volume and have irregular borders), but it must encompass at
least the molecules to be averaged, although solvent regions may be
omitted. It may also span cell edges, if necessary. Since the standard
FSFOUR maps always cover a complete unit cell, the "submaps" (which
have a different format) can be created from them via the programs
MAPVIEW or EXTRMAP. Indeed, MAPVIEW will almost certainly be needed to
determine which region to extract in the first place. All envelope
creation, averaging, operator refinement, skewing etc will be done
using the submaps. After appropriate regions in the submaps are
averaged, program BLDCEL is used to regenerate complete unit cell maps
(FSFOUR format) conforming to the space group symmetry, which can then
be inverted by MAPINV. Thus MAPVIEW (or EXTRMAP) and BLDCEL serve as
the gateways between normal FSFOUR maps and submaps. Note that MAPVIEW
can display either type of map (and mask). Descriptions of the inputs
required for each of the programs mentioned can be found in the
appropriate program write-ups.
The keys to successful averaging are to obtain good "envelope"
masks which accurately identify the volume(s) in space in which the
noncrystallographic symmetry operator(s) is/are valid, and to obtain
accurate values for the operators themselves. These tasks always will
take one of two routes, depending on the nature of the
noncrystallographic symmetry. Within a given crystal, if the NC
symmetry is purely rotational with the order of rotation being N-FOLD,
where N is a small integer, then the task is simplified since one
needs only a single "envelope mask" which encompasses all N of the
molecules related by NC symmetry. That is to say the averaging can be
done without having to specify where one molecule stops and the next
starts. One only needs to know the bounds of the TOTAL AGGREGATION of
molecules. The procedure (A) below is then adequate to carry out the
necessary computations. If an arbitrary rotation angle and/or a
translational (e.g. screw-like) shift is involved, the task is more
complicated since one then must create a SEPARATE ENVELOPE MASK
identifying each molecule. The procedure (B) below is then adequate
to carry out the computations. For multiple crystal averaging the same
steps and considerations are required, but multiple submaps (one for
each crystal form, along with corresponding envelope masks) are used.
Details related to multiple crystal averaging are described later.
(A) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH PURE
ROTATIONAL SYMMETRY OF ORDER N
1) start with best possible map (usually solvent flattened MIR map, as
obtained via the "doall" procedure).
2) compute a map via "FSFOUR" (default orientation, i.e. NORN=0)
3) run EXTRMAP (or MAPVIEW) to extract a submap from the FSFOUR map
which encompasses at least the dimer, trimer etc, related by KNOWN
(at least approximately) noncrystallographic symmetry.
4) if the unit cell is not orthogonal, run MAPORTH to convert the
submap to an orthogonal grid (but save the input submap as well)
5) run LSQROT (using orthogonal map), to refine the noncrystallographic
symmetry axis location and direction. Start with low resolution
(~6A map, 2A grid) refining only within a sphere of suitable radius
(usually 12-25A), centered about a point on the rotation axis which
is near the dimer, trimer etc center. Then gradually extend the map
resolution to about 3A (1A grid) and repeat the refinement. In a 4A
map, the correlation coefficient after refinement should be about
0.4 or higher. (Ignore the R factor, it's always very high).
6) run SKEW (using the submap from 3), to generate a "skewed" map
with new "b" axis aligned with noncrystallographic symmetry axis.
7) run MAPVIEW (using "skewed" map) to create a mask (via "trace mask"
option) which encompasses only the region to be averaged. This
should include the entire dimer, trimer etc. In MAPVIEW, use only a
single mask (Mask No. 1). When exiting, save the "skewed" mask file.
8) run TRNMSK (using both the original submap from 3, and "skewed"
mask from 7) to convert the skewed mask to one corresponding to the
default (non-skewed) orientation (its grid will have one-to-one
correspondence with the original submap). Save this standard mask.
9) run MAPVIEW (using the original submap from 3), and "recover" the
standard mask file from 8. Then use "Make Asu" option, and possibly
edit masks until only non redundant density associated with the
desired dimer, trimer etc is within the mask. When exiting, save the
ENTIRE mask (no subset). It will be used in all future averaging
cycles.
Optionally, run LSQROT again this time using the default mask
output from 9 as basis for refinement (you may have to orthogonalize
it), instead of a sphere. If you do this, expect a drop in the
correlation coefficient. If the orientation changes significantly,
repeat steps 6-9.
Proceed to AVERAGING STEPS
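For readers unfamiliar with the correlation coefficient quoted in step
5, the Python sketch below computes a standard linear correlation
between density values sampled at a set of points and the values at
their symmetry related positions. The exact expression used inside
LSQROT is not reproduced here, so this should be read as an assumption
about its general form. The sample values are hypothetical.

```python
import math


def correlation_coefficient(rho_a, rho_b):
    """Linear correlation between two equally long lists of densities."""
    n = len(rho_a)
    mean_a = sum(rho_a) / n
    mean_b = sum(rho_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical densities at five points and at their rotated mates.
original = [0.1, 0.8, 0.3, 0.9, 0.2]
rotated = [0.2, 0.7, 0.1, 1.0, 0.4]
print("correlation = %.2f" % correlation_coefficient(original, rotated))
```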
(B) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH ARBITRARY
ROTATIONAL ANGLE AND/OR TRANSLATION
Steps 1-4 same as in (A)
5) Run LSQROTGEN (using orthogonal map), to refine the
noncrystallographic symmetry operators relating molecule 1
(arbitrarily selected) to each other molecule. Start with low
resolution (~6A map, 2A grid) refining only within spheres of
suitable radius (typically 15A) centered on points near the centers
of molecule 1 and the target molecule, respectively. Then gradually
extend the map resolution to about 3A (1A grid) and repeat the
refinement. In a 4A map, the correlation coefficient after
refinement should be about 0.4 or higher. (Ignore the R factor, it's
always very high). For N related molecules, there will be N-1
operators to refine.
6) Run MAPVIEW (using the submap from 3) to create SEPARATE envelope
masks for EACH MOLECULE to be averaged. Do this by making use of
the "set mask no." and "trace mask" options. When exiting, save
the mask file, as it now contains separate envelope information
for each molecule. Also, remember which mask No. you assigned to
which molecule.
7) Run MAPVIEW (using original submap from 3), and "recover" the
standard mask file from 6. Then use "Make Asu" option, and possibly
edit masks until only non redundant density associated with the
desired dimer, trimer etc is within molecular envelope masks. When
exiting, save the ENTIRE mask (no subset). It will be used in all
future averaging cycles.
Optionally, run LSQROTGEN again this time using the default mask
output from 7 as basis for refinement (you may have to orthogonalize
it), instead of spheres. If you do this, expect a drop in the
correlation coefficient. If the operator(s) change significantly,
repeat steps 6-7, otherwise continue.
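A general operator of the kind refined in procedure (B) maps a point x
in molecule 1 onto x' = R x + t in the target molecule, where R is a
rotation matrix and t a translation. The Python sketch below builds R
from an axis direction and a rotation angle, applies the operator, and
forms its inverse. It works in an orthogonal angstrom frame and does
not use the PHASES polar angle convention (see the LSQROTGEN and MAPAVG
write-ups for that). All names and values are hypothetical.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rodrigues formula: rotation by angle_deg about the given axis."""
    norm = math.sqrt(sum(v * v for v in axis))
    x, y, z = (v / norm for v in axis)
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return [[t * x * x + c, t * x * y - s * z, t * x * z + s * y],
            [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
            [t * x * z - s * y, t * y * z + s * x, t * z * z + c]]


def apply_operator(rot, trans, point):
    """Return R * point + t."""
    return [sum(rot[i][j] * point[j] for j in range(3)) + trans[i]
            for i in range(3)]


def invert_operator(rot, trans):
    """The inverse of x' = R x + t is x = R^T x' - R^T t."""
    rot_t = [[rot[j][i] for j in range(3)] for i in range(3)]
    trans_inv = [-sum(rot_t[i][j] * trans[j] for j in range(3))
                 for i in range(3)]
    return rot_t, trans_inv


# Hypothetical screw-like operator: 120 degrees about z plus a 5 A shift.
rot = rotation_matrix((0.0, 0.0, 1.0), 120.0)
trans = [0.0, 0.0, 5.0]
moved = apply_operator(rot, trans, [1.0, 0.0, 0.0])
back = apply_operator(*invert_operator(rot, trans), moved)
print(moved, back)
```

For N related molecules one would hold N-1 such (R, t) pairs, each
relating molecule 1 to one of the others.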
AVERAGING STEPS
Prior to brute force cycling, run MAPAVG (using the original submap
from 3, and the corresponding mask from 9A or 7B) to generate an
"averaged" map. If the translation is small (or absent) use "SKEW" to
convert it so you can look down the NC symmetry axis. You can then use
"MAPVIEW" to view the map, and verify that averaging has indeed been
done successfully, that you are in fact looking down the NC symmetry
direction, and the axis goes through the origin. If so, proceed to
averaging cycles. If not, something went wrong earlier. Check program
inputs, outputs, polar axis conventions, etc.
At this point refined values of the noncrystallographic symmetry
operator(s) are available, along with envelope masks isolating the
regions to be averaged within the submap.
1) create the file "extrmap.d", which will specify what submap region
to extract from the FSFOUR map. It MUST correspond EXACTLY to the
same region used when creating the envelope masks. (You can read
the envelope mask header with RDHEAD if you forgot). Rename the
final mask file "asu.msk". See EXTRMAP write-up for information.
2) create the file "mapavg.d", to specify the transformation
operator(s) for averaging, and the envelope mask file. See MAPAVG
write-up for information.
3) create the file "bldcel.d", to specify the file names and options.
BLDCEL will take the "averaged" asymmetric unit submap from mapavg,
and build a complete cell FSFOUR style map from it. See BLDCEL
write-up.
4) Create the file "sloext.d" specifying phase extension information
and cycles to be performed (see SLOEXT write-up). If no phase
extension is to be done, make the upper and lower resolution
cutoffs identical and specify 16 cycles. Otherwise, specify the
resolution cutoffs and cycles per resolution increment, and run
MISSNG to create the "extrfl.d" file.
5) Create the file "extnd.d" specifying file names, extension options
and I/O type.
6) Verify that the phase files (phasit.31 and phi16cy.31), solvent
mask (mask3.14), and data files (bnd2.d, fft.d, minv2.d) from a
previous "doall" run are available.
7) Run the procedure "extndavg.sh". It will carry out the cycles of
NC symmetry averaging/solvent flattening/phase combination/phase
extension steps to combine "averaged" phases with the original MIR
phases.
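Because the script deletes and renames files as it goes, it can be
worth confirming that everything listed in steps 1-6 is in place
before starting. The following Python sketch is one hypothetical way to
do this and is not part of the PHASES distribution. The file names are
those given above; "extrfl.d" is checked only when phase extension is
requested.

```python
import os

REQUIRED_FILES = [
    "extrmap.d", "mapavg.d", "bldcel.d", "sloext.d", "extnd.d",  # steps 1-5
    "asu.msk",                                                   # envelope
    "phasit.31", "phi16cy.31", "mask3.14",                       # step 6
    "bnd2.d", "fft.d", "minv2.d",
]


def missing_files(directory=".", extension=True):
    """Return the names of required files not found in the directory."""
    names = list(REQUIRED_FILES)
    if extension:
        names.append("extrfl.d")  # written by MISSNG for phase extension
    return [name for name in names
            if not os.path.exists(os.path.join(directory, name))]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("missing before running extndavg.sh:", ", ".join(missing))
    else:
        print("all input files present")
```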
***** CREATING AVERAGING ENVELOPE MASKS FROM A MODEL *****
If coordinates from a tentative model are available, they can also
be used to create the averaging envelope masks. The procedure is
essentially that described in the CREATING/EDITING SOLVENT MASKS
section, with a couple of minor exceptions. First, after the initial
mask is constructed in MDLMSK and edited in MAPVIEW as described,
one is finished since unlike solvent masks, there is no need to make
"full cell" averaging masks with BLDCEL. Second, if the NC symmetry
operation involves arbitrary rotations and/or post rotation
translations, then MDLMSK must be run multiple times; once for each
NC symmetry related molecule. In each run a separate file should be
written and a different mask number must be used, but each file must
cover the same range (which is large enough to cover ALL copies). The
particular mask numbers used must be remembered as they will be needed
later when specifying which transformation operators are to be used in
MAPAVG to relate the molecules. The individual mask files should then
be edited and saved as described in the CREATING/EDITING SOLVENT MASKS
section. Once edited, the individual mask files must be combined into
a single mask file with program MRGMSK. The output file from MRGMSK
then can be used for averaging (i.e. as "asu.msk" in the "extndavg.sh"
procedure).
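The merging step can be pictured with the toy Python sketch below, in
which each per-molecule mask is a small grid holding 0 outside the
molecule and that molecule's mask number inside. This is only an
illustration of the idea: the real mask file format is different, and
how MRGMSK treats grid points claimed by two molecules is not stated
here, so raising an error on overlap is an assumption of this sketch.

```python
def merge_masks(masks):
    """Merge equally sized per-molecule masks into a single grid.

    Each mask holds 0 outside the molecule and its mask number inside.
    Raises ValueError if two different molecules claim one grid point.
    """
    merged = [[0] * len(row) for row in masks[0]]
    for mask in masks:
        for i, row in enumerate(mask):
            for j, value in enumerate(row):
                if value == 0:
                    continue
                if merged[i][j] not in (0, value):
                    raise ValueError("overlap at grid point (%d, %d)" % (i, j))
                merged[i][j] = value
    return merged


# Hypothetical masks for two NC symmetry related molecules.
molecule_1 = [[1, 1, 0, 0],
              [1, 1, 0, 0]]
molecule_2 = [[0, 0, 2, 2],
              [0, 0, 0, 2]]
print(merge_masks([molecule_1, molecule_2]))
```

All input masks must cover the same range, which is why each MDLMSK run
has to be given a range large enough to cover ALL copies.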
The output masks from MDLMSK or MRGMSK can be used for averaging
as long as a corresponding map region is provided. Thus the input used
to create the submaps ("extrmap.d" in the "extndavg.sh" procedure)
must specify the same range, and the map must have the same periods.
If the "extndavg.sh" procedure is to be used, then the solvent mask
must also be created from a map having the same periods. The output
masks can be examined/edited in MAPVIEW, again as long as the
corresponding map region (either explicitly selected from a FSFOUR map
in MAPVIEW, or previously extracted from a FSFOUR map by EXTRMAP and
input to MAPVIEW as a non-FSFOUR map) is provided. Once the output
masks from MDLMSK or MRGMSK are obtained, they can be used just like
any other averaging mask file, i.e. used for operator refinement in
LSQROT or LSQROTGEN, used in MAPAVG, manipulated in SKEW, BLDCEL etc.
10.01 AVERAGING WITH MULTIPLE CRYSTALS
All of the submap extraction and mask preparation steps used in
single crystal averaging as described earlier must be carried out
independently for each crystal, thus multiple submap and corresponding
mask files must be created. If a given crystal also contains multiple
NC symmetry related copies WITHIN IT, then the operators relating
molecule 1 to each of them must also be refined exactly as described
in the single crystal case. This will allow both intra and inter
crystal averaging to be carried out simultaneously. In addition, the
operators relating MOLECULE 1 in CRYSTAL 1 to MOLECULE 1 in EACH
OTHER CRYSTAL must also be refined. This can be done in program
LSQROTGEN by specifying the appropriate input. Once all of the
required operators and envelope masks are obtained, averaging can
proceed by specifying the appropriate input to program MAPAVG, and
by preparing the required input files for EXTRMAP and BLDCEL for
each crystal. A script file "extndavg_mc.sh" is supplied for multiple
crystal averaging in the case where there are two crystals. It
can easily be modified to include more crystals (up to 6), and
comments are embedded in it explaining where modifications are to
be made. The main difference in the procedure for multiple crystal
averaging is that all of the normal input files must be duplicated for
each crystal, and the standard file names for maps, masks, data files
etc must be modified to uniquely identify the appropriate crystal.
During each cycle of multiple crystal averaging the full cell maps are
created and the submaps are extracted independently for each crystal.
Then an averaged version (averaged over ALL copies) is created for
each submap. For each crystal, the averaged submap is then expanded to
its full cell version, solvent flattened, Fourier inverted and the
resulting phases and amplitudes combined probabilistically with the
appropriate MIR or SIR phases. Thus a new improved map can be obtained
for each crystal. With multiple crystal averaging however, there is
currently no facility for slow phase extension, thus the file
"sloext.d" is not needed and the number of cycles to be done is hard
wired into the "extndavg_mc.sh" script. Phase extension however, still
can be done. It's just that the appropriate cutoffs are supplied only
in the "extnd.d" files for each crystal, and are constant for all
iterations. One can of course, still extend the resolution gradually
by repeating the process iteratively with different cutoffs and
input files.
10.02 AVERAGING DIFFERENCE OR 2FO-FC MAPS
One usually does the averaging/solvent flattening iterations on
normal electron density maps, but in some cases it may be desirable
to average FO-FC or 2FO-FC maps. Examples might be when trying to
identify inhibitors, activators etc. soaked into known crystal
structures, or when trying to build up density for missing sections
of the macromolecule itself. This can be accomplished by proper
preparation of the input files, and changing the map type
specification in the fft.d input file. To do this one must assure
that both FO and FC are available on the INITIAL file (called
phi16cy.31 in the "extndavg.sh" or "extndavg_mc.sh" scripts) used to
create the first map, and on the OUTPUT file ("newphi.ref" or
"newphi_N.ref," etc.) produced in each iteration. For the output
files this is done by specifying IOTYP=1 in the BNDRY option 3 input
("bnd3.d" or "bnd3_N.d" etc.). For the INITIAL file, one could
obtain it from a single run of BNDRY, option 3, again specifying
IOTYP=1, or from a run of PHASIT, structure factor mode specifying
IHLCF=0 and ISIGA=0, depending on whether the phase information comes
from MIR type calculations or from atomic coordinates for a model.
Note however, that the "anchor" phase file (called "phasit.31" or
"phasit_N.31" etc.) which the map inverted phases will be combined
with MUST contain FM*FO and FO in the amplitude slots along with
probability distribution coefficients, as would be the case if the
file was created with a normal PHASIT run in protein phasing mode (or
structure factor mode if the "long format" output was requested). As
long as these files are properly prepared and the appropriate
coefficients are selected in the fft input, iterations using map
types involving FC's will be obtainable. One must be aware however,
that the final output file (phiextndavg.31) will then also have
FO and FC in the amplitude slots, and thus can only be used in
FSFOUR for straight or difference type Fouriers, and NOT for
figure of merit weighted Fouriers.
If one is averaging with molecular replacement derived phase
information and has already proceeded as described in the DENSITY
MODIFICATION WITH MOLECULAR REPLACEMENT DERIVED PHASE INFORMATION
section, i.e. the "doall" procedure has been run using the modified
inputs, one need only to change the filename "phasit.31" in the
"extnd.d" input file to "anchor.31". Then averaging can proceed with
the "extndavg.sh" script using all of the preexisting files, once the
averaging mask, extrmap.d, mapavg.d, bldcel.d, sloext.d (and possibly
extrfl.d) files are created.
10.03 SAMPLE INPUT FILES FOR AVERAGING
***** SAMPLE INPUT FILES FOR AVERAGING WITHIN ONE CRYSTAL *****
Sample input files for the averaging steps follow, along with a
listing of the supplied template command files "extndavg.sh" and
"extndavg.com". The command files can be used in place of the normal
"extnd.sh" or "extnd.com" file in a solvent levelling run. They will
perform additional cycles of averaging/solvent flattening/phase
combination/extension starting with the phases in file "phi16cy.31",
combining the "averaged" phases with MIR, SIR phase information in
file "phasit.31", and extending phases to additional amplitudes on
file "extrfl.d". They assume that all of the files needed for a
normal solvent flattening run (fft.d, bnd2.d, etc) are available, and
that the third mask from a previous run (mask3.14) is still available
for solvent flattening. If the template script file is to be used
unchanged, then all filenames should be EXACTLY as in the examples
(except for standard parameter file). Only the data relating to
submap ranges, resolution limits, number of cycles and the NC
operators should be changed. The final phases will be written to file
"phiextndavg.31", and printed information to "extndavg.l". The
procedure is run simply by entering "sh extndavg.sh" (UNIX) or
"@EXTNDAVG.COM" (VMS).
-- Sample input file extrmap.d or extrmap.dat, to extract submap ----
pdc.pam
four.map
asu.map
-.42 .45 -.45 .42 -.08 .56
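To see why the submap range and the map periods must agree with those
used for the envelope mask, the Python sketch below converts the six
limits in the sample above into grid extents. It assumes the limits are
minimum/maximum pairs along x, y and z and rounds outward to whole grid
points. Both are assumptions of this sketch, and the periods are
hypothetical; the EXTRMAP write-up gives the actual conventions.

```python
import math


def grid_bounds(frac_min, frac_max, period):
    """Return (first, last, count) of grid points covering the range."""
    first = math.floor(frac_min * period)
    last = math.ceil(frac_max * period)
    return first, last, last - first + 1


limits = [-.42, .45, -.45, .42, -.08, .56]   # from the sample extrmap.d
periods = (96, 96, 120)                      # hypothetical FSFOUR periods

for axis, period in zip(range(3), periods):
    low, high = limits[2 * axis], limits[2 * axis + 1]
    print("xyz"[axis], grid_bounds(low, high, period))
```

Changing either the periods or the limits changes the grid extents, and
a mask made on one grid would then no longer match the submap.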
--- Sample input file mapavg.d, for averaging over pure twofold -----
pdc.pam
1
asu.map
asu.msk
asu.avg
2 1 1
-102.16 83.81 180.0 1.082 -.746 .316 0.0
--- Sample input file bldcel.d, builds complete cell from averaged
submap ---
pdc.pam
four.map
avgcell.map
asu.avg
asu.msk
0
--- Sample input file extnd.d, specifying phase combination
data and options ---
pdc.pam
3
1 2.75 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref
Note that if one does NOT want to include phase extension, then the
first "1" should be changed to zero and the line containing
"extrfl.d" should be omitted (see BNDRY write-up).
--- Sample input file sloext.d, controlling no. of averaging cycles
and phase extension information ---
pdc.pam
3. 2.75 8
extnd.d
Note that if one does NOT want to include phase extension, then the
value of the two resolution limits should be made equal and the 8
changed to 16 to do a total of 16 averaging cycles (see SLOEXT
write-up).
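The first of the two notes above (the one for "extnd.d") can be
applied mechanically. The following is a minimal sketch, assuming the
layout of the sample "extnd.d" shown earlier (option flags on the
third line, the extra reflection file named on a line of its own) and
assuming that BNDRY reads the option line in free format, so that
respacing it is harmless. The function name no_extension_ctl and the
output name in the usage line are illustrative only.

```shell
# Sketch only: derive a "no phase extension" control file from extnd.d.
# no_extension_ctl is a hypothetical helper, not a PHASES program.
no_extension_ctl() {
    awk 'NR == 3 { $1 = "0" }     # clear the first flag on the option line
         /extrfl/ { next }        # drop the line naming the extra reflections
         { print }' "$1"
}
# Typical use:  no_extension_ctl extnd.d > extnd_noext.d
```

The corresponding change to "sloext.d" is not scripted here, since the
value to which both resolution limits are set is a choice for the user.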
************* procedure extndavg.sh ***************
# MODIFIED TO INCLUDE NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING
#
# RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
#
cp phi16cy.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext.d > extndavg.l
#
# PERFORM THE REFINEMENT/PHASE EXTENSION ITERATIONS USING
# THE THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while
test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extndavg.l
rm four.ref
#
# EXTRACT REGION FROM MAP APPROPRIATE FOR AVERAGING
extrmap < extrmap.d >> extndavg.l
#
# AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPE
mapavg < mapavg.d >> extndavg.l
rm asu.map
#
# REBUILD THE COMPLETE UNIT CELL
bldcel < bldcel.d >> extndavg.l
rm four.map asu.avg
mv avgcell.map four.map
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> extndavg.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extndavg.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES
#
bndry < extnd.d >> extndavg.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext.d >> extndavg.l
#
done
#
mv newphi.ref phiextndavg.31
mv minv.ref allcoef.31
# THATS ALL
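The loop in the procedure above runs for as long as the file
"EXTND.TMP" exists, and relies on SLOEXT to create and later delete
that file. The control flow can be tried out in isolation with the
following minimal sketch. Here fake_sloext is a hypothetical stand-in
for SLOEXT which simply counts calls in the sentinel file. The real
contents of "EXTND.TMP" are not described in this section and are
not assumed to look like this. Run the sketch in a scratch directory.

```shell
# Sketch only: the sentinel-file loop used by extndavg.sh, with a stub in
# place of SLOEXT. fake_sloext is hypothetical; the real program reads
# sloext.d and decides when EXTND.TMP should be removed.
fake_sloext() {
    limit=$1
    if [ ! -r EXTND.TMP ]; then
        echo 0 > EXTND.TMP                 # first call: create the sentinel
        return 0
    fi
    done_cycles=$(cat EXTND.TMP)
    done_cycles=$((done_cycles + 1))
    if [ "$done_cycles" -ge "$limit" ]; then
        rm EXTND.TMP                       # requested cycles done: end the loop
    else
        echo "$done_cycles" > EXTND.TMP
    fi
}

run_sentinel_loop() {
    limit=$1
    cycles=0
    fake_sloext "$limit"                   # corresponds to the call before the loop
    while test -r EXTND.TMP
    do
        cycles=$((cycles + 1))             # map/average/flatten/invert steps go here
        fake_sloext "$limit"               # corresponds to the call at the loop end
    done
    echo "$cycles"                         # number of passes through the loop
}
# Typical use:  run_sentinel_loop 8
```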
***** SAMPLE INPUT FILES FOR AVERAGING WITH MULTIPLE CRYSTALS *****
Sample input files for the averaging steps follow, along with a
listing of the supplied template command files "extndavg_mc.sh" and
"extndavg_mc.com". The command files can be used in place of the
normal "extnd.sh" or "extnd.com" file in a solvent levelling run.
It will perform 16 cycles of averaging/solvent flattening/phase
extension for each crystal starting with the phases in files
"phi16cy_1.31" and "phi16cy_2.31" for crystals 1 and 2, respectively,
and combining the "averaged" phases with MIR, SIR phase information
in files "phasit_1.31" and "phasit_2.31" for crystals 1 and 2,
respectively. It also will extend phases to additional amplitudes on
files "extrfl_1.d" and "extrfl_2.d" for crystals 1 and 2,
respectively. The script assumes that all of the files needed for a
normal solvent flattening run (fft.d, bnd2.d, etc) are available
for each crystal, and that the third mask from a previous run
(mask3.14) is still available for solvent flattening in each crystal.
In order to keep input data and files associated with the proper
crystal, the "normal" file names should have an "_N" inserted
immediately preceding the extension, i.e. fft_1.d, fft_2.d etc would
replace fft.d for crystals 1 and 2, respectively. If the template
script file is to be used unchanged, then all filenames should be
EXACTLY as in the examples (except for the standard parameter file).
Only the data relating to submap ranges and the NC operators should be
changed. The final phases will be written to files "phiextndavg_1.31",
and "phiextndavg_2.31" for crystals 1 and 2, respectively, and printed
information will be written to "extndavg_mc.l". The procedure is run
simply by entering "sh extndavg_mc.sh" (UNIX) or "@EXTNDAVG_MC.COM"
(VMS). For each crystal it assumes that the following files exist
where "N" is replaced by the crystal number, and that the file
"mapavg_mc.d" exists to control the averaging.
phi16cy_N.31   Starting phases, to get first map
fft_N.d        fft grid info
extrmap_N.d    submap extraction info
asu_N.msk      averaging mask
bldcel_N.d     info for reconstruction of full cell map from submap
bnd2_N.d       solvent flattening info
mask3_N.14     solvent flattening mask
minv2_N.d      map inversion info
extnd_N.d      phase combination info
phasit_N.31    Anchor phases, to be combined with map inverted phases
extrfl_N.d     Additional reflections, if phase extension requested.
Additionally, it is assumed that ALL file names (apart from the
parameter files) REFERENCED WITHIN the control files above (e.g.
files referred to within fft_N.d, extrmap_N.d, bldcel_N.d, bnd2_N.d,
minv2_N.d and extnd_N.d) also include the appropriate "_N" insertion
modifying the "standard" file names to distinguish data for different
crystals. Some examples are given below.
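The "_N" naming rule can be expressed as a small helper, which is
also convenient for checking that the per-crystal files are in place
before a run. The following is a minimal sketch. The function names
are illustrative only, and the solvent mask is taken to be
"mask3_N.14", the name that the script listing links from.

```shell
# Sketch only: insert "_N" just before the file extension.
# crystal_name is a hypothetical helper, not a PHASES program.
crystal_name() {
    name=$1
    n=$2
    case $name in
        *.*) echo "${name%.*}_${n}.${name##*.}" ;;
        *)   echo "${name}_${n}" ;;          # no extension: append (an assumption)
    esac
}

# List the per-crystal files that the script expects but cannot read.
# extrfl_N.d is left out since it is needed only for phase extension.
missing_crystal_files() {
    crystal=$1
    for base in phi16cy.31 fft.d extrmap.d asu.msk bldcel.d bnd2.d mask3.14 \
                minv2.d extnd.d phasit.31
    do
        f=$(crystal_name "$base" "$crystal")
        [ -r "$f" ] || echo "$f"
    done
}
# Typical use:  missing_crystal_files 1; missing_crystal_files 2
```

An empty listing for every crystal means all of the per-crystal files
named above are readable. The single file "mapavg_mc.d" is not covered
by this check.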
-- Sample input files fft_1.d and fft_2.d for two crystals ---
pdc1.pam                       pdc2.pam
COMPUTE DENSITY MAP            COMPUTE DENSITY MAP
0 144 80 120 1 0 20 0 0 0 0    0 80 128 120 1 0 20 0 0 0 0
four_1.ref                     four_2.ref
four_1.map                     four_2.map
-- Sample input files extrmap_1.d and extrmap_2.d ----
pdc1.pam                       pdc2.pam
four_1.map                     four_2.map
asu_1.map                      asu_2.map
-.42 .49 -.56 .59 -.13 .63     -.62 .77 -.31 .35 -.25 .92
--- Sample input file mapavg_mc.d, for averaging over two copies
in crystal 1 and four copies in crystal 2 ---
pdc1.pam
2
asu_1.map
asu_1.msk
asu_1.avg
2 1 2
78.140 95.988 179.646 0.796 -0.67 0.239 0.152
asu_2.map
asu_2.msk
asu_2.avg
4 1 2 3 4
77.369 83.131 179.597 9.935 3.468 2.652 0.194
163.989 115.624 -177.918 -7.586 -5.270 35.599 0.333
184.007 27.036 180.196 1.551 -.575 38.236 0.451
283.568 92.472 -28.468 -5.787 -15.019 0.729 37.247
--- Sample input files bldcel_1.d and bldcel_2.d, builds complete
cell from averaged submaps for each crystal ---
pdc1.pam                       pdc2.pam
four_1.map                     four_2.map
avgcell_1.map                  avgcell_2.map
asu_1.avg                      asu_2.avg
asu_1.msk                      asu_2.msk
0                              0
************* procedure extndavg_mc.sh ***************
# SCRIPT FOR NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING IN THE CASE
# WHERE MULTIPLE CRYSTAL FORMS ARE USED
#
# This sample script is appropriate for the case where two crystals
# are to be averaged. It can readily be modified to include more
# crystals, by making additions in the four places as indicated.
#
# The single file "mapavg_mc.d" containing input for the mapavg
# program is assumed to be present to control the multi-crystal map
# averaging process. In addition, a series of files specific to each
# crystal is needed as described below.
#
# For each of the "N" crystals the following files are assumed to
# exist, where the "N" in the file name is to be replaced by the
# crystal number, i.e. 1, 2, 3, etc.
#
# phi16cy_N.31 Starting phases, to get first map
# fft_N.d fft grid info
# extrmap_N.d submap extraction info
# asu_N.msk averaging mask
# bldcel_N.d info for reconstruction of full cell map
# bnd2_N.d solvent flattening info
# mask3_N.14 solvent flattening mask
# minv2_N.d map inversion info
# extnd_N.d phase combination info
# phasit_N.31 Anchor phases, for combining with inverted phases
# extrfl_N.d Additional reflections, if phase extension requested
#
# Also, to distinguish data specific for each crystal all file
# names (other than the parameter files) REFERENCED WITHIN the files
# above should also have an "_N" inserted just prior to the file
# extension, where N is the crystal number.
#
# INITIALIZE TEMPORARY FILE NAMES FOR EACH CRYSTAL
#
# INITIALIZATION FOR CRYSTAL 1
cp phi16cy_1.31 newphi_1.ref
ln phasit_1.31 minv_1.ref
#
# INITIALIZATION FOR CRYSTAL 2
cp phi16cy_2.31 newphi_2.ref
ln phasit_2.31 minv_2.ref
#
# REPEAT ABOVE 4 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
#
# PERFORM 16 CYCLES OF PHASE EXTENSION/AVERAGING, USING THIRD
# MASK FOR EACH CRYSTAL
for cycle
in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
do
#
# COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 1
rm minv_1.ref
mv newphi_1.ref four_1.ref
fsfour < fft_1.d >> extndavg_mc.l
rm four_1.ref
#
extrmap < extrmap_1.d >> extndavg_mc.l
#
#
# COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 2
rm minv_2.ref
mv newphi_2.ref four_2.ref
fsfour < fft_2.d >> extndavg_mc.l
rm four_2.ref
#
extrmap < extrmap_2.d >> extndavg_mc.l
#
# REPEAT ABOVE 8 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
#
#
# HAVE ALL THE NECESSARY MAPS, NOW DO THE AVERAGING. (THIS STEP
# DONE ONLY ONCE, SINCE MAPAVG HANDLES ALL CRYSTALS AT SAME TIME)
#
# AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPES
mapavg < mapavg_mc.d >> extndavg_mc.l
#
#
#
# NOW DO THE SOLVENT FLATTENING, INVERSION AND PHASE COMBINATION
# SEPARATELY FOR EACH CRYSTAL
#
# REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 1
rm asu_1.map
bldcel < bldcel_1.d >> extndavg_mc.l
rm four_1.map asu_1.avg
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
# CRYSTAL 1
ln mask3_1.14 mask_1.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2_1.d >> extndavg_mc.l
rm avgcell_1.map mask_1.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 1
mapinv < minv2_1.d >> extndavg_mc.l
rm mod_1.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 1
bndry < extnd_1.d >> extndavg_mc.l
#
#
# REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 2
rm asu_2.map
bldcel < bldcel_2.d >> extndavg_mc.l
rm four_2.map asu_2.avg
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
# CRYSTAL 2
ln mask3_2.14 mask_2.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2_2.d >> extndavg_mc.l
rm avgcell_2.map mask_2.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 2
mapinv < minv2_2.d >> extndavg_mc.l
rm mod_2.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 2
bndry < extnd_2.d >> extndavg_mc.l
#
# REPEAT ABOVE 20 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
done
#
# RENAME THE FINAL OUTPUT PHASE FILES FOR EACH CRYSTAL
#
# FOR CRYSTAL 1
mv newphi_1.ref phiextndavg_1.31
mv minv_1.ref allcoef_1.31
#
# FOR CRYSTAL 2
mv newphi_2.ref phiextndavg_2.31
mv minv_2.ref allcoef_2.31
#
# REPEAT ABOVE 4 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
# THATS ALL
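Rather than repeating the per-crystal blocks by hand for each
additional crystal, the repeated commands can be placed in loops over
the crystal numbers. The following is a minimal sketch of the
initialization and final renaming steps only. CRYSTALS is a
hypothetical variable introduced here, and the file names follow the
"_N" convention used in the procedure above.

```shell
# Sketch only: loop forms of the repeated per-crystal blocks in
# extndavg_mc.sh. CRYSTALS is a hypothetical variable listing the crystal
# numbers; it is not used by the supplied script.
CRYSTALS="1 2"

# Set up the temporary file names for every crystal.
init_crystals() {
    for n in $CRYSTALS
    do
        cp "phi16cy_$n.31" "newphi_$n.ref" || return 1
        ln "phasit_$n.31" "minv_$n.ref" || return 1
    done
}

# Rename the final output phase files for every crystal.
finish_crystals() {
    for n in $CRYSTALS
    do
        mv "newphi_$n.ref" "phiextndavg_$n.31" || return 1
        mv "minv_$n.ref" "allcoef_$n.31" || return 1
    done
}
```

The per-crystal steps inside the 16 cycle loop could be wrapped in
the same way, keeping the single MAPAVG call between the map
extraction loop and the flattening loop.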
11.00 DENSITY MODIFICATION WITH MOLECULAR REPLACEMENT
DERIVED PHASE INFORMATION
When the initial source of phase information is from a model
derived by molecular replacement techniques, it is still sometimes
desirable to improve the phases by solvent flattening and/or NC
symmetry averaging. This may be the case when the molecular replacement
derived model represents only a fraction of the asymmetric unit
contents, and the missing parts of the structure must still be found.
In such cases the solvent flattening etc. iterations require creation
of phase probability distribution coefficients for the partial
structure model to be used as "anchor" phases in the phase combination
step. Also, it is desirable to do the iterations on 2FO-FC maps rather
than on the normal FOM*FO maps, since the "missing" parts of the
structure will then contribute more to the maps. Thus one must assure
that both Fo and Fc are present on the phase files. This all can be
accomplished without modification to the "doall" script by doing the
following:
1) Generate phases from the partial structure in PHASIT, structure
factor calculation mode, requesting the "short form" output, i.e.
using IHLCF=0, ISIGA=0. This file contains both Fo and Fc, and should
be called "phasit.31". (Don't worry that the file does not contain
distribution coefficients as a normal "phasit.31" file would; in this
case the file will be used only to seed the process by creating the
first map.)
2) Generate the same phases again from the partial structure in
PHASIT, structure factor calculation mode, but this time request the
"long form" output, i.e. using IHLCF=1, ISIGA=0 or 1. Call this file
"anchor.31". It includes probability distribution coefficients and
will serve as the "anchor" phases, with which map inverted phase
information will be combined on each iteration.
3) Modify the fft.d input file to request a 2Fo-Fc map instead of
an Fo map.
4) Modify the bnd3.d (and possibly extnd.d) input files to specify
"anchor.31" to be used instead of "phasit.31" for the anchor phase
set, and set IOTYP=1 so that both Fo and Fc appear on the output file.
5) Modify the rmhv.d input file to specify that no heavy atoms,
i.e. 0 input atoms, are used.
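The file setup from the five steps above can be sanity-checked before
starting. The following is a minimal sketch. The function name
check_mr_setup is illustrative only, and the check looks only at file
names; it does not verify option values such as IHLCF, IOTYP or the
map type requested in "fft.d".

```shell
# Sketch only: check the file setup from steps 1-5 before running "doall".
# check_mr_setup is a hypothetical helper; it tests only what can be seen
# from file names, not the option values inside the control files.
check_mr_setup() {
    status=0
    for f in phasit.31 anchor.31 fft.d bnd3.d rmhv.d
    do
        [ -r "$f" ] || { echo "missing: $f"; status=1; }
    done
    # Step 4: bnd3.d should name anchor.31 as the anchor phase set.
    if [ -r bnd3.d ] && ! grep -q 'anchor\.31' bnd3.d; then
        echo "bnd3.d does not mention anchor.31"
        status=1
    fi
    return $status
}
# Typical use:  check_mr_setup && echo "file setup looks complete"
```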
You can now run the "doall" procedure, and 2Fo-Fc maps will be used
for all calculations including those used during mask construction.
Note however, that the output files (phi4cy.31, phi8cy.31, phi16cy.31
etc.) will now contain Fo and Fc in the amplitude slots instead of
the normal fom*fo and fo, thus they can NOT be used to compute figure
of merit weighted maps in FSFOUR. They can however, be used for Fo
and difference type maps. The figure of merit is still present in
the file, but it will not be applied in FSFOUR.
Finally, if non-crystallographic symmetry averaging is to be
performed in addition to solvent flattening one can continue the
process as described in the AVERAGING DIFFERENCE or 2FO-FC MAPS
section.
12.00 PHASE EXTENSION
Phases (and optionally amplitudes) can be extended, either to
higher resolution or to missing reflections within the original
resolution limit, by modifying an electron density map created with
the known phases, inverting the modified map and combining the map
inverted structure factors with the initial data via option 3 of the
BNDRY program. To extend phases to reflections for which amplitudes
are available, a file must first be created using the program MISSNG,
which generates a list of reflections (file "extrfl.d") for which at
least amplitudes are available, and possibly phase information. Then
the file "sloext.d" must be created (see write-up for SLOEXT) to
control the limits and rate of phase extension, and the file "extnd.d"
or "extnda.d" must be created to control the phase combination step in
BNDRY. Once these files are created, and assuming that the final
solvent mask (mask3.14), phase files (phasit.31 & phi16cy.31), and
input data files (e.g. fft.d, minv2.d, bnd2.d) from a prior "doall"
solvent flattening run are still available, one can execute the
"extnd.sh" script to carry out the phase extension iterations. The
resolution is gradually extended out to the limit specified in the
sloext.d file, with the final phases written to "phiextnd.31". One can
do phase AND AMPLITUDE extension similarly, by preparing "extnda.d"
and "sloext2.d" files, and executing "extnda.sh".
The phase extension process is even more powerful if density
modification in addition to solvent flattening/negative density
truncation is included, such as noncrystallographic symmetry averaging.
In such cases the script "extndavg.sh" can be used, which requires all
of the files utilized by "extnd.sh", plus the "extrmap.d", "mapavg.d",
"bldcel.d" and "asu.msk" files needed for NC symmetry averaging (see
the noncrystallographic symmetry section of the write-up).
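Since both scripts fail partway through if an input file is absent,
it can be worth listing the missing files first. The following is a
minimal sketch using the file names given in this section. The
function name check_extension_files and its "avg" argument are
illustrative only.

```shell
# Sketch only: list unreadable files before running extnd.sh or extndavg.sh.
# File names are those given in this section; pass "avg" as the first
# argument to include the extra files needed for NC symmetry averaging.
check_extension_files() {
    files="extrfl.d sloext.d extnd.d mask3.14 phasit.31 phi16cy.31"
    files="$files fft.d minv2.d bnd2.d"
    if [ "${1:-}" = "avg" ]; then
        files="$files extrmap.d mapavg.d bldcel.d asu.msk"
    fi
    status=0
    for f in $files
    do
        [ -r "$f" ] || { echo "missing: $f"; status=1; }
    done
    return $status
}
# Typical use:  check_extension_files avg && sh extndavg.sh
```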
When doing phase extension to higher resolution it is important to
compute the map on a grid whose spacing is NO COARSER than one third
of the smallest d spacing to be encountered anywhere in the process,
and to specify Miller index ranges during map inversion ("minv2.d"
file) which satisfy the highest resolution desired. Note that if the
extension
is substantial, this may require regenerating the solvent mask
(and therefore the averaging mask "asu.msk") on a finer grid than
was originally used. Best results are obtained when the extension
is carried out slowly, with at least 5 iterations per extension
step. Sample scripts and template files are provided for both UNIX
systems (as described here), and VMS systems (with the corresponding
data files having ".dat" extensions and the control files having
".com" extensions). Samples are also given in the EXAMPLES section.
13.00 MAD PHASING
The PHASES package can be used to determine phase angles from MAD
(Multiple wavelength Anomalous Dispersion) data, by treating the data
from each wavelength as a native anomalous scattering, isomorphous
replacement or derivative anomalous scattering data set, and then
combining information from all sets in the conventional manner. This
is facilitated by the ability to input scattering factor information
to the PHASIT program. One simply adjusts the input scattering
factors and data appropriately for the wavelength and data set type
desired. For example, consider the case where data has been measured
at three wavelengths, and Bijvoet pairs were measured in all sets. A
reasonable strategy would be to:
1) Select one of the data sets to be the "native." For this set we
would prefer no Bijvoet signal to be present, so we might pick
a wavelength where delta f" is near zero. Note however, that we can
use any wavelength, even if delta f" is large, provided we first
AVERAGE both members of the Bijvoet pair for each acentric reflection.
Indeed, it is most desirable to choose a wavelength where delta f'
is a minimum or maximum, allowing delta f" to be appreciable since its
effects can be reduced or removed by the averaging. Thus if delta f"
is large be sure to include acentric reflections ONLY IF BOTH MEMBERS
OF THE BIJVOET PAIR WERE EXPLICITLY MEASURED AND AVERAGED! This is the
data set that will actually be phased and eventually used for map
calculations, thus it should include the centric reflections as well.
All other sets will be scaled to it. Since the REAL part of the
anomalous scattering correction is NOT removed by the averaging, we
will need to know what delta f' is at this wavelength. Let the real
and imaginary components of the anomalous dispersion scattering
factor corrections at this wavelength be called delta f'(N) and
delta f"(N).
2) Select another data set at a wavelength D1 which maximizes the
magnitude of ( delta f'(D1) - delta f'(N) ). For this set, average
the Bijvoet pairs for all acentric reflections as in (1), to remove
the contribution from delta f"(D1), and include the centric data
as well. This set can then be merged with the "native" in CMBISO
to form an "isomorphous" derivative scaled set. It can then be used
in PHASIT as an SIR data set, but since the only difference in the
scattering between this and the "native" is due to differences in
delta f', you must input the appropriate scattering factors. Thus
input zeroes for the 9 normal scattering factor coefficients, but
input ( delta f'(D1) - delta f'(N) ) for the REAL part, and
delta f"(D1) for the IMAGINARY part of the anomalous correction.
Be careful of the sign when doing the subtraction. (Note that the
delta f"(D1) term will not be used in the SIR calculation, thus it
does not have to be input as zero which you might have expected).
3) Select another data set at a wavelength D2 which maximizes
delta f"(D2). For this set, DO NOT average the Bijvoet pairs. Simply
merge the data with the "native" set (created in (1)) with CMBANO to
generate a scaled "anomalous" set. The output data can then be used
in PHASIT as a "derivative anomalous scattering" data set, but since
the difference in scattering between this and the "native" is due to
both the difference in delta f' and the effect of delta f"(D2),
you must again adjust the scattering factors accordingly. Input zeros
for the 9 normal scattering factor coefficients, and input
( delta f'(D2) - delta f'(N) ) for the REAL and delta f"(D2) for the
IMAGINARY parts of the anomalous scattering correction. Again, be
careful of the sign when doing the subtraction. In this case both
the real and imaginary components will be used.
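Because the sign of the subtraction is easy to get wrong, the two
numbers entered in PHASIT for a "derivative" set in steps 2) and 3)
can be formed with a small helper. The following is a minimal sketch.
The function name mad_corrections is illustrative only, and the
numbers in the comments are made-up values, not measured scattering
factors.

```shell
# Sketch only: form the anomalous corrections entered in PHASIT for a MAD
# "derivative" set, as described in steps 2) and 3).
#   real part      = delta f'(D) - delta f'(N)
#   imaginary part = delta f"(D)
# mad_corrections is a hypothetical helper; its arguments are
#   delta f'(N)   delta f'(D)   delta f"(D)
mad_corrections() {
    awk -v fpn="$1" -v fpd="$2" -v fppd="$3" 'BEGIN {
        printf "%.2f %.2f\n", fpd - fpn, fppd
    }'
}
# Illustrative (made-up) values: f'(N) = -4.0, f'(D) = -10.0, f"(D) = 3.5
#   mad_corrections -4.0 -10.0 3.5      prints  -6.00 3.50
```

The first number printed is the REAL part and the second the
IMAGINARY part of the anomalous scattering correction.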
The "isomorphous" and/or "anomalous" scaled files prepared in (2)
and (3) can initially be used in the normal manner to locate the
anomalously scattering atoms from difference or Bijvoet difference
Patterson maps (see flowchart section), and possibly for initial
heavy atom refinement in GREF. Note however, that if one is using
"isomorphous" data sets as described in (2) above with program
MRGDF to compute difference or cross difference Fouriers and the
sign of ( delta f'(D1) - delta f'(N) ) is negative, then peaks in
the map corresponding to the isomorphous scatterers should also be
negative. In that case one can then request that program PSRCH list
only negative peaks to check the sites. This is necessary ONLY for
ISOMORPHOUS data sets in which the REAL part of the derivative
minus native scattering factor is expected to be negative. The
same thing holds for "difference" or "double difference" Fouriers
computed with the difference files generated by program PHASIT
when MAD data sets are used and the file corresponds to an
ISOMORPHOUS data set with (delta f'(D) - delta f'(N)) negative.
Once the anomalous scatterers have been found, the isomorphous
and anomalous scaled data from (2) and (3) can be used simultaneously
in PHASIT to compute SIRAS phases, which can then be used for map
computations or solvent flattening in the normal manner. One should
first carry out phase refinement in PHASIT, refining the heavy atom
parameters and scale factors. If the wavelengths were chosen as
described above these two sets should provide the greatest phasing
power, since the isomorphous and anomalous signals were maximized
in each case. It is not necessarily all that can be done however.
For example, the same data set (averaged Bijvoet mates) which was
utilized in (2) to create an "isomorphous" set, can also be
processed (without averaging Bijvoet mates) as in (3) to create
another "derivative anomalous scattering" set. In that case the
appropriate anomalous scattering correction factors would be
( delta f'(D1) - delta f'(N) ) and delta f"(D1). Likewise, the
data set (unaveraged Bijvoet mates) used in (3) to get the
derivative anomalous set, can also be processed (after averaging
Bijvoet mates) as in (2) to get another "isomorphous" set. For the
new isomorphous set the appropriate anomalous scattering correction
factors would be ( delta f'(D2) - delta f'(N) ) and delta f"(D2).
Finally, if the original "native" data set was collected at a
wavelength where delta f"(N) is appreciable, then it too can also be
included (without averaging Bijvoet mates) as a "native anomalous
scattering" data set. In that case the appropriate anomalous
scattering correction factors would be delta f'(N) and delta f"(N).
(Note that the real part will not be used in the calculations, and
in PHASIT native anomalous scattering data sets should come last in
the input). Thus with data at three wavelengths, one can combine up
to 5 different sources of phase information in PHASIT: two
isomorphous sets, two derivative anomalous scattering sets, and one
native anomalous scattering set. Although some of these sets provide
essentially the same (redundant) information, the experimental errors
will be different in each set, thus inclusion of all of them may still
be helpful. If more than the 2 optimal sets are used, the figures of
merit will be somewhat overestimated, but the phases still can
improve a little. It may be useful to try various combinations.
MAD PHASING AT TWO WAVELENGTHS
A procedure similar to that above can be used when anomalous
scattering data has been collected only at two wavelengths. In
that case a "native" set is selected and processed as in (1), to
remove contributions from delta f"(N). The other data set is
first processed and merged with the native as in (2) to create an
"isomorphous" set, and then processed and merged again (this time
NOT averaging Bijvoet mates) as in (3) to create a "derivative
anomalous scattering" set. If the "native" set was taken at a
wavelength where delta f"(N) is appreciable, then the original
"native" data (this time WITHOUT averaging Bijvoet mates) can also
be used as a "native anomalous scattering" data set. Thus even
with data at only two wavelengths it is possible to obtain and
combine phase information from three sources, native anomalous
scattering, derivative isomorphous replacement and derivative
anomalous scattering.
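Both procedures above depend on averaging Bijvoet mates, and on
keeping an acentric reflection only when both mates were explicitly
measured. The rule can be illustrated with the following minimal
sketch. It is not the method used by the package's own merging
programs. The input layout (h, k, l, then the two mates, with a zero
marking an unmeasured mate) is an assumption made for this example,
and the unweighted mean is used because no weighting scheme is
specified in this section.

```shell
# Sketch only: average Bijvoet mates for acentric reflections.
# Assumed input, one reflection per line:   h k l  F(+)  F(-)
# with 0 standing for a mate that was not measured (an assumption).
# Reflections lacking either mate are dropped rather than passed through.
average_bijvoet() {
    awk 'NF == 5 && $4 > 0 && $5 > 0 {
        printf "%d %d %d %.2f\n", $1, $2, $3, ($4 + $5) / 2
    }' "$@"
}
# Typical use:  average_bijvoet acentric.lst > averaged.lst
```

The file names in the usage line are placeholders. Centric
reflections would be handled separately, since they have no distinct
Bijvoet mate.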
14.00 VMS USER INFORMATION
Use of the PHASES package on VMS systems is very similar to its use
on UNIX systems, except that command files ".com" are used instead of
".sh" shell scripts. In all cases the programs function identically,
and all input and output is the same. A command procedure to be
executed from each user's login.com file will define all of the programs
so that they can be run simply by entering the program name (as in
UNIX systems). However, one can not use the UNIX input and output
redirection operators (<, <<, >, >>), so that for the non-interactive
programs input data must either immediately follow the program name
(on subsequent lines), or come from a file which has been "ASSIGNed"
to FOR005 (and "DEASSIGNed" upon program completion). Likewise, to
direct standard output to a file one must ASSIGN the file to FOR006.
Also, the different byte order and floating point format makes it
difficult, if not impossible, to use any binary files on other
computer systems. This is not normally a problem since the only binary
files which typically need to be transferred are graphics map files for
use with programs TOM, O or CHAIN on graphics workstations. The VMS
version of program GMAP which creates these files contains special
code within it such that the binary map files it produces may be
transferred (via ftp, type binary) and used DIRECTLY on the Silicon
Graphics or ESV workstations where the graphics programs will be run.
Installing the PHASES package on a VMS system will involve four
steps. The first three steps should be done only once, whereas
the last step must be repeated any time a new user of the package
is added to the computer system. The steps are:
1) Edit the file "set_phases.com", which is present in the parent
directory as distributed. Only one line has to be changed, so that
the logical name "PHASES_DIR" points to the parent directory where
the software resides.
2) If one is on a DEC or VAX station instead of an ALPHA workstation,
one MAY have to edit the file "XSTUFF.OPT" (in the [.src] directory
below the parent directory) to point to the directory where the
system's X-Window object libraries are located. The supplied version
is appropriate for ALPHA workstations, but may need to be changed
for other workstations.
3) From the parent directory, type @BUILDIT.COM to invoke the
compilation and linking.
4) Have each user of the package insert the line
$@DISK:[DIRECTORY]SET_PHASES.COM
in his/her login.com file. Note that the DISK and DIRECTORY should
be changed to point to the appropriate PHASES parent directory as
in (1).
The installation will result in files being deposited in four
subdirectories below the parent directory. The subdirectories contain
the program source files, executables, write-up and sample template
files. The four directories have the logical names PHASES_SRC,
PHASES_EXE, PHASES_DOC and PHASES_TEMPL, respectively. Users may find
it desirable to copy the write-up and sample template files to their
own directory, thus the commands
COPY PHASES_DOC:PHASES.WUP *
COPY PHASES_TEMPL:*.com *
COPY PHASES_TEMPL:*.DAT *
will probably be useful. It is also recommended that at least one copy
of the PHASES.WUP manual be printed, but beware as it is large
(roughly 190 pages).
After the installation and execution of the login.com file, any
program in the package can be run simply by typing its name. Note
that if running in batch mode, one will have to insert the line
$SET DEFAULT DISK:[DIRECTORY]
in the beginning of every ".com" file, where DISK and DIRECTORY are
replaced by the user's working disk and directory, so that the user's
input data files can be found.
15.00 UNIX SCRIPTS INVOKED BY THE DOALL SCRIPT
--- procedure mask1.sh ---
# COMPUTE ORIGINAL ELECTRON DENSITY MAP
#
ln phasit.31 four.ref
fsfour < fft.d > mask1.l
mv four.map orig.map
rm four.ref
ln orig.map four.map
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
#
rmheavy < rmhv.d >> mask1.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask1.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask1.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask1.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask1.l
mv mask.map mask1.14
rm four.map
# THATS ALL
--- procedure cycle4.sh ---
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
#
# use .06 for 3A data and .086 for 3.5 and .112 for 4.A
mv orig.map four.map
ln mask1.14 mask.map
bndry < bnd2.d > cycle4.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle4.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle4.l
#
#
# PERFORM 3 MORE CYCLES OF REFINEMENT
#
for cycle
in 1 2 3
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle4.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask1.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> cycle4.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle4.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle4.l
done
mv newphi.ref phi4cy.31
mv minv.ref allcoef.31
# THATS ALL
--- procedure mask2.sh ---
#
# COMPUTE ELECTRON DENSITY MAP
#
ln phi4cy.31 four.ref
fsfour < fft.d > mask2.l
rm four.ref
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
rmheavy < rmhv.d >> mask2.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask2.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask2.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask2.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask2.l
mv mask.map mask2.14
rm four.map
# THATS ALL
--- procedure cycle8.sh ---
#
# START OVER USING NEW MASK
#
cp phasit.31 newphi.ref
ln phasit.31 minv.ref
#
#
# PERFORM 4 CYCLES OF REFINEMENT, USING SECOND MASK
#
for cycle
in 1 2 3 4
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle8.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask2.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> cycle8.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle8.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle8.l
done
mv newphi.ref phi8cy.31
mv minv.ref allcoef.31
# THATS ALL
--- procedure mask3.sh ---
#
# COMPUTE ELECTRON DENSITY MAP
#
ln phi8cy.31 four.ref
fsfour < fft.d > mask3.l
rm four.ref
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
rmheavy < rmhv.d >> mask3.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask3.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask3.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask3.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask3.l
mv mask.map mask3.14
rm four.map
# THAT'S ALL
--- procedure cycle16.sh ---
#
# START OVER USING NEW MASK
#
cp phasit.31 newphi.ref
ln phasit.31 minv.ref
#
#
# PERFORM 8 CYCLES OF REFINEMENT, USING THIRD MASK
#
for cycle
in 1 2 3 4 5 6 7 8
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle16.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> cycle16.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle16.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle16.l
done
mv newphi.ref phi16cy.31
mv minv.ref allcoef.31
# THAT'S ALL
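The cycle8.sh and cycle16.sh procedures above differ only in the number of
cycles, the mask file, the log file and the name given to the final phase
file. As a minimal sketch, the shared loop could be written once as a shell
function. The name run_cycles and its argument order are assumptions made for
this illustration and are not part of the package; the commands inside the
loop are copied from the listings above.

```shell
#!/bin/sh
# Hypothetical helper (not part of PHASES):
#   run_cycles NCYCLES MASKFILE LOGFILE OUTPHASES
# Runs the map / modify / invert / combine loop NCYCLES times.
run_cycles() {
    ncycles=$1
    mask=$2
    log=$3
    out=$4
    # Start over from the original phases, as the procedures above do.
    cp phasit.31 newphi.ref
    ln phasit.31 minv.ref
    i=1
    while [ "$i" -le "$ncycles" ]
    do
        # Compute electron density map from combined phase information.
        rm minv.ref
        mv newphi.ref four.ref
        fsfour < fft.d >> "$log"
        rm four.ref
        # Modify electron density map according to mask.
        ln "$mask" mask.map
        bndry < bnd2.d >> "$log"
        rm four.map mask.map
        # Invert modified map to obtain structure factors.
        mapinv < minv2.d >> "$log"
        rm mod.map
        # Combine prior phase information with that from map inversion.
        bndry < bnd3.d >> "$log"
        i=$((i + 1))
    done
    mv newphi.ref "$out"
    mv minv.ref allcoef.31
}
```

Under these assumptions, "run_cycles 4 mask2.14 cycle8.l phi8cy.31" would
mirror cycle8.sh and "run_cycles 8 mask3.14 cycle16.l phi16cy.31" would mirror
cycle16.sh.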
--- procedure extnd.sh ---
#
# RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
#
cp phi16cy.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext.d > extnd.l
#
# PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
# THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while
test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extnd.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> extnd.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extnd.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < extnd.d >> extnd.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext.d >> extnd.l
#
done
#
mv newphi.ref phiextnd.31
mv minv.ref allcoef.31
# THAT'S ALL
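The extnd.sh loop above is controlled by a file: it keeps iterating while
EXTND.TMP is readable, and the controlling program removes that file when the
work is finished. The sketch below shows only this loop-control pattern. The
function step_control is a hypothetical stand-in that counts iterations down
to zero; it is not sloext, which according to the comments above also adjusts
the resolution cutoff.

```shell
#!/bin/sh
# Illustration of a loop controlled by a sentinel file.
SENTINEL=EXTND.TMP

# Hypothetical stand-in for the controlling program.
# $1 = number of iterations wanted (used only when the file does not exist).
step_control() {
    if [ ! -r "$SENTINEL" ]; then
        # First call: create the control file holding the remaining count.
        if [ "$1" -gt 0 ]; then echo "$1" > "$SENTINEL"; fi
        return 0
    fi
    left=$(cat "$SENTINEL")
    left=$((left - 1))
    if [ "$left" -le 0 ]; then
        rm "$SENTINEL"              # finished: removing the file ends the loop
    else
        echo "$left" > "$SENTINEL"  # record how many iterations remain
    fi
}

# Runs the controlled loop and prints how many iterations were performed.
run_extension() {
    total=$1
    performed=0
    step_control "$total"           # plays the role of the call before the loop
    while test -r "$SENTINEL"
    do
        performed=$((performed + 1))  # the map/modify/invert/combine steps go here
        step_control "$total"         # plays the role of the call inside the loop
    done
    echo "$performed"
}
```

With this stand-in, "run_extension 3" performs three passes and leaves no
EXTND.TMP behind.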
--- procedure extnda.sh ---
#
# RESUME WHERE WE LEFT OFF
#
cp phiextnd.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext2.d > extnda.l
#
# PERFORM THE PHASE AND AMPLITUDE EXTENSION ITERATIONS USING
# THE THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while
test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extnda.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> extnda.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extnda.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < extnda.d >> extnda.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext2.d >> extnda.l
#
done
#
mv newphi.ref phiextnda.31
mv minv.ref allcoef.31
# THAT'S ALL
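Each procedure above consumes a file written by the one before it (for
example, mask3.sh reads phi8cy.31 and cycle16.sh links mask3.14). A possible
way to run them in sequence and stop as soon as an expected output is missing
is sketched below. The function run_chain and the STEPS list are assumptions
made for this illustration; the script and output names are taken from the
listings above, and the procedures are assumed to be saved in the current
directory under those names.

```shell
#!/bin/sh
# Hypothetical driver (not part of PHASES).
# Each entry is "script:expected-output-file".
STEPS="mask2.sh:mask2.14 cycle8.sh:phi8cy.31 mask3.sh:mask3.14 cycle16.sh:phi16cy.31 extnd.sh:phiextnd.31 extnda.sh:phiextnda.31"

run_chain() {
    for step in $STEPS
    do
        script=${step%%:*}      # text before the colon
        output=${step#*:}       # text after the colon
        echo "running $script"
        # Stop if the procedure itself reports failure.
        sh "./$script" || { echo "$script failed" >&2; return 1; }
        # Stop if the file needed by the next procedure was not written.
        [ -r "$output" ] || { echo "$script did not produce $output" >&2; return 1; }
    done
    echo "all steps completed"
}
```

Calling run_chain from the working directory runs the six procedures in
order; it returns a nonzero status at the first step that fails or leaves its
output file missing.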