PHASES
                                           
           PHASES is a package of computer programs designed to compute phase 
        angles for diffraction data from macromolecular crystals. The package 
        is complete in that it contains programs for the following: merging 
        and scaling of native and derivative data sets; analyzing difference 
        statistics; computing Patterson and electron density maps; searching 
        for peaks; refining heavy atoms (or protein domains as rigid groups); 
        computing phases by MIR (multiple isomorphous replacement), SIR 
        (single isomorphous replacement), SAS (single wavelength anomalous 
        scattering), SIRAS (single isomorphous replacement supplemented with 
        anomalous scattering), MIRAS (multiple isomorphous replacement 
        supplemented with anomalous scattering) or from atomic coordinates for 
        an input model; noncrystallographic symmetry averaging; combining 
        phases from a partial structure with MIR etc phases; computation and 
        analysis of cross difference or Bijvoet difference Fourier maps; and 
        for phase extension and refinement.
           Once an initial set of phases is generated, programs are included 
        to improve them by carrying out solvent levelling with negative 
        density truncation and/or combination with model based phase 
        information and/or averaging over noncrystallographic symmetry.  
        Solvent levelling is facilitated by the automatic protein-solvent 
        boundary determination method (Wang, in Methods in Enzymology 115, 
        1985) which is implemented here entirely in reciprocal space in a much 
        more efficient manner than in previous programs. If applied to SIR or 
        SAS starting phases, the programs can also carry out the ISIR or ISAS 
        phasing procedures described by Wang. The package
        consists of 5 major programs and many utility programs as follows:
        
            PROGRAM                          FUNCTION           
             
           PHASIT.F            Computes MIR, SIR, MODEL etc phases from
                               input atomic parameters and diffraction
                               data. Can refine heavy atoms or derivative
                               scaling parameters in "phase refinement"
                               mode. 
            
           BNDRY.F             Computes coefficients for automatic
                               boundary determination. Determines
                               protein-solvent boundary mask, flattens
                               solvent and applies negative density
                               truncation, combines phases from external
                               sources (map inversion or from partial
                               structures) with original phase information,
                               extends phases to higher resolution. 
            
           FSFOUR.F            Space group general 3D FFT program for
                               electron density calculations.
            
           MAPINV.F            Space group general 3D FFT program for
                               structure factor calculations.
                   *
           MAPVIEW.F           Interactive contouring/map viewing program.
                               Allows user to view maps and masks, and
                               trace/edit solvent or averaging masks. 

           CTOUR.F              Creates contoured plots from FSFOUR maps,
                                either as individual sections, mono or
                                stereo projections. Plots can be viewed
                                directly or converted to PostScript.
        
           GMAP.F               Extracts region from a FSFOUR map and 
                                creates corresponding maps for the graphics
                                programs TOM, O or CHAIN. Also can create
                                skeleton files for TOM or O.

           MISSNG.F            Selects reflections for phase extension.
           
           MRGDF.F             Generates coefficients for isomorphous
                               difference Fourier or cross Fourier.
            
           MRGBDF.F            Generates coefficients for Bijvoet
                               difference Fourier or cross Fourier.
            
           RD31.F              Converts internal binary file to ASCII
                               for examination and/or editing.
            
           MK31B.F             Restores ASCII version of file to binary.
                   
           PSRCH.F             Searches Fourier map and lists unique peaks.
        
           CMBISO.F            Combines native and derivative isomorphous
                               replacement data into one file and scales
                               the derivative data to the native.
        
           CMBANO.F            Combines native data and derivative
                               anomalous scattering data into one file
                               and scales the derivative data to the native.
        
           TOPDEL.F            Examines isomorphous/anomalous scattering
                               differences, identifies and rejects outliers,
                               prepares file for difference Pattersons.
        
           GREF.F              Refines heavy atom parameters against
                               isomorphous or anomalous scattering
                               differences; refines protein domains,
                               substructures etc as rigid groups against
                               native data.   

           RMHEAVY.F            Temporarily removes density in map from heavy
                                atoms, to aid in accurate solvent mask
                                generation.
        
           IMPORT.F            Allows user to introduce his own phases and
                               Hendrickson-Lattman coefficients (computed
                               by external programs) into the PHASES package
                               for subsequent calculations. This allows one
                               to bypass the PHASIT program. 

           XPL_PHI.F            Creates input reflection file for X-PLOR
                                from a PHASES style phased file.
                    *
           PRECESS.F            Lets one construct and interactively examine
                                "pseudo" precession or "pseudo" difference
                                precession photos made from reflection files.
                    *
           VIEWPLT.F            Displays up to 10 plots created by CTOUR
                                on workstation or X-Window capable monitor.

           PLTTEK.F             Displays plots created by CTOUR on terminals
                                capable of using TEKTRONIX 4010 emulation.

           MKPOST               Converts plots created by CTOUR to PostScript.

           PDB_CDS.F            Converts coordinate files from PDB format
                                to PHASES format, and vice versa.

           EXTRMAP.F            Extracts a region (submap) from the standard
                                FSFOUR map for use in averaging, skewing etc.
        
           EXTRMSK.F            Extracts a region (submap) from the standard
                                solvent mask for possible editing, skewing
                                etc.
        
           MAPAVG.F            Averages one or more maps to impose non-
                               crystallographic symmetry.
        
           MAPORTH.F           Orthogonalizes non-orthogonal map (and
                               optionally mask) for use in refinement of
                               noncrystallographic symmetry operator.
        
           LSQROT.F            Refines purely rotational noncrystallographic
                               symmetry operator against electron density.
        
           LSQROTGEN.F         Refines general noncrystallographic symmetry
                               operator (arbitrary rotational angle, with
                               translation) against electron density.
        
           SKEW.F              Skews a map (and optionally a mask) to
                               a new, and arbitrarily oriented cell.
           
           BLDCEL.F            Rebuilds a complete unit cell map (and
                               optionally, mask) from an input asymmetric
                               unit submap (and optionally, mask).
        
           MDLMSK.F            Creates a mask from coordinates in an input
                               atomic model, for use in averaging, NC
                               symmetry operator refinement or use in
                               solvent flattening.
        
           MRGMSK.F            Merges multiple masks created by MDLMSK
                               into a single mask.
        
           TRNMSK.F            Transforms mask created in a "skewed"
                               cell back to the normal cell.
       
           HNDCHK.F             Interpolates density from a map at
                                specified sites, usually for the purpose
                                of determining the proper hand. 

           SLOEXT.F             Controls number of iterations and rate of
                                phase extension to higher resolution.
 
           RDHEAD.F             Dumps header from averaging map (submap) or
                                mask files for examination. 
       
           O_TO_SP.F            Extracts spherical polar angles and
                                axis location for use in PHASES from
                                rotation matrix/translation vector
                                produced by program "O" 

           PSTATS.F             Tabulates mean phase difference between
                                two phase sets as a function of d spacing.
 

                                   TABLE OF CONTENTS
                                         for
                                   GENERAL PROCEDURES


               REFERENCING THE PHASES PACKAGE .......... 0.00

               GETTING STARTED ......................... 1.00
                 Accessing on line documentation ....... 1.01
                 Template scripts and files ............ 1.02 
                 Flow Charts ........................... 1.03
                 File Formats .......................... 1.04
 
               EXAMPLES ................................ 3.00
                 Pamfile ............................... 3.01
                 Initial phasing ....................... 3.02
                 Solvent levelling ..................... 3.03
                 Doall scripts ......................... 3.04
                 Expected output ....................... 3.05

               NATIVE, DIFFERENCE AND "CALCULATED"
               PATTERSON MAPS .......................... 4.00

               REFINING HEAVY ATOM PARAMETERS .......... 5.00

               HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE
               AND CROSS DIFFERENCE FOURIER MAPS ....... 6.00

               CREATING/EDITING SOLVENT MASKS .......... 7.00

               INCORPORATION OF PARTIAL STRUCTURES ..... 8.00

               REDUCED BIAS NATIVE, COMBINED AND 
               DIFFERENCE FOURIER MAPS ................. 9.00

               INCORPORATION OF NONCRYSTALLOGRAPHIC
               SYMMETRY AVERAGING ..................... 10.00
                 Averaging with Multiple Crystals ..... 10.01
                 Averaging Difference or 2FO-FC Maps .. 10.02
                 Sample Input Files for Averaging ..... 10.03                

               DENSITY MODIFICATION WITH MOLECULAR
               REPLACEMENT DERIVED PHASE INFORMATION .. 11.00 

               PHASE EXTENSION ........................ 12.00
       
               MAD PHASING ............................ 13.00
 
               UNIX SHELL SCRIPTS ..................... 15.00

 
        0.00               REFERENCING THE PHASES PACKAGE 

           When publishing results obtained with the software, include a
        statement such as "all heavy atom refinement, phasing,
        solvent flattening, noncrystallographic symmetry averaging, map
        calculations etc. (or whatever is appropriate) were carried out with
        the PHASES package (Furey & Swaminathan, 1995)." This refers to the
        following paper:

        "PHASES-95: A Program Package for the Processing and Analysis of
        Diffraction Data from Macromolecules", W. Furey & S. Swaminathan,
        in MACROMOLECULAR CRYSTALLOGRAPHY, a volume of Methods in Enzymology,
        eds. C. Carter & R. Sweet, Academic Press, Orlando, Fl. (1996), in
        press.


        1.00                      GETTING STARTED
        
           The first thing to do is to prepare an input parameter file 
        specifying the cell constants, symmetry information etc. This file is 
        referred to as the "standard parameter file" throughout the PHASES 
        package, and is often called "PAMFIL" generically in specific program 
        writeups. One should select a name for it which is indicative of the 
        particular structure being worked on, and rapidly communicates to the 
        user that it is a parameter file. For example, PDC.PAM might be a good 
        choice for phasing pyruvate decarboxylase. The main purpose of this 
        file is to ensure consistency in cell constants, symmetry, lattice 
        type etc throughout all programs, and to eliminate redundant input of 
        these parameters by the user. In addition one can optionally specify 
        the name of a "running log file." If this is done then in addition to 
        normal output to either the screen or individual log files for each 
        program, a copy of all printed output is also appended to a single
        file, preceded by a time stamp indicating what program was run and
        when. Thus one can maintain a complete history of all computations
        and results in a single log file.
        
           Each standard parameter file should contain the following 
        information in the indicated sequence.
        
                                          
        LOGFILE=FILENAME               Where FILENAME is the name of the
                                       desired "running" log file. If no
                                       cumulative log is desired, enter
                                       LOGFILE=NULL
                                       There must be no spaces immediately
                                       preceding or following the "=". Upper
                                       or lower case is permitted.
        
        
        LATTICE=X                      Where "X" is either P,A,B,C,I,F or R
                                       There must be no spaces immediately
                                       preceding or following the "=". Upper
                                       or lower case is permitted for the word
                                       LATTICE, but only UPPER case for the
                                       single character symbol.
                                      
        
                                                        
        A, B, C, ALPHA, BETA, GAMMA    Unit cell constants, in angstroms and
                                       degrees. Readable in free format, i.e.
                                       at least one blank or comma separating
                                       entries.
                                               
              
        NSYM                           Number of equivalent positions in
                                       the space group. Do NOT include 
                                       additional translations associated
                                       with centering conditions for 
                                       non-primitive lattices, e.g. for
                                       space group C2, NSYM=2. (This entry
                                       is read in free format.)
        
         
           The NSYM symmetry operators follow, one operator per line, EXACTLY 
        as indicated in the International Tables for X-Ray Crystallography. 
        The first operator should ALWAYS be X,Y,Z. Note that for rhombohedral
        lattices the HEXAGONAL CELL AND SYMMETRY OPERATORS SHOULD BE USED,
        along with the lattice type R.
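
           As an illustration of the operator notation described above, the
        following Python sketch converts one operator line into a rotation
        matrix and a translation vector. It is not part of PHASES: the
        function name parse_symop is hypothetical, and the sketch assumes
        each component is a sum of simple terms such as X, -Y or 1/2.

```python
from fractions import Fraction


def parse_symop(text):
    """Parse one operator line such as '1/2-X,-Y,1/2+Z'.

    Returns (rot, trans): rot is a 3x3 list of ints and trans is a list
    of three Fractions. Hypothetical helper, not a PHASES routine.
    """
    axes = {"X": 0, "Y": 1, "Z": 2}
    parts = text.upper().replace(" ", "").split(",")
    if len(parts) != 3:
        raise ValueError("expected three comma-separated components: %r" % text)
    rot = [[0, 0, 0] for _ in range(3)]
    trans = [Fraction(0), Fraction(0), Fraction(0)]
    for row, comp in enumerate(parts):
        # Split the component into signed terms, e.g. '1/2-X' -> ['1/2', '-X'].
        terms, current = [], ""
        for ch in comp:
            if ch in "+-" and current:
                terms.append(current)
                current = ch
            else:
                current += ch
        if current:
            terms.append(current)
        for term in terms:
            sign = -1 if term.startswith("-") else 1
            body = term.lstrip("+-")
            if body in axes:
                rot[row][axes[body]] += sign          # rotational part
            else:
                trans[row] += sign * Fraction(body)   # translational part
    return rot, trans


rot, trans = parse_symop("1/2-X,-Y,1/2+Z")
```

        Exact fractions are used for the translations so that values such
        as 1/3 and 2/3 in hexagonal settings are not rounded.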
        
        
           The following sample serves as a complete template for a parameter
        file, for space group P2(1)2(1)2(1):
        
        LOGFILE=seb.rlog
        LATTICE=P
        45.331 68.33 79.62 90. 90. 90.
        4
        X,Y,Z
        1/2-X,-Y,1/2+Z
        1/2+X,1/2-Y,-Z
        -X,1/2+Y,1/2-Z
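
           To make the required sequence of entries concrete, the following
        Python sketch reads a parameter file laid out like the template
        above and applies the rules stated in this section. It is only an
        illustration and is not how the PHASES programs read the file; the
        function name read_pamfile and the returned dictionary keys are
        hypothetical, and skipping blank lines is a convenience of the
        sketch.

```python
import re

LATTICES = set("PABCIFR")


def read_pamfile(lines):
    """Parse a standard parameter file given as an iterable of text lines."""
    # Skipping blank lines is a convenience of this sketch only.
    rows = [ln.strip() for ln in lines if ln.strip()]
    if len(rows) < 5:
        raise ValueError("parameter file is too short")

    def keyword(row, name):
        key, sep, value = row.partition("=")
        # No spaces are allowed immediately before or after the "=".
        if not sep or key.upper() != name or not value or value != value.strip():
            raise ValueError("expected %s=value, got %r" % (name, row))
        return value

    logfile = keyword(rows[0], "LOGFILE")
    lattice = keyword(rows[1], "LATTICE")
    if lattice not in LATTICES:  # the lattice symbol must be upper case
        raise ValueError("bad lattice symbol %r" % lattice)

    # Free format: entries separated by blanks or commas.
    cell = [float(v) for v in re.split(r"[,\s]+", rows[2]) if v]
    if len(cell) != 6:
        raise ValueError("expected six unit cell constants")

    nsym = int(rows[3])
    operators = rows[4:4 + nsym]
    if nsym < 1 or len(operators) != nsym:
        raise ValueError("expected %d symmetry operators" % nsym)
    if operators[0].upper().replace(" ", "") != "X,Y,Z":
        raise ValueError("the first operator should always be X,Y,Z")

    return {
        "logfile": None if logfile.upper() == "NULL" else logfile,
        "lattice": lattice,
        "cell": cell,
        "nsym": nsym,
        "operators": operators,
    }


template = """LOGFILE=seb.rlog
LATTICE=P
45.331 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
""".splitlines()

params = read_pamfile(template)
```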
        
           Once a suitable parameter file is created, the phasing process can 
        begin.
                
           One starts phasing by preparing one or more "scaled" or "merged"
        files containing x-ray diffraction data. The files will vary
        depending on whether isomorphous replacement or anomalous scattering
        data is to be used for phasing. Each file should be ASCII (read in
        free format) with all records containing the same type of information.
        Each record should contain
        
        H, K, L, FP, Sig(FP), FPH, Sig(FPH)    for isomorphous replacement
            
                         or
            
        H, K, L, F+, Sig(F+), F-, Sig(F-)      for native anomalous scattering 
            
                         or
        
        H, K, L, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
        
                                           for derivative anomalous scattering
                                                                      
            where 
        
        H, K, L    =  Miller indices (integers).
            
        FP, FPH    =  Native and Derivative structure factor amplitudes
            
        F+, F-     =  Structure factor amplitudes for a reflection and its
                      Friedel mate. F+ corresponds to indices H, K, L; F-
                      to -H, -K, -L.
            
        FPH+, FPH- =  Derivative structure factor amplitudes. FPH+ corresponds
                      to indices H, K, L, FPH- to -H, -K, -L.
        
        Sig(X)     =  Estimated standard deviation for quantity X. 
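
           The three record layouts listed above can be sketched in Python
        as follows. This is an illustration only, not code from PHASES.
        Because the isomorphous and native anomalous layouts both have
        seven entries, the caller must state which kind of file is being
        read. The names FIELDS and parse_record and the kind labels are
        hypothetical, and "free format" is assumed to mean entries
        separated by blanks or commas.

```python
import re

# Column names after H, K, L for each kind of scaled/merged file.
FIELDS = {
    "iso": ("FP", "SIGFP", "FPH", "SIGFPH"),
    "native_ano": ("F+", "SIGF+", "F-", "SIGF-"),
    "deriv_ano": ("FP", "SIGFP", "FPH+", "SIGFPH+", "FPH-", "SIGFPH-"),
}


def parse_record(line, kind):
    """Split one record into ((h, k, l), {name: value})."""
    names = FIELDS[kind]
    tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
    if len(tokens) != 3 + len(names):
        raise ValueError(
            "expected %d entries for %s, got %d"
            % (3 + len(names), kind, len(tokens))
        )
    h, k, l = (int(t) for t in tokens[:3])      # Miller indices are integers
    values = [float(t) for t in tokens[3:]]     # amplitudes and sigmas
    return (h, k, l), dict(zip(names, values))


hkl, data = parse_record("1 2 3 100.0 2.5 110.0 3.0", "iso")
```

        Checking the entry count per record catches a file of one kind
        being read as another when the layouts differ in length.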
        
           A separate file should be prepared for each derivative/anomalous 
        scattering data set. For isomorphous replacement and derivative 
        anomalous scattering data the FPH values should have already been 
        properly scaled to the FP values. If more than one data set is to be 
        used for phasing, then ALL F VALUES SHOULD BE ON THE SAME SCALE. 
        Indeed, for MIR phasing it is best to keep corresponding FP values 
        IDENTICAL in each data set. The "scaled/merged" files are usually
        prepared by the programs CMBISO or CMBANO and are generally given
        filenames ending in ".scl", but they can also be generated externally
        by the user. It is always desirable, however, to use the ".scl"
        ending as some of the programs in PHASES will deduce the file format
        from the ending of the filename. Once these files are prepared, they
        can be used to create difference Pattersons to identify heavy atom
        sites. A control file containing heavy atom parameters (either for
        the derivative, or anomalous scatterers) must then be prepared, and
        GREF or PHASIT can be run. If PHASIT is simply used to compute
        structure factors from a model, then the ".scl" files are not needed,
        but reflection and coordinate files must still be supplied. One
        can use the phase file output from PHASIT directly to compute an
        electron density map with FSFOUR, or the file can be used with
        programs BNDRY, FSFOUR and MAPINV to carry out solvent levelling,
        negative density truncation, phase extension and phase combination
        analogous to Wang's ISIR procedure. If the latter is selected, then
        the file output from PHASIT should be named "phasit.31".
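
           The recommendation above to keep corresponding FP values
        identical in each data set for MIR phasing can be checked before
        phasing with a small script. The following Python sketch is an
        assumption about how one might do this and is not part of PHASES.
        It takes two dictionaries mapping (H, K, L) to FP, however they
        were read, and the function name fp_mismatches is hypothetical.

```python
def fp_mismatches(set_a, set_b, tol=0.0):
    """List reflections common to both data sets whose FP values differ.

    set_a and set_b map (h, k, l) tuples to native amplitudes FP.
    Returns a sorted list of (hkl, fp_a, fp_b) for differences above tol.
    """
    common = sorted(set(set_a) & set(set_b))
    return [
        (hkl, set_a[hkl], set_b[hkl])
        for hkl in common
        if abs(set_a[hkl] - set_b[hkl]) > tol
    ]


deriv1 = {(1, 0, 0): 100.0, (0, 1, 0): 50.0}
deriv2 = {(1, 0, 0): 100.0, (0, 1, 0): 51.0, (0, 0, 1): 5.0}
problems = fp_mismatches(deriv1, deriv2)
```

        Reflections present in only one data set are ignored, since they
        have no counterpart to compare against.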
        
           In general, programs CMBISO and/or CMBANO are used to prepare all 
        reflection data files. Then TOPDEL is run to reject outliers and 
        select data for difference Patterson calculations to be performed by 
        FSFOUR. The Patterson map can be interactively contoured and examined
        in MAPVIEW, searched for peaks in PSRCH, or contoured to generate
        hard copies as PostScript files with CTOUR and MKPOST. Once heavy atom
        locations are identified, they can be refined by GREF or PHASIT. The
        heavy atom parameters and data are then used in PHASIT to compute SIR,
        MIR phases etc. Next MISSNG is run followed by the solvent
        flattening/negative density truncation/phase extension iterations
        carried out by BNDRY and invoked by the procedure DOALL. If more than
        one derivative is needed, or one wants to search for additional heavy
        atom sites, programs MRGDF and/or MRGBDF can then be used to create 
        difference or cross difference coefficients, FSFOUR computes the map,
        and MAPVIEW, PSRCH or CTOUR are used to identify peaks again. One can
        also use the difference coefficients files produced by PHASIT to
        compute "double difference" type maps to search for minor sites. The
        new heavy atom parameters are then included in PHASIT, and the process
        is repeated. This procedure can be cycled over as many derivatives
        or data sets as needed. As a final step, it is often useful to hold
        the "solvent flattened" phases fixed in PHASIT, and refine the heavy
        atom parameters again. This final set of heavy atom parameters is then
        used to compute final MIR etc phases in PHASIT, which are then used to 
        start a final round of solvent flattening. The final map resulting 
        from these phases can be interactively contoured and examined in
        MAPVIEW, converted to graphics map format (e.g. for TOM, O or CHAIN)
        and skeletonized by GMAP, or hard copies can be prepared by CTOUR and
        MKPOST. 

        1.01              ACCESSING ON LINE DOCUMENTATION

           The complete PHASES manual (what you are reading now) is maintained
        online in the file PHASES.WUP. This file generally resides in the top
        level of the PHASES directory, which initially is a subdirectory under
        "export" (on UNIX systems), but its location may vary depending on how
        one installs the software. On OpenVMS systems it can be accessed by
        referring to PHASES_DOC:PHASES.WUP (if one installs the software as
        described later). It is recommended that each user make a copy of the
        manual in his own working directory so it can be examined without
        fear of destroying the original. The manual is a simple ASCII text 
        file and can be examined in the editor of your choice. All program
        write-ups begin with the program name followed by a single space and
        then by the word "WRITE-UP" (all in uppercase), so that, for example,
        to get to the write-up for program FSFOUR one can simply enter an
        editor and search for "FSFOUR WRITE-UP" or just "FSFOUR W". This will
        position the editor at the appropriate place in the manual. Just be
        sure to exit the editor without making any changes. Indeed, it may be
        desirable to set the file protection so that it can be read but not
        written.
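
           The search described above can also be done without an editor,
        which avoids any risk of changing the manual. The following Python
        sketch is an illustration only; the function name find_writeup is
        hypothetical, and it relies solely on the "WRITE-UP" convention
        stated in this section.

```python
def find_writeup(lines, program):
    """Return the 1-based number of the first line containing
    '<PROGRAM> WRITE-UP', or None if no such line exists."""
    target = program.upper() + " WRITE-UP"
    for number, line in enumerate(lines, start=1):
        if target in line:
            return number
    return None


# Small stand-in for the manual text, used here instead of the real file.
manual = [
    "GENERAL PROCEDURES",
    "",
    "FSFOUR WRITE-UP",
    "Space group general 3D FFT program.",
]
line_number = find_writeup(manual, "fsfour")
```

        With the real manual one would pass the lines of a file opened
        for reading only, for example open("PHASES.WUP") in the directory
        holding the copy.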

        1.02                 TEMPLATE SCRIPTS AND FILES 

          Included with the PHASES distribution are a series of sample control
        files (*.sh or *.com files) as well as sample input data (*.d or *.dat
        files). As initially distributed, these files reside in the top level
        of the PHASES directory (itself a subdirectory under "export" on UNIX
        systems, or in PHASES_TEMPL, if installed as suggested in the "VMS
        USER INFORMATION" section). The "*.sh" files are UNIX shell scripts
        to invoke one or more programs, while the "*.com" files accomplish the
        same tasks under OpenVMS. Similarly, the "*.d" and "*.dat" files are
        sample data inputs for programs under the UNIX and OpenVMS operating
        systems, respectively. Generally the "*.d" and "*.dat" files are
        identical. It is suggested that each user copy these files to his
        working directory to serve as templates for new applications. This
        will minimize the possibility of typing errors, and also serve as an
        example for a particular calculation. Indeed, it may be desirable to
        open two windows, one editing the template file and the other 
        positioned to examine the appropriate write-up as described in the
        preceding section.
            
        1.03                        FLOW CHARTS



                          Native file         Derivative file
                              .                      .
                              .                      .
                              v                      v
                        ************************************
                        *        CMBISO  or CMBANO         *
                        ************************************
                                         .
                                         .
                                         .
                               "Scaled/Merged" file
                                         .
                                         .
                                         v
                                 *****************
                                 *    TOPDEL     *
                                 *****************
                                         .
                                         .
                                    "Patt" file
                                         .
                                         .
                                         v
                                 *****************
                                 *     FSFOUR    *
                                 *****************
                                         .
                                         .
                                    "Map" file
                                         .
                                         .
                                         v
                ....................................................
                .                        .                         .
                .                        .                         .
                v                        v                         v
         ****************        ******************          *************
         *     PSRCH    *        *     MAPVIEW    *          *   CTOUR   *
         ****************        ******************          *************
    
 

          Path for initial processing of a derivative data set, including
        merging and scaling native and derivative data, rejecting outliers,
        and computing and examining difference Patterson maps.




           "Scaled/Merged" file(s)                         
                      .                                  
                      .
                      v                         
               ***************                       "Phased" file,
               *   PHASIT    *                       from BNDRY after
               ***************                       solvent flattening
                  .        .                                 .
                  .        .                                 .     
                  .   "Phased" file                          .
                  .        .                                 .
                  .        .                                 .
                  .        v                                 v
                  .        ...................................
                  .                        .
                  .                        .           
                  .                        v
                  .              *********************
                  .              *  MRGDF or MRGBDF  *X-- "Scaled/Merged
                  .              *********************        file" 
           "difference file"               .
                  .                        .
                  .                "Cross phase" file
                  .                        .
                  .                        .
                  .                        v
                  .              *********************
                  ..............X*      FSFOUR       *
                                 *********************
                                           .
                                           .
                                      "Map" file
                                           .
                                           .
                                           v
                  ....................................................
                  .                        .                         .
                  .                        .                         .
                  v                        v                         v
           ****************        ******************          *************
           *     PSRCH    *        *     MAPVIEW    *          *   CTOUR   *
           ****************        ******************          *************



           Paths for generating and examining "cross difference" Fourier, 
           "cross Bijvoet difference" Fourier, "double difference" Fourier
           or "double Bijvoet difference" Fourier, started either by
           generating SIR, MIR etc phases, or using "solvent flattened"
           phases.






                                                             Native file
                                                                  .
                                                                  .
                                                                  v
                       *****************   "Phased"        ***************
             ----------*     PHASIT    * -----------------X*    MISSNG   *
            .          *****************     file   .      ***************
            .                                       .             .
            .                                       .             .
            .                                       .      "Extension" file
            .                                       .             .
            .                                       .             .
         Partial                                    .             .
         structure     *****************            .             .
          file         *    FSFOUR     *X------     .             .
            .          *****************       .    .             .
            .               .     ^            -----              .
            .               .     .            .                  .
            .               .     .    --------                   .  
            .               .     .   .                           .
            .               .     .   .                           .
            .               v     .   v                           .
            .          *****************                          .
             ---------X*     BNDRY     *X-------------------------   
                       *****************
                            .     ^
                            .     .
                            .     .
                            v     .
                       *****************
                       *    MAPINV     *
                       *****************     




          Path for solvent flattening process, as implemented in the "doall"
        procedure. Starts with SIR, MIR etc phases and includes mask
        generation, solvent flattening and phase combination iterations. The
        leftmost and rightmost branches are optional, for inclusion of partial
        structure information and phase extension, respectively. The FSFOUR-
        BNDRY-MAPINV loop performs the iterations. The PHASIT output
        is fed directly to FSFOUR only during the initial pass, to generate
        the first map. In all passes it is fed to BNDRY to serve as the
        "anchor" phases in the phase combination step. 











                           ************             *************
                           *  FSFOUR  *------------X*  EXTRMAP  *
                           ************             *************
                               ^  ^                       .
                               .  .                       .
                               .  .                       v
          **************       .  .                 *************
          *  PHASIT    *-------   .                 *   MAPAVG  *X----
          **************       .  .                 *************     .
                               .  .                       .           .
                               .  .                       .       "Envelope"
                               .  .                       .          Mask
                               v  .                       v           .
                           ************             *************     .
          "Extension"-----X*  BNDRY   *X------------*   BLDCEL  *X----
             file          ************             *************
                              ^    .
                              .    .
                              .    v
                           ************
                           *  MAPINV  *
                           ************   









          Path followed during solvent flattening iterations modified to 
        include noncrystallographic symmetry averaging. The PHASIT output
        is fed directly to FSFOUR only during the initial pass, to generate
        the first map. In all passes it is fed to BNDRY to serve as the
        "anchor" phases in the phase combination step. The "extension" file
        is optional, and is used for phase extension only. 

        1.04                       FILE FORMATS

          Most of the programs in the PHASES package utilize the same internal
        file formats, chosen for a combination of simplicity and efficiency.
        The major files used are now described.


        1) "Input" files. Entering data initially into the package assumes one
        can prepare reflection files either in free format, as XENGEN-like
        "MULISTS", or as "SCALEPACK" style files. Thus input structure factor
        files can have any of the following record formats.

          FREE FORMAT i.e.

          h, k, l, F, sig(F)    (ASCII, read in free format)

        The free format input file is generally assumed in the programs if
        the filename ends in ".DAT" or ".dat", and sometimes will be assumed
        if no other file type is deduced from the ending of the filename. The
        "free format" implies that the values in each record are separated by
        at least one blank space or a comma. 

                              or

          XENGEN like "MULIST" i.e.

          h, k, l, res, F, sig(F), F+, sig(F+), F-, sig(F-), iflag

          in format ( 3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2 )
          
          The "iflag" status flag is optional. If present, it will be
          used to screen for viable anomalous scattering data. If absent,
          only values with F+ and F- greater than zero will be used when
          anomalous data is needed. MULIST format is generally assumed
          within the programs if the filename ends in ".MU" or ".mu". 

                              or

          SCALEPACK style files i.e. the file starts with a variable
          number of header records, the total number of which is given by
          one plus 2 times the number given in the first header record
          (format I5). After the header records (usually 3) the data
          follows as individual records containing 

          h, k, l, I+, sig(I+), I-, sig(I-)

          in format ( 3I4, 4F8.0 )

          Note that unlike the other formats, for SCALEPACK files
          INTENSITIES and their standard deviations are given instead
          of AMPLITUDES. Also, the files need not contain Bijvoet pair
          data as the last two items may be missing, as would be the
          case if the data were reduced treating Friedel mates as
          equivalent. SCALEPACK format is generally assumed within the
          programs if the filename ends in ".SCA" or ".sca". 
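
        As an illustration only, the three input record layouts above can
        be read with a short Python sketch. It is not part of PHASES, and
        all function names are hypothetical. The free-format reader treats
        any run of blanks and/or commas as one separator, which is simpler
        than true FORTRAN list-directed input, and the fixed-format readers
        assume every real field carries an explicit decimal point.

```python
import re

def parse_free_format(record):
    """Return (h, k, l, F, sigF) from one free-format record."""
    # Values are separated by at least one blank or a comma.
    fields = [f for f in re.split(r"[,\s]+", record.strip()) if f]
    h, k, l = (int(f) for f in fields[:3])
    f_obs, sig_f = (float(f) for f in fields[3:5])
    return h, k, l, f_obs, sig_f

def parse_mulist(record):
    """Return (h, k, l, res, F, sigF, F+, sigF+, F-, sigF-, iflag)."""
    # Layout: 3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2
    h, k, l = (int(record[i:i + 4]) for i in (0, 4, 8))
    res = float(record[13:19])
    values = []
    pos = 19
    for _ in range(6):
        pos += 1                      # skip the 1X blank
        values.append(float(record[pos:pos + 8]))
        pos += 8
    flag_text = record[pos + 1:pos + 3].strip()
    iflag = int(flag_text) if flag_text else None   # iflag is optional
    return (h, k, l, res, *values, iflag)

def read_scalepack(lines):
    """Return (n_header, records) from the lines of a SCALEPACK file."""
    # Header length is one plus twice the number in the first record (I5).
    n_header = 1 + 2 * int(lines[0][:5])
    records = []
    for line in lines[n_header:]:
        if not line.strip():
            continue
        h, k, l = (int(line[i:i + 4]) for i in (0, 4, 8))
        # Four F8.0 fields; the last two (I-, sig(I-)) may be missing.
        fields = [line[12 + 8 * j:20 + 8 * j].strip() for j in range(4)]
        values = [float(f) if f else None for f in fields]
        records.append((h, k, l, *values))
    return n_header, records
```

        Note that read_scalepack returns INTENSITIES exactly as stored; no
        conversion to amplitudes is attempted in this sketch.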

          

        2) "Scaled" (and merged) structure factor files.
        These files are produced by CMBISO or CMBANO, starting with input
        files of type (1). The files are ASCII, with each record containing

           h, k, l, FP, sig(FP), FPH, sig(FPH)

                         or
       
           h, k, l, FP+, sig(FP+), FP-, sig(FP-)

                         or
   
           h, k, l, FP, sig(FP), FPH+, sig(FPH+), FPH-, sig(FPH-)

           in format ( 3I4, 6F10.2)
 
           for either isomorphous, native anomalous or derivative anomalous
           data sets, respectively. SCALED files are generally assumed
           within the programs if the filename ends in ".SCL" or ".scl". 



        3) "Phased" structure factor files. These files are produced mainly
        by PHASIT and BNDRY, but can also be generated by other programs.
        There are two types of "phased" files, depending on whether or not
        probability distributions are available. Both types of files are
        BINARY, but can be converted to ASCII by the utility program RD31.

        The first type, the normal or "long" format has records containing

           h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM
         
        where h, k, l, A_B, C_D and MK are INTEGERS, and the others REALS.
        The Hendrickson-Lattman probability distribution coefficients are
        packed two per word, in the A_B and C_D entries according to
          
           A_B = ( IFIX(A*100) + 16384 )*32768 + IFIX(B*100) + 16384

           C_D = ( IFIX(C*100) + 16384 )*32768 + IFIX(D*100) + 16384

        FO is the observed protein structure factor amplitude, PHIbest
        the "best" (centroid) phase in degrees, and FOM the associated figure
        of merit. MK is the restricted phase indicator, such that if MK=1 there
        are no restrictions. If MK > 1, then the reflection is centric, with
        one of the allowed phases given by 15*(MK-1), and the other 180
        degrees away from it.
          There is also an alternate version of the "long format" phase file,
        obtained only from running option 3 of BNDRY with IOTYP=1, which
        has FO and FC replacing FOM*FO and FO in the records. This file type
        is used ONLY if one wants to do solvent flattening and/or NC symmetry
        averaging iterations on DIFFERENCE or 2FO-FC MAPS. Its usage is
        explained elsewhere in the documentation.


        The second type, or "short" format has records containing   

           h, k, l, FO, FC, PHI          

        where FC is the "calculated" structure factor amplitude, as typically
        computed from input coordinates for a model in PHASIT, GREF, or output
        from the map inversion program MAPINV.

        Note that the Fourier program FSFOUR only reads the first six entries
        in a record, so that in general, EITHER type of "phased" file can be
        used for map calculations. However, some map types might be accessible
        with only one of the formats (e.g. difference maps). Both long and
        short format PHASED files are generally recognized within the programs
        if the filename ends in ".31".
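
        The packing of the Hendrickson-Lattman coefficients and the meaning
        of MK in the "long" format can be checked with the following Python
        sketch. It is an illustration, not code from the package; the
        function names are hypothetical, and wrapping the second centric
        phase into the range 0 to 360 degrees is a convention chosen here.

```python
OFFSET = 16384   # added to each scaled coefficient before packing
SHIFT = 32768    # multiplier applied to the first coefficient of a pair

def pack_pair(first, second):
    """Pack two coefficients following the A_B and C_D formulas."""
    # int() truncates toward zero, as FORTRAN IFIX does.
    return (int(first * 100) + OFFSET) * SHIFT + int(second * 100) + OFFSET

def unpack_pair(word):
    """Recover the two coefficients (to the nearest 0.01) from a word."""
    # Valid only when each scaled coefficient lies in -16384..16383.
    high, low = divmod(word, SHIFT)
    return (high - OFFSET) / 100, (low - OFFSET) / 100

def allowed_phases(mk):
    """Return None if unrestricted (MK=1), else the two centric phases."""
    if mk < 1:
        raise ValueError("MK must be 1 or greater")
    if mk == 1:
        return None
    first = 15 * (mk - 1)
    return first, (first + 180) % 360
```

        For example, pack_pair(1.25, -0.5) gives 540983246, unpack_pair of
        that word returns (1.25, -0.5), and allowed_phases(7) returns
        (90, 270).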



        4) "Mask" files. These files are binary, with the same record format
        applying both to "solvent masks" and "averaging masks." The file
        starts with a header record containing
        
             A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
        
        with the first 6 values REAL*4, the next 9 INTEGER*4, the lengths in
        Angstroms and the angles in degrees.
        
        NX = 
              Number of grid points defining one "cell length" along
        NY =  respective axis. Implicitly defines grid spacing as 
              del x = A/NX, del y = B/NY and del z = C/NZ
        NZ =
        
        IXMN, IXMX =  
                     Minimum, maximum grid index defining map region such
        IYMN, IYMX = that  x (fractional) = IX * (del x) / A  etc. 
                     There are no restrictions on magnitudes or signs.
        IZMN, IZMX = 
        
        The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
        each containing one row (IXMX-IXMN+1 BYTE values) along X,
        starting at IXMN. Y is slowest varying, i.e. the file could have
        been created with the following FORTRAN code:
        
               DO 30 IY=IYMN,IYMX
                       DO 20 IZ=IZMN,IZMX
        20     WRITE(LU)(MSK(IX,IY,IZ),IX=IXMN,IXMX)
        30     CONTINUE
        
        Note that the mask entries are FORTRAN type BYTE (INTEGER*1).  
        For solvent masks, the entries will either be 0 (protein) or 2
        (solvent). For averaging masks only the values 0, 10, 20, 30, 40
        etc are meaningful as they indicate the grid point is inside the
        primary envelope for molecules 1, 2, etc. The masks can be displayed
        with program MAPVIEW, and program RDHEAD can be used to list the
        header record. 
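
        The arithmetic implied by the mask header can be sketched in Python
        as follows. This is an illustration under stated assumptions: the
        header is taken to be already available as a dictionary keyed by
        the names above, the function names are hypothetical, and reading
        the binary file itself is not shown, since FORTRAN unformatted
        records normally carry compiler-dependent record markers.

```python
def mask_layout(header):
    """Return (grid spacing, row length, record count) for a mask."""
    spacing = (header["A"] / header["NX"],
               header["B"] / header["NY"],
               header["C"] / header["NZ"])
    row_length = header["IXMX"] - header["IXMN"] + 1
    n_rows_z = header["IZMX"] - header["IZMN"] + 1
    n_records = (header["IYMX"] - header["IYMN"] + 1) * n_rows_z
    return spacing, row_length, n_records

def record_number(header, iy, iz):
    """0-based index of the record holding row (iy, iz); Y is slowest."""
    n_rows_z = header["IZMX"] - header["IZMN"] + 1
    return (iy - header["IYMN"]) * n_rows_z + (iz - header["IZMN"])

def fractional_x(header, ix):
    """Fractional x of grid index ix: IX * (A/NX) / A, i.e. IX / NX."""
    return ix / header["NX"]
```

        With NX, NY, NZ = 48, 72, 80 and a region running from IX=-12 to
        35, IY=0 to 71 and IZ=0 to 79, each record holds 48 BYTE values,
        5760 records follow the header, and IX=-12 corresponds to
        x = -0.25.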



        5) "FSFOUR" maps. These maps are produced by FSFOUR (and BLDCEL).
        They are binary, and contain a variable number of header records
        followed by the map. The map ALWAYS covers one full cell. See FSFOUR
        write-up (and possibly examine the program source) for further
        details.



        6) "Submaps" Also referred to as "averaging" maps. These map files
        are binary, with the same header and record structure as "mask" files,
        except that the density values are written as FORTRAN type REAL instead
        of mask values. They are usually prepared by MAPVIEW or EXTRMAP, but
        can be generated by MAPORTH, SKEW, TRNMSK etc. Note that RDHEAD can
        also be used to list the header record.



        7) "Extension" file. Used for phase extension, and created by program
        MISSNG. This file is ASCII, and contains a list of reflection indices,
        Fobs and phase probability distribution coefficients, for reflections
        absent on the main "phased" file, but for which native amplitudes and
        possibly phase probability distribution coefficients are available.
        It is used only for phase extension. The records simply contain

        h, k, l, Fobs, A_B, C_D

        in format ( 3I4, F10.2, 2I12 ) where the distribution coefficients
        are packed as in a normal phased file. If no distribution coefficients
        are available the A_B and C_D values are zero.
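
        A record of the "extension" file can be read as in the Python
        sketch below (an illustration only; the function name is
        hypothetical). Because zero marks absent coefficients, the A_B and
        C_D entries must not be unpacked as ordinary packed words when both
        are zero.

```python
def parse_extension_record(line):
    """Return (h, k, l, Fobs, A_B, C_D) from a ( 3I4, F10.2, 2I12 ) record."""
    h, k, l = (int(line[i:i + 4]) for i in (0, 4, 8))
    f_obs = float(line[12:22])
    a_b, c_d = int(line[22:34]), int(line[34:46])
    # Zero in both slots means no distribution coefficients are available.
    if a_b == 0 and c_d == 0:
        a_b = c_d = None
    return h, k, l, f_obs, a_b, c_d
```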

        3.00                           EXAMPLES

          This section contains samples of input for various programs and
        procedures. In general, template files containing these examples
        are also provided along with the programs on the distribution media.
        In some cases (for example, solvent levelling), some practical
        considerations are also discussed.



   
        3.01           *****  SAMPLE INPUT PARAMETER FILE *****
        
        
        LOGFILE=seb.log              
        LATTICE=P
        45.33 68.33 79.62 90. 90. 90.
        4
        X,Y,Z
        1/2-X,-Y,1/2+Z
        1/2+X,1/2-Y,-Z
        -X,1/2+Y,1/2-Z

            
        3.02          *****  SAMPLE INPUT DECKS FOR PHASIT  *****
        
        EXAMPLE I   
                                              
        The deck below will compute SIR phases from a single isomorphous
        replacement derivative data set.  The resulting phase file can
        then be used in the procedure DOALL to carry out Wang's ISIR
        process, or in MRGDF or MRGBDF to solve new derivatives or look
        for additional sites. The "difference coefficients" file can
        be used to compute "observed" and "calculated" difference
        Pattersons, heavy atom difference maps or heavy atom "double
        difference" maps to find new sites.
        
        The following data is assumed to be in a file called    phasit.d
        
        seb.pam
        0 0
        1 0 1
        phasit.31
        DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
        monopt.scl
        pt_iso_diff.31
        4. 6. 0 1. 0. 0. 0. 0. 0. 0. 
        3
               PT1     0.2539    0.1918    0.1376    35.       1.000         6
               PT2     0.1754    0.0439    0.4578    20.       0.664         6
               PT3     0.5474    0.0523    0.6964    50.       0.450         6
        
        
        
        EXAMPLE II
        
        The deck below can be used to compute SIRAS phases from isomorphous and
        anomalous scattering data from a single derivative. The resulting file 
        can then be used directly for map computation; used in the procedure 
        DOALL to carry out solvent flattening/negative density truncation, 
        phase extension etc, starting with (and tying to) the SIRAS phases; or 
        in MRGDF or MRGBDF to solve new derivatives or look for additional 
        sites. The "difference coefficients" files can be used to compute
        "observed" and "calculated" difference Pattersons, heavy atom 
        difference maps or heavy atom "double difference" maps to find new
        sites.
        
        The following data is assumed to be in a file called    phasit.d
        
        seb.pam
        0 0
        2 0 1
        phasit.31
        DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
        monopt.scl
        pt_iso_diff.31
        4. 6. 0 1. 0. 0. 0. 0. 0. 0.
        3
               PT1     0.2539    0.1918    0.1376    35.       1.000         6
               PT2     0.1754    0.0439    0.4578    20.       0.664         6
               PT3     0.5474    0.0523    0.6964    50.       0.450         6
        DIAMINO DICHLORO PT (DERIVATIVE ANOMALOUS DISPERSION DATA )
        monoptano.scl
        pt_ano_diff.31
        4. 6. 2 1. 0. 0. 0. 0. 0. 0.
        3
               PT1     0.2539    0.1918    0.1376    35.       1.000         6
               PT2     0.1754    0.0439    0.4578    20.       0.664         6
               PT3     0.5474    0.0523    0.6964    50.       0.450         6
        
        
        
        EXAMPLE III
        
        The deck below assumes isomorphous replacement data is available for
        two derivatives, and 5 passes of phase refinement, each consisting
        of 3 cycles for each derivative will be done to refine nearly all 
        possible derivative parameters (except B's), i.e. MIR phases will be 
        computed and refined. The resulting file can then be used directly for 
        map computation; used in the procedure DOALL to carry out solvent 
        flattening/negative density truncation, phase extension etc, starting 
        with (and tying to) the MIR phases; or in MRGDF or MRGBDF to solve new 
        derivatives or look for additional sites. The "difference 
        coefficients" file can be used to compute "observed" and "calculated"
        difference Pattersons, heavy atom difference maps or heavy atom
        "double difference" maps to find new sites.
        
        The following data is assumed to be in a file called    phasit.d
        
        seb.pam
        0 0
        2 1 1    
        phasit.31
        DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
        monopt.scl
        pt_iso_diff.31
        4. 6. 0 1. 0. 0. 0. 0. 0. 0.
        3
               PT1     0.2539    0.1918    0.1376    35.       1.000         6
               PT2     0.1754    0.0439    0.4578    20.       0.664         6
               PT3     0.5474    0.0523    0.6964    50.       0.450         6
        HGCL2 ( ISOMORPHOUS REPLACEMENT DATA )
        monohg.scl
        hg_iso_diff.31
        4. 6. 0 1. 0. 0. 0. 0. 0. 0.
        2
               HG1     0.3639    0.2218    0.1776    20.       1.000         7
               HG2     0.4454    0.0939    0.2878    20.       0.800         7
        5 0.2 6 2 1 0 1
        1 SET 1
        0 0 0 0 0
        0 0 0 0 0
        0 0 0 0 0
        1 1 1
        1 SET 1  
        1 1 1 0 0
        1 1 1 1 0
        1 1 1 1 0
        1 1 1 
        1 SET 1
        1 1 1 0 0
        1 1 1 1 0
        1 1 1 1 0
        1 1 1 
        2 SET 2
        0 0 0 0 0
        0 0 0 0 0
        1 1 1
        2 SET 2
        1 1 1 0 0
        1 1 1 1 0
        1 1 1 
        2 SET 2
        1 1 1 0 0
        1 1 1 1 0
        1 1 1 
        
        
        
        EXAMPLE IV
        
        Similar to example III, except that one of the temperature factors is
        converted to anisotropic and is also refined, with the isotropic 
        equivalent restrained to its original value. 
        
        The following data is assumed to be in a file called    phasit.d
        
        seb.pam
        0 0
        2 1 1
        phasit.31
        DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
        monopt.scl
        pt_iso_diff.31
        4. 6. 0 1. 0. 0. 0. 0. 0. 0.
        3
               PT1     0.2539    0.1918    0.1376    35.       1.000         6
               PT2     0.1754    0.0439    0.4578    -20.      0.664         6
        0.        0.        0.        0.        0.        0.        20.       
        0.5
               PT3     0.5474    0.0523    0.6964    50.       0.450         6
        HGCL2 ( ISOMORPHOUS REPLACEMENT DATA )
        monohg.scl
        hg_iso_diff.31
        4. 6. 0 1. 0. 0. 0. 0. 0. 0.
        2
               HG1     0.3639    0.2218    0.1776    20.       1.000         7
               HG2     0.4454    0.0939    0.2878    20.       0.800         7
        5 0.2 6 2 1 0 1
        1 SET 1
        0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0
        1 1 1
        1 SET 1
        1 1 1 0 0
        1 1 1 1 1 1 1 1 1 1
        1 1 1 1 0
        1 1 1 
        1 SET 1
        1 1 1 0 0
        1 1 1 1 1 1 1 1 1 1
        1 1 1 1 0
        1 1 1 
        2 SET 2
        0 0 0 0 0
        0 0 0 0 0
        1 1 1
        2 SET 2
        1 1 1 0 0
        1 1 1 1 0
        1 1 1 
        2 SET 2
        1 1 1 0 0
        1 1 1 1 0
        1 1 1 
        
        
        For all examples, PHASIT can be run with the following control 
        information.
        
        For UNIX, use the following in a shell script
        
        phasit < phasit.d > phasit.l
        
                

        3.03  *****  SAMPLE INPUTS FOR PHASING BY SOLVENT LEVELLING  *****
        
           A complete solvent flattening run can be executed by creating a few 
        small data files, and running the procedure DOALL. This will carry out 
        the complete sequence of protein-solvent boundary determination, 
        solvent flattening, and phase combination steps, in a manner 
        equivalent to that suggested by Wang in his ISIR process, although the 
        initial phases can be SIR, SAS, MIR, MIRAS or any combination 
        generated by PHASIT. It will generate an initial solvent mask, use it 
        for 4 cycles of solvent flattening/ phase combination, create a new 
        mask, use it for 4 cycles, create a third mask, use it for 8 cycles, 
        and, if desired, do additional phase extension cycles, and then
        possibly phase AND AMPLITUDE extension cycles. 
        
           A series of files, all given .d extensions (UNIX) or .dat 
        extensions (VMS) should be created containing control information for 
        the forward and inverse Fourier transforms, for each option of the 
        BNDRY program and for RMHEAVY. In general, these are the only files
        which will have to be changed for a new application, PROVIDED THE FILE
        NAME CONVENTION IN THE CONTROL FILES IS ADHERED TO. The output from
        PHASIT should be called phasit.31 and if phase extension is to be done,
        then the output from MISSNG should be called extrfl.d and the file
        sloext.d should also be prepared. If phase extension is not desired,
        then one does not have to run MISSNG and create sloext.d, but
        the line invoking the "extnd" procedure (@EXTND.COM in DOALL.COM for
        VMS systems or sh extnd.sh in doall.sh for UNIX systems) should be
        commented out. The individual program writeups should be consulted
        for the meaning of the parameters. It is important that the grid
        spacing selected in the input to FSFOUR be appropriate for the highest
        resolution data to be used anywhere in the process, including phase
        extended reflections. A grid spacing of about 1/3 of the smallest d 
        spacing is recommended. It is also VERY important that the index range
        requested in the inputs to MAPINV cover at least a complete asymmetric
        unit out to the maximum resolution to be used anywhere in the process,
        including phase extended reflections. The particular asymmetric unit
        covered need not be identical to that originally input implicitly to
        PHASIT via the reflection files, but all reflections in the input
        files should at least have symmetry related counterparts in the MAPINV
        asymmetric unit. Since index limits in MAPINV are restricted to
        minimum and maximum values along each reciprocal axis, in high
        symmetry systems it may be necessary to cover more than an asymmetric
        unit (this causes no problem).

        Note also that MAPINV can compute structure factors only in the
        hemisphere with L non-negative, thus one MUST request an asymmetric
        unit in this hemisphere. This also creates no problem SINCE ANY
        REFLECTION CAN ALWAYS BE RELATED TO ONE IN THIS HEMISPHERE BY
        application of the Friedel symmetry operator, and this is
        automatically done in the programs. THUS WHEN IN DOUBT, ONE CAN
        ALWAYS SPECIFY A FULL HEMISPHERE, I.E. A RANGE OF -HMAX,HMAX,
        -KMAX,KMAX AND 0,LMAX WHICH WILL WORK, but may not be the most
        efficient way of doing things. For this reason one will NEVER have
        to reindex the input data, as an appropriate range in MAPINV can
        ALWAYS BE GIVEN! 
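
        The two rules of thumb above (grid spacing of about 1/3 of the
        smallest d spacing, and a full hemisphere as a safe index range)
        can be turned into numbers with the Python sketch below. It is an
        illustration with hypothetical function names, and the index limits
        are valid only for a cell with all angles equal to 90 degrees. Any
        restriction FSFOUR places on allowed grid sizes is not considered;
        consult the FSFOUR write-up before using the values.

```python
import math

def min_grid_points(cell_length, d_min):
    """Smallest number of grid divisions giving spacing <= d_min / 3."""
    return math.ceil(3.0 * cell_length / d_min)

def full_hemisphere_range(a, b, c, d_min):
    """Limits -HMAX,HMAX,-KMAX,KMAX,0,LMAX for an orthogonal cell."""
    # For orthogonal axes the largest axial index is floor(length / d_min).
    hmax, kmax, lmax = (math.floor(x / d_min) for x in (a, b, c))
    return (-hmax, hmax, -kmax, kmax, 0, lmax)
```

        For the sample cell 45.33 68.33 79.62 and an assumed limit of 3.0
        Angstroms, min_grid_points(45.33, 3.0) gives 46 and
        full_hemisphere_range gives (-15, 15, -22, 22, 0, 26).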
        
           Example inputs are now given. If the supplied doall and related 
        scripts are to be used without modification, then the filenames in 
        these samples should NOT be changed (except for the parameter file, of 
        course). One need change only the parameter file, solvent content and 
        resolution related parameters, the map periods and index range, and
        the heavy atom coordinate file. 
        
        
        ---- file fft.d (input to FSFOUR, for map calculation)----------------
        
         seb.pam
         COMPUTE ELECTRON DENSITY MAP
         0 48 72 80 1 0 20 0 0 0 0.
         four.ref
         four.map
                
        
         --- file minv1.d (input to MAPINV, for solvent boundary determination)
        
         seb.pam
         INVERT ELECTRON DENSITY MAP AFTER TRUNCATING NEGATIVES
         four.map
         minv.ref 
         0 0 0 16 0 24 27
         0. 0. 1 0 
        
                 
         --- file minv2.d (input to MAPINV, for normal map inversion) ----- 
        
         seb.pam
         INVERT ELECTRON DENSITY MAP AFTER SOLVENT FLATTENING
         mod.map
         minv.ref
         0 0 0 16 0 24 27
         0. 0. 0 0 
        

         --- file rmhv.d (input to RMHEAVY, for removal of heavy atoms ) ----
        
         seb.pam
         four.map
         nohv.map
         2 2.5
               PT1     0.2539    0.1918    0.1376    35.       1.000         6
               PT2     0.1754    0.0439    0.4578    20.       0.664         6

        
         --- file bnd0.d (input to BNDRY, option 0, prepare SF for protein- 
                          solvent boundary determination )------------------
        
         seb.pam
         0  
         9.
         minv.ref
         four.ref
        

         --- file bnd1.d (input to BNDRY, option 1, create solvent mask)----
        
         seb.pam
         1
         four.map
         mask.map
         .4
                 
        
         --- file bnd2.d (input to BNDRY, option 2, do solvent flattening and 
                                  negative density truncation) ----------------
         
         seb.pam
         2
         four.map
         mask.map
         mod.map
         .086
        
         
         --- file bnd3.d (input to BNDRY, option 3, combine new phases with 
                                   original) --------
         
         seb.pam
         3
         0 0. 1. 0 0
         phasit.31
         minv.ref
         newphi.ref
        
        
         --- file extnd.d (input to BNDRY, combine new phases with original, 
                                    including phase extension ) ------------
        
         seb.pam
         3  
         1 3.5 1. 0 0
         phasit.31
         minv.ref
         extrfl.d
         newphi.ref
        
        
        --- file extnda.d (input to BNDRY, combine new phases with original, 
                            including phase AND AMPLITUDE extension) --------
        
         seb.pam
         3  
         2 3.5 1. 0 0
         phasit.31
         minv.ref
         extrfl.d
         newphi.ref
      
 
       --- file sloext.d (controls range and rate of phase extension) --

         seb.pam
         4. 3.5 8
         extnd.d

 
       --- file sloext2.d (controls range and rate of phase AND AMPLITUDE
                                     extension) --------

         seb.pam
         4. 3.5 8 
         extnda.d


           Once the input is prepared, the phasing process can be carried out 
        either by running a series of command procedures as individual steps, 
        or by running a single command procedure which invokes all others.  
        The single procedure, called doall.sh or doall.com follows. In the 
        procedures that follow, it is assumed that phase extension will be 
        carried out, and that the additional files "extrfl.d" (prepared by
        MISSNG) and "sloext.d" (see SLOEXT write-up) are available.

        3.04  For UNIX, use the following commands in a shell script,
              called doall.sh
        
        # COMMAND PROCEDURE TO CARRY OUT THE ENTIRE CYCLING PROCESS FOR PHASING
        # DATA BY SOLVENT LEVELLING
        #
        # COMPUTE THE FIRST SOLVENT MASK
        sh mask1.sh
        #
        # COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING FIRST MASK)
        sh cycle4.sh
        #
        # COMPUTE THE SECOND SOLVENT MASK
        sh mask2.sh
        #
        # COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING SECOND MASK)
        sh cycle8.sh
        #
        # COMPUTE THE THIRD SOLVENT MASK
        sh mask3.sh
        #
        # COMPUTE 8 CYCLES OF SOLVENT LEVELLING (USING THE THIRD MASK)
        sh cycle16.sh
        #
        # DO ADDITIONAL CYCLES OF SLOW PHASE EXTENSION (TO REFLECTIONS WITH
        # NATIVE AMPLITUDES SUPPLIED), EITHER TO HIGHER RESOLUTION OR TO
        # INITIAL RESOLUTION
        sh extnd.sh
        #
        # IF DESIRED, DO ADDITIONAL CYCLES OF PHASE EXTENSION (INCLUDING 
        # DATA FOR WHICH THERE IS NO AMPLITUDE INFORMATION). THIS OPTION IS
        # NOT ALWAYS DESIRABLE, THUS IT IS COMMENTED OUT.  TO INVOKE IT, SIMPLY
        # REMOVE THE # FROM THE FOLLOWING LINE
        #sh extnda.sh
        #
        # THAT'S ALL


        3.05                      EXPECTED OUTPUT FILES
        
           Execution of the "doall" procedure will result in the following 
        files being present. (phasit.31 and phasit.l should be present prior 
        to running "doall.")
        
         phasit.31    contains original MIR, SIR etc phases from PHASIT
        
         phasit.l     contains phasit printed output
        
         mask1.14     contains first solvent mask
        
         mask1.l      contains mask1 printed output
        
         phi4cy.31    contains phases after 4 cycles using first mask
        
         cycle4.l     contains printed output from first 4 cycles
        
         mask2.14     contains second solvent mask
        
         mask2.l      contains mask2 printed output
        
         phi8cy.31    contains phases after 4 cycles using second mask
        
         cycle8.l     contains printed output from next 4 cycles
        
         mask3.14     contains third solvent mask
        
         mask3.l      contains mask3 printed output
        
         phi16cy.31   contains phases after 8 cycles using third mask
        
         cycle16.l    contains printed output from next 8 cycles
        
         phiextnd.31  (if generated) contains phases after 8 cycles using 
                      third mask, plus additional cycles of phase extension
                      to known amplitudes.
        
         extnd.l      (if generated) contains printed output from next 12 
                      cycles
        
         phiextnda.31 (if generated), contains phases after 8 cycles using 
                      third mask, plus additional cycles of phase extension
                      to known amplitudes, plus additional cycles of phase
                      and amplitude extension.
        
         extnda.l     (if generated), contains printed output from next 12 
                      cycles.
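
        After a run, the phase and mask files named above can be checked
        with a few lines of Python. This is a convenience sketch, not part
        of the distribution; the function name is hypothetical and only the
        phase (.31) and mask (.14) files are listed.

```python
REQUIRED = ["phasit.31", "mask1.14", "phi4cy.31", "mask2.14",
            "phi8cy.31", "mask3.14", "phi16cy.31"]
OPTIONAL = ["phiextnd.31", "phiextnda.31"]   # produced only by extension

def missing_outputs(present_names):
    """Return the required doall output files absent from present_names."""
    present = set(present_names)
    return [name for name in REQUIRED if name not in present]
```

        A typical call is missing_outputs(os.listdir(".")) after importing
        os; an empty list means all required files were found.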

        4.00     NATIVE, DIFFERENCE AND "CALCULATED" PATTERSON MAPS
               
           In protein crystallography one is generally interested in difference
        Patterson maps to locate heavy atoms, in which the Fourier coefficients
        are the squares of the DIFFERENCE in AMPLITUDES between native and
        derivative data, or between members of a Bijvoet pair. Sometimes
        however, it is useful to compute native Patterson maps, or to compute
        "calculated" Patterson maps (generated from intensities computed
        explicitly from an input atomic model). The native maps may provide
        information about non-crystallographic symmetry, while the "calculated"
        maps obtained from a tentative heavy atom structure can be compared
        with the observed difference Pattersons to see how well the major
        features are being explained. The latter method is particularly
        useful in high symmetry systems, where even a small number of heavy
        atom sites gives rise to many Patterson peaks. Examining the observed
        and calculated Pattersons side by side (perhaps in VIEWPLT) can then
        provide confidence in the heavy atom interpretation. 

        DIFFERENCE PATTERSONS - Difference Pattersons (either isomorphous or
        anomalous) can be computed by two different routes in PHASES. The
        first approach is to generate a standard "phased" file containing
        h,k,l,Fo,Fc,Phi, and use it in FSFOUR with the MAPTYP=5 option.
        Generally programs CMBISO or CMBANO do the initial data preparation,
        and their output files are then fed to TOPDEL to select the data
        according to various criteria, screen for and reject outliers,
        and write the appropriate information to the output file for FSFOUR.
        The output file will then contain either FPH and FP, or F+ and F- in
        the amplitude slots, depending on whether isomorphous or anomalous
        data were input. The second approach, which is useful only after
        at least one site is found in the derivative, is to use the  
        "difference coefficient" file output from PHASIT in FSFOUR with
        MAPTYP=6. In the isomorphous case, if the input site(s) are correct,
        this should lead to a cleaner map, since the FPH to FP scale factor
        has been refined, and also because the angular difference between
        the FP and FPH vectors is compensated for. The FO and FC slots in
        the file then contain (FPH-FP)obs,corrected and FHcal, respectively.
        For anomalous data these slots contain (FPH+ - FPH-)obs and
        (FPH+ - FPH-)calc or their counterparts for native anomalous data.      

        NATIVE PATTERSONS -  Native Patterson maps can be generated in several
        ways, depending on what information is currently available. In all
        cases one must prepare a standard input "phased" file containing 
        h,k,l,Fo,Fc,Phi, and request the appropriate option in FSFOUR to create
        the desired coefficients from the input data. One way to do this is
        to run CMBISO inputting the native file twice (as both the native and
        derivative data sets), and then run TOPDEL selecting ALL coefficients
        to be output (you can still use d and F/sigma cutoffs, but output
        100% of the data!). The R factor and all differences will, of course,
        be zero, but the output file will contain native amplitudes in both
        the Fo and Fc slots, and thus the native Patterson can be generated
        by requesting MAPTYP=6 in FSFOUR. Another approach would be to run
        PHASIT, SF mode with IHLCF=0 and ISIGA=0, using a single "dummy" atom
        arbitrarily positioned as the model. The output file will then give a
        bad R factor, but it will contain Fo and Fc in the amplitude slots,
        and selecting MAPTYP=6 in FSFOUR will again give the desired native
        Patterson. The first method allows one to use d spacing and F/sigma
        cutoffs, while the second always uses all of the data. In either case
        the native Pattersons can be searched for peaks, contoured, displayed,
        printed etc. with PSRCH, MAPVIEW, CTOUR, MKPOST etc.

        "CALCULATED PATTERSONS" - Patterson maps corresponding to an input
        atomic model are also generated by preparing the normal "phased" file
        containing h,k,l,Fo,Fc,Phi, and by selecting the appropriate 
        coefficient option (MAPTYP=7) in FSFOUR. In this case it is important
        that the second amplitude slot truly contains Fc. One way to do this
        is to run PHASIT, SF mode with IHLCF=0 and ISIGA=0, and to include
        all of the desired atoms in the model. If a heavy atom model is
        used as the input, the R factor will be meaningless (since scaling
        is to the NATIVE amplitudes rather than differences), but the output
        file would still be appropriate for the "calculated" difference 
        Patterson as the map scale is arbitrary anyway. Another way is to use
        GREF to prepare the file, by requesting that an output Fourier file
        be written. In that case the file created can contain the proper
        DIFFERENCE amplitude in the Fo slot, and the model based Fc in the
        Fc slot. Then the SAME file could be used in FSFOUR to create both
        the observed difference Patterson (MAPTYP=6) and the modeled version
        of it based on the heavy atoms (MAPTYP=7). Once again, the FSFOUR map
        can be searched for peaks, contoured etc. as any normal map. Finally,
        the "difference coefficients" file written by PHASIT can be used in
        FSFOUR with MAPTYP=7 to compute the "calculated" difference Patterson
        based on the input heavy atom model. The advantage of doing it this
        way is that the model, and hence the FC's, then would reflect all 
        refined scaling parameters (possibly including anisotropic B's), and
        also models based solely on anomalous scatterers could be used.
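
        As a reading aid, the three Patterson coefficient types used in
        this section can be summarized in a short Python sketch. The
        mapping is inferred from the descriptions above (MAPTYP=5 as the
        squared difference of the two amplitude slots, MAPTYP=6 as the
        square of the first slot, MAPTYP=7 as the square of the second
        slot). It is not taken from the FSFOUR source, and the function
        name patterson_coefficient is hypothetical; the FSFOUR write-up
        remains the authority on the actual definitions.

```python
def patterson_coefficient(maptyp, f_first, f_second):
    """Patterson coefficient for one reflection (assumed mapping).

    f_first  -- amplitude in the first (Fo) slot of the phased file
    f_second -- amplitude in the second (Fc) slot of the phased file
    """
    if maptyp == 5:
        # Difference Patterson, e.g. (FPH - FP)**2 or (F+ - F-)**2.
        return (f_first - f_second) ** 2
    if maptyp == 6:
        # Patterson on the first slot: native Fo**2, or an observed
        # difference squared when the slot already holds a difference.
        return f_first ** 2
    if maptyp == 7:
        # "Calculated" Patterson on the model amplitude, Fc**2.
        return f_second ** 2
    raise ValueError("only MAPTYP 5, 6 and 7 are sketched here")


if __name__ == "__main__":
    for maptyp in (5, 6, 7):
        print(maptyp, patterson_coefficient(maptyp, 12.0, 10.0))
```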


        5.00                REFINING HEAVY ATOM PARAMETERS 

           There are two general ways to refine heavy atom parameters within
        the PHASES package: refinement against isomorphous or anomalous
        amplitude differences; or "phase refinement", i.e. by minimizing lack
        of closure. Isomorphous/anomalous difference refinement is carried
        out with the program GREF, and has the advantage that only data from
        the crystal being refined is used (and the native, in the isomorphous
        case). It is therefore independent of all other derivatives, and is
        particularly useful in the case of common sites between multiple
        derivatives since there can be no "cross talk" or bias. Also, if this 
        refinement is carried out against centric data only, there are few
        assumptions made about the protein phase and the refinement is 
        usually very reliable. It is nearly always used for the first 
        derivative as no reliable protein phase estimates are available at
        that time, but it's not a bad idea to do this initially for each
        derivative. The disadvantage is that refinement of all parameters with
        centric data may not be possible in some space groups. For example,
        in P2 the only centric data available are of the type h0l, thus one
        can not refine ANY y coordinates. In GREF one can have the program
        automatically include the 25% strongest differences for acentric data
        along with the centric data to enable refinement of SOME y's, but then
        one is introducing assumptions about the protein phase which are only
        approximately valid, thus weakening the refinement. Also, in a space
        group like P2 the origin is not fixed in the y direction, so even the
        RELATIVE y coordinates BETWEEN DERIVATIVES can not be refined, even
        when acentric data ARE included. For refinement in GREF one would
        generally start by assigning the major site an occupancy of 1.0, other
        sites appropriate occupancies and all heavy atoms B values of 15 or
        20. Then do a single cycle refining only the scale factor (which you
        can initially assign any positive value, usually 0.1). After an
        estimate of the scale factor is obtained, one can refine coordinates
        as appropriate, along with the scale factor. One can then refine
        coordinates, scale factor and occupancies simultaneously, but the
        occupancy of the MAJOR SITE should ALWAYS be held fixed at 1.0. Also,
        when polar axes are present even with sufficient data available for
        refinement (e.g. acentrics included) the coordinates of the MAJOR SITE
        along POLAR DIRECTIONS should still NOT be refined, as they are needed
        to fix the origin in the polar directions. Finally, if the resolution
        is sufficiently high one can then include B values in the refinement,
        but if there are indications of instability the B's can usually be 
        held at their initial values without introducing much error. Attempts
        to simultaneously refine parameters which are not independent (i.e. 
        coordinates for ALL atoms along polar directions, both scale factor
        AND occupancy when only one atom is input, ALL coordinates of an atom
        on a special position in the space group) or to refine parameters when
        there is no data determining that parameter (coordinates of ANY atom
        along polar direction when using ONLY centric data) will result in
        a singular matrix being obtained, and an aborted refinement.
        Parameters obtained from refinement against amplitude differences are
        generally well suited to initiate subsequent phasing calculations or
        further "phase refinement" in program PHASIT.
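
        The data selection described for P2 can be sketched in a few
        lines of Python. This is not GREF's code: the function names, the
        tuple layout (h, k, l, delta_F) and the ranking of acentric
        reflections by the absolute difference are assumptions made for
        illustration. Only the facts that h0l reflections are the centric
        ones in P2 (b unique) and that the strongest 25% of acentric
        differences may be added come from the text.

```python
def is_centric_p2(h, k, l):
    """In P2 with b unique, only the h0l reflections are centric."""
    return k == 0


def select_for_refinement(reflections, acentric_fraction=0.25):
    """Keep all centric data plus the strongest acentric differences.

    reflections       -- list of (h, k, l, delta_f) tuples
    acentric_fraction -- fraction of acentric data to keep, ranked by
                         |delta_f| (the ranking rule is an assumption)
    """
    centric = [r for r in reflections if is_centric_p2(r[0], r[1], r[2])]
    acentric = [r for r in reflections if not is_centric_p2(r[0], r[1], r[2])]
    acentric.sort(key=lambda r: abs(r[3]), reverse=True)
    n_keep = int(len(acentric) * acentric_fraction)
    return centric + acentric[:n_keep]


if __name__ == "__main__":
    data = [
        (1, 0, 2, 3.0), (2, 0, 1, -1.0),      # centric (k = 0)
        (1, 1, 1, 1.0), (1, 2, 1, -5.0),      # acentric
        (2, 1, 3, 3.0), (3, 1, 0, 2.0),       # acentric
    ]
    for refl in select_for_refinement(data):
        print(refl)
```

        With centric data alone, no selected reflection has a nonzero k,
        which is why y coordinates are undetermined in that case.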
          "Phase refinement" is carried out with the program PHASIT, and is
        ideally suited for refinement with multiple derivatives/data sets
        although it can also be used with a single derivative. In PHASIT
        either conventional refinement, or "maximum likelihood" options may
        be selected, and if only one derivative is used the program will
        automatically switch to maximum likelihood mode. Phase refinement
        requires an estimate of the protein phase, which is why it's better
        suited for the multiple derivative case, since SIR or SAS estimates
        alone are usually very poor. The advantages of phase refinement are
        that in general, all parameters may be refined including native to
        derivative scaling parameters, and the corresponding weights (expected
        lack of closure estimates) are also implicitly "refined". Since the
        origin is fixed by the protein phase estimates, refinement is possible
        for coordinates along polar directions, and the origin can thus be
        properly established between derivatives. Phase refinement is, however,
        sensitive to the hand of the heavy atom sets, and it is assumed that
        all input sets correspond to the SAME origin and hand. A useful
        procedure is to initially start with all parameters as described
        above, (after correlating origins and hand between derivatives with
        cross difference Fouriers) and then refine only the FH scale factor
        for one cycle. Then refine the FH scale factor along with coordinates,
        then along with coordinates and occupancies (again always holding the
        MAJOR site occupancy to 1.0 in each derivative). Then one can include
        the FPH scale factor along with the other parameters, and finally
        include the B factors. When refining anomalous scattering data sets
        one would generally do the same, except that both the FPH and FH scale
        factors should NOT be refined simultaneously (they can be alternated),
        and for NATIVE anomalous scattering the FPH scale factor should NEVER
        be refined!
          A useful option is to use protein phase estimates obtained from an
        external source during phase refinement rather than calculated from
        the current heavy atom parameters and data. Thus one could refine
        initially as described, then modify the phases by solvent flattening 
        and/or NC symmetry averaging, and then refine the parameters again
        this time against the modified phases. The new parameters are then
        used to compute phases to start another round of density modification.
        This procedure has been helpful in several cases, and usually is
        particularly good for refining the FPH scale factor. For conventional
        phase refinement in PHASIT one would select a figure of merit cutoff
        in the range 0.4 to 0.6 and use weights of 1/E**2. For maximum
        likelihood mode one would select a figure of merit cutoff in the range
        0.1 to 0.2 and use unit weights. Most successful protein structure
        determinations have utilized phase refinement to obtain the final
        MIR type phases, although refinement against differences is often
        done first to obtain starting values for the parameters.
 
        6.00         HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE AND
                           CROSS DIFFERENCE FOURIER MAPS        

           There are several ways to compute heavy atom based difference
        or cross difference type Fourier maps within the PHASES package.

        1) HEAVY ATOM DIFFERENCE or DOUBLE DIFFERENCE FOURIERS.
        The first approach is to refine heavy atom parameters against
        isomorphous or anomalous difference AMPLITUDES in program GREF,
        and request that a Fourier file be written. If this file is
        used in FSFOUR with MAPTYP=1, then the observed difference
        Fourier, i.e. that which should reveal all heavy atom sites
        can be obtained. If the file is used with the MAPTYP=3 option,
        then a "double difference" map is computed, i.e. the heavy
        atoms included in the structure factor calculation are subtracted
        out, so that the map should show only additional sites. The
        limitations with this approach are that the "observed" amplitudes
        ABS(FPH-FP) or ABS(F+ - F-) are approximations, since vector
        differences rather than amplitude differences should be used,
        and that the heavy atom model may be crude since the FPH to FP
        scale factor has not been refined, and anisotropic thermal
        parameters for the heavy atoms can not be used. Also, if used
        with anomalous data the absolute configuration can not be 
        obtained since absolute values of delta F were used.
          The second approach is to compute phases and/or refine heavy
        atom parameters in program PHASIT, and use the "difference
        coefficients" files it produces in FSFOUR with the MAPTYP=1
        or MAPTYP=3 options. If MAPTYP=1 the "observed" difference
        Fourier showing all heavy atoms will be obtained, however the
        results should be improvements over those obtained with the
        previous method. This results from the fact that the FPH to
        FP scaling parameters can be refined, the heavy atom thermal
        factors may be refined anisotropically, and phase difference
        information is used to correct the "observed" amplitudes to
        account for the fact that the two vectors are not colinear.
        In this case for isomorphous data sets the corrected
        "observed" differences, calculated heavy atom amplitudes and
        calculated heavy atom phases are used to compute the map.
        For anomalous data the "observed" and "calculated" Bijvoet
        differences are used along with the protein phases shifted
        by 90 degrees, to give true "Bijvoet difference" or "Bijvoet
        double difference" maps, so that the absolute configuration
        is preserved. Again, MAPTYP=1 should show all anomalous
        scatterers while MAPTYP=3 should have those included in the
        model subtracted out. These methods use phase information
        computed only from the heavy atoms or anomalous scatterers,
        although in the anomalous case all such information is
        combined to estimate protein phases.
          A third approach is to combine observed AMPLITUDE 
        differences [ i.e. (FPH-FP) or (F+ - F-) ] directly with
        estimates of the protein phases to compute difference
        or Bijvoet difference Fouriers. One would then generate
        protein phases either by MIR, SIR, BNDRY, or from a model in
        PHASIT, and combine the phases with observed amplitude
        differences in programs MRGDF or MRGBDF for isomorphous or
        anomalous data, respectively. The maps would then be computed
        in FSFOUR using the MAPTYP=3 or MAPTYP=8 options for
        difference or Bijvoet difference maps, respectively. The
        advantage of this approach is that the protein phases
        themselves may be better, since one can use solvent flattened
        and/or NC symmetry averaged phases in the synthesis. For
        the isomorphous case the output coefficients file would then
        contain indices, FPH, FP, PHI_pro, and for the anomalous
        case the file would contain indices, F+, F-, PHI_pro. A
        disadvantage is that one can not "subtract out" the heavy
        atoms used in the phasing, so that they will also appear
        in the maps possibly making it more difficult to detect
        minor sites. 

        2) CROSS DIFFERENCE FOURIERS.
        This is accomplished similarly to the third option above,
        except that in MRGDF or MRGBDF a data file corresponding to
        a new derivative, i.e. one which was never used in phasing,
        is merged with an existing protein phase file. The cross
        difference Fourier (or cross Bijvoet difference Fourier)
        is then obtained in FSFOUR with MAPTYP=3 or MAPTYP=8,
        respectively. These maps should show all heavy atom or
        anomalous scatterer sites in the new derivative, which can
        then be checked against the appropriate difference 
        Patterson. The advantage of doing this, in addition to
        helping solve the new derivative, is to assure that heavy
        atom sites in the new derivative correspond to the same
        origin and hand as those used in the original phasing.         
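
        The coefficients behind the third approach and the cross
        difference maps can be written out as a minimal Python sketch.
        The function names are hypothetical, and the direction of the 90
        degree shift (phase minus 90 degrees, the usual convention for
        anomalous difference Fouriers) is an assumption, since the text
        above states only that the protein phases are shifted by 90
        degrees. This is not the FSFOUR implementation.

```python
import cmath
import math


def isomorphous_coefficient(fph, fp, phi_protein_deg):
    """(FPH - FP) combined with the protein phase."""
    return (fph - fp) * cmath.exp(1j * math.radians(phi_protein_deg))


def bijvoet_coefficient(f_plus, f_minus, phi_protein_deg):
    """(F+ - F-) combined with the protein phase shifted by -90 degrees.

    The sign of the shift is an assumed convention, see the lead-in.
    """
    shifted = math.radians(phi_protein_deg - 90.0)
    return (f_plus - f_minus) * cmath.exp(1j * shifted)


if __name__ == "__main__":
    print(isomorphous_coefficient(12.0, 10.0, 30.0))
    print(bijvoet_coefficient(10.0, 8.0, 120.0))
```

        Because the signed difference F+ - F- is kept, swapping the hand
        changes the sign of the Bijvoet coefficient, which is how such a
        map retains the absolute configuration.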
 

        7.00               CREATING/EDITING SOLVENT MASKS 

           In most cases adequate solvent masks are prepared as part of the
        "doall" procedure, which carries out a reciprocal space equivalent
        of the automated protein-solvent boundary determination method
        described by Wang with the added modification that density in the
        immediate vicinity of heavy atoms is ignored during mask construction.
        Solvent masks, however, can also be created by hand, from coordinates
        for an input model or by starting with any of these masks and editing
        them. Solvent masks MUST have a one-to-one correspondence with FSFOUR
        maps, and thus they also MUST cover one full cell on the same grid
        used for the map, and be oriented as xz sections. They also must have
        the structure as described in the "file formats" section. This happens
        automatically if the masks are constructed by the "doall" procedure,
        but care must be taken to ensure these features if the masks are
        created by other means. Pre-existing solvent masks can be examined
        and/or edited in MAPVIEW, or MAPVIEW can be used to create the masks
        "from scratch" by hand tracing boundaries in contoured maps. Several
        options are now described.

        *** Examining/editing "normal" (i.e. full cell) solvent masks ***

           These masks (named mask1.14, mask2.14 and mask3.14, if created by
        the "doall" procedure) can be examined in MAPVIEW or MAPVIEW_X by
        inputting any FSFOUR map with the same grid, specifying that masks
        will be used, selecting 0 to 0.999 for each of the x, y and z ranges,
        specifying the xz section orientation and "recovering" the 
        pre-existing mask file. From the menu contoured sections can then be
        selected and displayed. Clicking the mouse with the cursor in the
        "show mask" menu area will then display the solvent mask as blue dots
        on the solvent grid points. One can then use the menu options to
        scroll through the sections, displaying both contoured density and
        the solvent mask. One could also use the "trace mask" menu option 
        as described in the MAPVIEW writeup to edit the mask with the mouse,
        but at this point it is not desirable to do this as the full cell
        map is displayed, and one may have to make identical edits in each
        symmetry related envelope. If this is not done very carefully one
        could easily destroy the space group symmetry in the mask. A better
        approach, if editing is to be done, is simply to examine the map and 
        mask to determine the coordinate range which would carve out only one
        contiguous molecule (asymmetric unit) by following the fractional
        coordinates as the cursor moves across the screen (displayed in the
        lower right hand corner). Note that when determining the range one can
        cross into neighboring cells, although only the one-cell-translated
        map region is displayed. Once an appropriate range is deduced, write
        it down and exit MAPVIEW without saving any files. Then run EXTRMAP
        and EXTRMSK to extract that same range from the FSFOUR map and solvent
        mask, respectively, to create the corresponding "submaps". This
        allows one to deal only with a contiguous asymmetric unit, and to
        select regions spanning cell edges. Now run MAPVIEW again, this time
        inputting the non-FSFOUR map (i.e. the submap) and its corresponding
        mask file. Editing can then be done on the submask. After editing all
        appropriate sections, use the "MAKEASU" menu option to symmetrize the
        submask, and scroll through the masks again to confirm that everything
        is as desired. Once you are happy with it, exit MAPVIEW and when
        prompted, request that the entire submask region be saved to a file.  
        At this point you have the edited mask covering an asymmetric unit.
        Run BLDCEL inputting the submap, edited submask and original FSFOUR
        map to expand the submask (and submap) to a full cell. You can delete
        the output map file, but the output mask file now corresponds to the
        edited solvent mask, expanded to a full cell obeying space group
        symmetry. It can now be used for solvent flattening (or examined 
        again in MAPVIEW just as the original mask was to confirm the
        expansion).


                 ***** Creating solvent masks from a model ***** 

          If atomic coordinates are available from a tentative model, these 
        coordinates can be used to create a solvent mask. To do this one 
        should first prepare a PHASES style file containing the atomic
        coordinates (possibly from a PDB file via PDB_CDS), and determine the
        range (in fractional coordinates) which encompasses the model atoms.
        Then enlarge the range (on each end) slightly to account for the
        radius to be assigned to each atom. MDLMSK can then be run to create
        a mask file just encompassing the molecule. When prompted in MDLMSK,
        the periods (number of grid points along each axis) should be
        specified EXACTLY as in the input to FSFOUR, to ensure that the maps
        to be computed later will have the same grid as the mask. The adjusted
        fractional coordinate range for the model should then be specified
        along with a mask number (use 1 for pure solvent masks), and a radius
        of about 1.8 angstroms. In the mask the outer boundary will be 
        appropriate, but there will typically be many small holes in the
        interior caused by use of a van der Waals size radius. Use of a
        larger radius could avoid these holes, but would artificially extend
        the outer boundary. To avoid this one generally uses the smaller
        radius, and then edits the masks to preserve the outer boundary but
        fill in the interior holes. This can be done very quickly in MAPVIEW.
        To do this run MAPVIEW inputting a FSFOUR map (any one will do, as
        long as the periods are the same as that used in MDLMSK) and request
        that masks will be used. Then input the same coordinate range as in
        MDLMSK, request the xz section orientation and "recover" the mask
        file from MDLMSK. You can effectively turn off the density display
        by selecting a high contour level, and scroll through the sections
        editing each via the "show mask" and "trace mask" options described in
        the MAPVIEW writeup. Just quickly trace around the already displayed
        outer boundary to preserve it, and the interior holes will be filled 
        automatically when you are done with each section. When finished, use
        the "MAKASU" option to symmetrize the mask region. Then exit MAPVIEW
        and request that both the entire map and mask regions be written to
        files. You then will have an edited mask file encompassing the model,
        and the corresponding submap file. The last step is to convert the
        edited mask to a full cell mask. To do this, run BLDCEL inputting the
        submap, corresponding edited mask and original FSFOUR map. The output
        map file can be deleted, but the output mask file will be a full cell
        version of the edited, model based mask which now also obeys space
        group symmetry. It can then be used for solvent flattening (for
        example, replacing the mask3.14 file in the cycle16.sh, extnd.sh or
        extndavg.sh procedures), and also can be examined in MAPVIEW as
        described earlier.
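
        The two geometric steps in this procedure (padding the coordinate
        range by the atomic radius, and flagging grid points near atoms)
        can be illustrated with a small Python sketch. It is not MDLMSK:
        it assumes an orthogonal cell, uses a brute-force search, and
        returns a plain set of grid indices rather than a PHASES mask
        file. All names are hypothetical.

```python
def padded_range(fractional_values, radius, cell_length):
    """Enlarge a fractional coordinate range by radius on each end."""
    pad = radius / cell_length
    return min(fractional_values) - pad, max(fractional_values) + pad


def model_mask(atoms, cell, grid, radius=1.8):
    """Return the set of grid points within radius of any atom.

    atoms -- list of fractional (x, y, z) coordinates
    cell  -- (a, b, c) edge lengths in angstroms, orthogonal cell assumed
    grid  -- (nx, ny, nz) periods, which must match those of the map
    """
    a, b, c = cell
    nx, ny, nz = grid
    r2 = radius * radius
    inside = set()
    for ix in range(nx):
        for iy in range(ny):
            for iz in range(nz):
                for x, y, z in atoms:
                    # Fractional differences, wrapped to the nearest image.
                    dx = ix / nx - x
                    dy = iy / ny - y
                    dz = iz / nz - z
                    dx -= round(dx)
                    dy -= round(dy)
                    dz -= round(dz)
                    dist2 = (dx * a) ** 2 + (dy * b) ** 2 + (dz * c) ** 2
                    if dist2 <= r2:
                        inside.add((ix, iy, iz))
                        break
    return inside


if __name__ == "__main__":
    xs = [0.20, 0.31, 0.40]
    print(padded_range(xs, 1.8, 18.0))
    mask = model_mask([(0.5, 0.5, 0.5)], (10.0, 10.0, 10.0), (10, 10, 10))
    print(len(mask))
```

        With a radius this small, real models leave unflagged interior
        points between atoms, which are the holes that the MAPVIEW
        tracing step fills.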
 

        8.00      INCORPORATION OF PARTIAL STRUCTURE INFORMATION 
        
           In many cases a significant fraction of the structure can be 
        reliably determined from an electron density map, but some regions in 
        the map are less well defined. In that case it is often useful to 
        incorporate phase information obtained from the partial structure into 
        the phasing process. This can be done in several ways, all of which 
        require running the PHASIT program once (in SF calculation mode, 
        IHLCF=0, ISIGA=0) to generate partial structure phases and
        amplitudes, and running the BNDRY program once (option 3, with
        ICMB=0 or 1) to combine the partial structure phase information with
        prior phase probability distributions cast in terms of Hendrickson-
        Lattman coefficients. Different strategies can be employed depending
        on which prior distributions are used, weighting during the phase
        combination and what is done AFTER the phase combination step. The
        most common procedures are now described.
        
        In all procedures, first run PHASIT in SF calculation mode using 
        IHLCF=0 and ISIGA=0, and call the output phase file MODEL.31.
        This file contains the partial structure phase and amplitude
        information. 
        
        Now you have some choices.
        
        1) Combine the partial structure information with the original (MIR, 
        SIR etc) probability distributions (usually in file called PHASIT.31 
        generated by PHASIT, but possibly introduced via the IMPORT program). 
        This can be done with a small control file to run BNDRY, option 3,
        using ICMB=0 or 1 for either Sim or Sigma_A weighting, respectively,
        (see BNDRY write-up) during phase combination. Call the output file
        PHICOMBINED.ORG. This file will contain phase, figure of merit and
        HL coefficients for the COMBINED data. If the partial structure was
        large enough, you may be able to use these phases directly to get a
        good map. 
        
        2) If you want to proceed with solvent flattening cycles, just copy 
        the file PHICOMBINED.ORG to PHASIT.31 (first saving the ORIGINAL
        PHASIT.31, i.e. the one with no partial structure contributions, in
        another file).
        Now you can invoke the default procedure DOALL.COM without changing 
        anything, and at each phase combination step the MODEL+SIR etc 
        distributions will serve as the "anchored" phases with which those 
        newly obtained from solvent flattening will be combined.
        
        3) If you wish instead to combine the partial structure information 
        with distributions obtained AFTER solvent flattening, do the same as 
        in 1), but use the best phases available (usually in file obtained 
        from a previous run called phi16cy.31, phiextnd.31 etc) instead of the 
        original PHASIT.31 file. Call the output file PHICOMBINED.FIN. One 
        could then proceed with solvent flattening cycles as in 2), but 
        usually this is not necessary and the phases in file PHICOMBINED.FIN 
        are used for the final map.
        
        These 3 options (partial structure + MIR etc with no flattening, 
        [partial structure + MIR etc] followed by flattening, and partial 
        structure + flattened MIR) seem to be the most useful, and can all be 
        carried out without tampering with the default control files. One only 
        has to create additional small control files for single runs of PHASIT 
        (SF mode) and BNDRY (option 3). Other options making use of partial
        structure information are described in the section on "REDUCED BIAS
        NATIVE, COMBINED AND DIFFERENCE FOURIERS." 
        
        Note that in the case where a molecular replacement solution was 
        obtained, then one has no MIR like phase probability distributions to 
        combine solvent flattened (e.g. map inverted) phases with. In that
        case (or if one has MIR phase information, and simply wants to abandon
        it), one can use PHASIT in SF calculation mode but with IHLCF=1 and
        ISIGA=0 or 1. That will create Hendrickson-Lattman coefficients for
        the partial structure, and these distributions can then be used as the
        "anchored" phases with which those newly obtained from solvent
        flattening will be combined. This may also be useful if one wants to
        tie noncrystallographic symmetry averaged phases to model phases.
        
        Another option is available involving phase extension. Suppose one has 
        MIR, SIR etc data to only 4.0 angstrom resolution, native data to 3.0 
        angstrom resolution, and a partial structure available.  One could 
        first compute the partial structure phases out to 3.0 angstrom 
        resolution (PHASIT SF mode, IHLCF=0, ISIGA=0) and MIR etc phases out
        to 4.0 angstrom resolution (PHASIT, phasing mode). Then run MISSNG to
        get the file "extrfl.d" containing reflections between 4 and 3
        angstroms. The three output files could then be combined in a single
        run of BNDRY (option 3, ICMB=0 or 1, with phase extension requested)
        to get a hybrid file. The output file would then contain MIR combined
        with partial structure phases to 4 angstroms, and partial structure
        phases between 4 and 3 angstroms. This file could then be used for
        direct calculation of a map or to initiate solvent flattening cycles
        as described earlier. Yet another variation would be to do the same
        thing but requesting IHLCF=1 and ISIGA=0 or 1. In that case during
        solvent flattening iterations the map inverted phases would also be
        tethered to the partial structure phases for the high resolution
        data. 
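
        The resolution bookkeeping in this example (which reflections
        fall between 4 and 3 angstroms) can be sketched as follows. The
        sketch assumes an orthogonal cell so that the d spacing formula
        stays short, and it does not reproduce what MISSNG actually does;
        the function names and the example cell are hypothetical.

```python
import math


def d_spacing(h, k, l, cell):
    """d spacing in angstroms for an orthogonal cell (a, b, c)."""
    a, b, c = cell
    inv_d2 = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inv_d2)


def reflections_in_shell(hkls, cell, d_max, d_min):
    """Return the reflections with d_min <= d < d_max."""
    selected = []
    for hkl in hkls:
        if hkl == (0, 0, 0):
            continue  # F(000) has no finite d spacing
        if d_min <= d_spacing(hkl[0], hkl[1], hkl[2], cell) < d_max:
            selected.append(hkl)
    return selected


if __name__ == "__main__":
    cell = (30.0, 40.0, 50.0)
    hkls = [(5, 0, 0), (8, 0, 0), (0, 12, 0), (0, 0, 20)]
    # Reflections beyond the 4.0 angstrom MIR limit but within the
    # 3.0 angstrom native limit.
    print(reflections_in_shell(hkls, cell, d_max=4.0, d_min=3.0))
```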

            Clearly many other options or sequences are available. The key to
        successful use of the programs in this fashion is understanding that 
        the phase combination program (BNDRY, option 3) merges at least two 
        files, one of which must contain phase information cast in 
        Hendrickson-Lattman coefficients, and the other of which contains only
        calculated phases and amplitudes (possibly to higher resolution). If 
        phase extension is also desired, a third file with the additional
        reflections may contain only indices and amplitudes, but it may also 
        contain phase probability distribution coefficients for some or all
        of the reflections. The output file always contains the COMBINED
        information cast in the HL coefficient form. It is thus suitable for
        use either for direct map calculations or as an input file for the
        BNDRY, MISSNG, MRGDF, MRGBDF, RD31 etc programs.
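
        The reason the Hendrickson-Lattman form is convenient for phase
        combination is that independent phase probability distributions
        multiply, so their coefficients simply add. The Python sketch
        below illustrates that property and recovers a centroid phase and
        figure of merit by numerical integration. It is not the BNDRY
        algorithm: the Sim or Sigma_A weighting that turns a calculated
        phase into coefficients is not shown, and all names are
        hypothetical.

```python
import math


def combine_hl(*coefficient_sets):
    """Add (A, B, C, D) coefficient sets from independent sources."""
    return tuple(sum(values) for values in zip(*coefficient_sets))


def best_phase_and_fom(hl, step_deg=1.0):
    """Centroid phase (degrees) and figure of merit for one reflection.

    P(phi) is proportional to
    exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)).
    """
    a, b, c, d = hl
    n_steps = int(round(360.0 / step_deg))
    samples = []
    for i in range(n_steps):
        phi = math.radians(i * step_deg)
        exponent = (a * math.cos(phi) + b * math.sin(phi)
                    + c * math.cos(2.0 * phi) + d * math.sin(2.0 * phi))
        samples.append((phi, exponent))
    # Subtract the largest exponent to avoid overflow in exp().
    e_max = max(exponent for _, exponent in samples)
    total = sum_cos = sum_sin = 0.0
    for phi, exponent in samples:
        p = math.exp(exponent - e_max)
        total += p
        sum_cos += p * math.cos(phi)
        sum_sin += p * math.sin(phi)
    x = sum_cos / total
    y = sum_sin / total
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)


if __name__ == "__main__":
    prior = (2.0, 0.0, 0.0, 0.0)      # unimodal, centered at 0 degrees
    partial = (0.0, 2.0, 0.0, 0.0)    # unimodal, centered at 90 degrees
    combined = combine_hl(prior, partial)
    print(combined, best_phase_and_fom(combined))
```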

        9.00  REDUCED BIAS NATIVE, COMBINED AND DIFFERENCE FOURIER MAPS

           When making use of partial structure information, either obtained
        from a model via structure factor calculations or from inversion of
        a density map, the resulting phases are always biased towards the
        partial structure. Read (Acta Cryst. A42, 140-149, 1986) has shown
        how this bias can be reduced significantly when using the partial
        structure phases directly for map calculations, and how to properly
        weight partial structure derived phase information when combining
        it with other (e.g. MIR, SIR etc.) phase information. Both procedures
        require first determining "Sigma_A", which is related to the 
        contributions from "missing" or "incorrect" parts of the structure
        and varies with resolution, to compute the proper weight (and thus
        FOM) for the partial structure phases. The procedures described by
        Read have been implemented as options in both PHASIT and BNDRY,
        and can be invoked as follows:


        (1) COMBINED PHASE MAPS 

        For simple phase combination the Sigma_A procedure can be invoked in
        the BNDRY program (option 3), by setting ICMB=1. Sigma_A weighting is
        then used instead of Bricogne's modification of Sim's weighting
        scheme during phase combination. This can be used either with model
        phases or map inverted phases, and thus can be done automatically
        in the "doall" or "extndavg" procedures. 

       
        (2) REDUCED BIAS DIFFERENCE MAPS

        These maps are similar to conventional Fo-Fc maps phased with the
        partial structure, but the coefficients are

        (FOM * FOBS - D * FCALC) * exp(i * phicalc)

        where D is derived from the Sigma_A values and phicalc is the phase
        from the partial structure. The appropriate map can be produced by
        running PHASIT, SF mode with ISIGA=2, and then requesting a Fo-Fc map
        in FSFOUR. Note however, that if chosen, FOM*FOBS and D*FCALC will
        occupy the Fo and Fc slots in the output file, thus other map types
        requiring pure Fo and/or pure Fc values will be inaccessible.   
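
        For a single reflection, the reduced bias difference coefficient
        (FOM*FOBS - D*FCALC) multiplied by exp(i*phicalc) can be evaluated
        as in the following Python sketch. It is an illustration only and
        not PHASES code: the function name and the numerical values are
        invented, and FOM and D are taken as given rather than derived
        from Sigma_A.

```python
import cmath
import math

def reduced_bias_difference_coeff(fobs, fcalc, fom, d, phicalc_deg):
    """Return the complex coefficient (FOM*FOBS - D*FCALC) * exp(i*phicalc)."""
    amplitude = fom * fobs - d * fcalc
    return amplitude * cmath.exp(1j * math.radians(phicalc_deg))

# Invented example values for one reflection.
coeff = reduced_bias_difference_coeff(
    fobs=120.0, fcalc=100.0, fom=0.8, d=0.9, phicalc_deg=90.0)
```

        A negative value of FOM*FOBS - D*FCALC is equivalent to a positive
        amplitude with the phase shifted by 180 degrees, which the complex
        product handles automatically.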


        (3) REDUCED BIAS NATIVE MAPS
       
        These maps are similar to conventional 2Fo-Fc maps phased with the
        partial structure, but the coefficients are

        (2 * FOM * FOBS - D * FCALC) * exp(i * phicalc)
        
        for acentric reflections where D is derived from Sigma_A and

            FOM * FOBS *  exp(i * phicalc)

        for centric reflections.             

        The appropriate map can be produced by running PHASIT, SF mode setting
        ISIGA=3, and then requesting a 2Fo-Fc map in FSFOUR. Note however,
        that if chosen, FOM*FOBS and D*FCALC (acentric) or FOM*FOBS/2 and 0
        (centric) will occupy the Fo and Fc slots in the output file, thus
        other map types requiring pure Fo and/or pure Fc values will be
        inaccessible.
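
        The slot values quoted above can be checked against the stated
        map coefficients with a short Python sketch. It is an illustration
        only and not PHASES code; the function names and numerical values
        are invented. Forming 2*Fo - Fc from the stored slots reproduces
        2*FOM*FOBS - D*FCALC for acentric reflections and FOM*FOBS for
        centric reflections.

```python
def reduced_bias_native_slots(fobs, fcalc, fom, d, centric):
    """Return the (Fo slot, Fc slot) values described for ISIGA=3."""
    if centric:
        return fom * fobs / 2.0, 0.0
    return fom * fobs, d * fcalc

def two_fo_minus_fc(fo_slot, fc_slot):
    """Amplitude a conventional 2Fo-Fc synthesis forms from the two slots."""
    return 2.0 * fo_slot - fc_slot

# Invented example values for one acentric and one centric reflection.
acentric = two_fo_minus_fc(*reduced_bias_native_slots(120.0, 100.0, 0.8, 0.9, False))
centric = two_fo_minus_fc(*reduced_bias_native_slots(120.0, 100.0, 0.8, 0.9, True))
```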
 

        (4) SIGMA_A WEIGHTED PROBABILITY DISTRIBUTION COEFFICIENTS

        Phase probability distribution coefficients and corresponding FOM
        based on Sigma_A weighting for structure factors computed entirely
        from an atomic model can be obtained by running PHASIT, SF mode with
        ISIGA=1. The file may then be used as the "anchor" phases to which map
        inverted phases are tethered. It is particularly useful for merging
        information during phase extension when high resolution phases come
        from a partial structure and lower resolution phases come from MIR
        type calculations. 

        10.00  INCORPORATION OF NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING 
        
           Whenever there are multiple copies of identical molecules present 
        in the crystallographic asymmetric unit and/or the same molecule is
        present in multiple crystal forms, one has the opportunity to 
        improve the phases by averaging the corresponding electron density in 
        the related molecules, replacing the density for each molecule with 
        the average, and inverting the "averaged" density map(s) to obtain new 
        structure factor amplitudes and phases. These new amplitudes and 
        phases can then be accepted immediately, but are more frequently 
        combined with the original MIR, SIR etc phase information in a 
        probabilistic manner, just like those obtained from solvent flattening 
        or from a partial structure. Indeed, solvent flattening and imposition 
        of non-negativity of electron density can be applied in addition to 
        the noncrystallographic symmetry averaging, leading to powerful 
        phasing algorithms. The resulting phases (either alone or combined 
        with MIR, SIR etc), are typically combined with the observed 
        amplitudes, and the process is cycled until convergence is obtained. 
        The power of the method increases as the number of molecules averaged 
        increases, but averaging over even a dimer is still extremely useful 
        when combined with MIR, SIR data, etc. Programs required to carry out 
        the steps needed for successful noncrystallographic symmetry averaging 
        are currently included in the PHASES package, and sample control 
        scripts are given (called "extndavg.sh" and "extndavg_mc.sh", for the
        single and multiple crystal averaging cases, respectively) which
        replace the "extnd.sh" script in a normal solvent flattening run. The
        scripts insert the averaging related steps into the normal solvent
        flattening process, thus the complete multi-cycle task can be carried
        out by executing them. Prior to running the scripts however, there are
        several related tasks to be performed, which include determination of
        the location, direction and nature (rotational order) of the
        noncrystallographic symmetry operator(s), and construction of one or
        more "averaging envelopes" or "averaging masks" delineating the
        volume(s) occupied by the molecules to be averaged. Initial estimates
        for the noncrystallographic symmetry operator(s) are usually obtained
        from rotation/translation functions which are not included in the
        PHASES package as they are readily available elsewhere, however if the
        operators are specified by 3x3 rotation matrices and 3 element 
        translation vectors (as for example, in the program "O"), then the
        PHASES program O_TO_SP can be used to convert them to PHASES format.
        Everything else, including refinement of the operator(s) and
        construction of the envelope mask(s) is part of PHASES. All map
        interpolation programs (MAPAVG, SKEW, MAPORTH etc) utilize powerful 64
        point spline algorithms, thus the map grids for averaging need not be
        any finer than for normal calculations. Many of the noncrystallographic
        symmetry averaging routines in PHASES were derived from programs
        originally written by W. Hendrickson & J. Smith. In most instances they
        have been heavily modified for use in PHASES, mostly to generalize the
        algorithms, to optimize the code, and to provide compatibility with the
        rest of the package. The general averaging process as implemented in 
        PHASES is described below.  
        
           For both simplicity and reasons related to computational 
        efficiency, all of the averaging related calculations are best 
        performed on electron density "submaps," which cover only the map 
        region encompassing an asymmetric unit containing the molecules to be 
        averaged. This "asymmetric unit" need not be complete in the 
        crystallographic sense (that is, it may differ from a true asymmetric 
        unit in volume and have irregular borders), but it must encompass at 
        least the molecules to be averaged, although solvent regions may be 
        omitted. It may also span cell edges, if necessary. Since the standard 
        FSFOUR maps always cover a complete unit cell, the "submaps" (which 
        have a different format) can be created from them via the programs 
        MAPVIEW or EXTRMAP. Indeed, MAPVIEW will almost certainly be needed to 
        determine which region to extract in the first place. All envelope 
        creation, averaging, operator refinement, skewing etc will be done 
        using the submaps. After appropriate regions in the submaps are 
        averaged, program BLDCEL is used to regenerate complete unit cell maps 
        (FSFOUR format) conforming to the space group symmetry, which can then 
        be inverted by MAPINV. Thus MAPVIEW (or EXTRMAP) and BLDCEL serve as 
        the gateways between normal FSFOUR maps and submaps. Note that MAPVIEW 
        can display either type of map (and mask). Descriptions of the inputs 
        required for each of the programs mentioned can be found in the 
        appropriate program write-ups. 
        
             The keys to successful averaging are to obtain good "envelope" 
        masks which accurately identify the volume(s) in space in which the 
        noncrystallographic symmetry operator(s) is/are valid, and to obtain 
        accurate values for the operators themselves. These tasks always will 
        take one of two routes, depending on the nature of the 
        noncrystallographic symmetry. Within a given crystal, if the NC
        symmetry is purely rotational with the order of rotation being N-FOLD,
        where N is a small integer, then the task is simplified since one
        needs only a single "envelope mask" which encompasses all N of the
        molecules related by NC symmetry. That is to say the averaging can be
        done without having to specify where one molecule stops and the next
        starts. One only needs to know the bounds of the TOTAL AGGREGATION of
        molecules. The procedure (A) below is then adequate to carry out the
        necessary computations. If an arbitrary rotation angle and/or a
        translational (e.g. screw-like) shift is involved, the task is more
        complicated since one then must create a SEPARATE ENVELOPE MASK
        identifying each molecule. The procedure (B) below is then adequate
        to carry out the computations. For multiple crystal averaging the same
        steps and considerations are required, but multiple submaps (one for
        each crystal form, along with corresponding envelope masks) are used.
        Details related to multiple crystal averaging are described later.
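
        The choice between the two routes can be expressed as a small
        Python sketch. It is an illustration only and not part of PHASES:
        the function name, the cutoff on what counts as a "small integer"
        N, and the tolerances are all assumptions. The operator is
        described here simply by its rotation angle and by any shift along
        the rotation axis.

```python
def choose_mask_procedure(kappa_deg, screw_shift, max_order=12, tol=0.5):
    """Return ("A", N) for a pure N-fold rotation, otherwise ("B", None).

    kappa_deg   : rotation angle of the NC operator in degrees
    screw_shift : translation along the rotation axis (any length unit)
    max_order   : largest N treated as a small integer (assumed cutoff)
    tol         : allowed deviation of the angle in degrees (assumed)
    """
    if abs(screw_shift) > 1e-3:
        return "B", None            # screw-like shift: separate masks needed
    kappa = abs(kappa_deg) % 360.0
    if min(kappa, 360.0 - kappa) <= tol:
        return "B", None            # no rotation at all: not an N-fold axis
    for n in range(2, max_order + 1):
        # A proper N-fold operator returns to the start after N applications.
        residual = (n * kappa) % 360.0
        if min(residual, 360.0 - residual) <= n * tol:
            return "A", n
    return "B", None
```

        For example, a rotation of 180 degrees with no shift along the
        axis gives ("A", 2), so a single envelope around the whole dimer
        is enough, while a rotation of 100 degrees gives ("B", None).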
      
         (A)  MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH PURE
              ROTATIONAL SYMMETRY OF ORDER N
        
        1) start with best possible map (usually solvent flattened MIR map, as
           obtained via the "doall" procedure).
        
        2) compute a map via "FSFOUR" (default orientation, i.e. NORN=0)
        
        3) run EXTRMAP (or MAPVIEW) to extract a submap from the FSFOUR map 
           which encompasses at least the dimer, trimer etc, related by KNOWN 
           (at least approximately) noncrystallographic symmetry.
        
        4) if the unit cell is not orthogonal, run MAPORTH to convert the 
           submap to an orthogonal grid (but save the input submap as well)
        
        5) run LSQROT (using orthogonal map), to refine the noncrystallographic
           symmetry axis location and direction. Start with low resolution 
           (~6A map, 2A grid) refining only within a sphere of suitable radius
           (usually 12-25A), centered about a point on the rotation axis which
           is near the dimer, trimer etc center. Then gradually extend the map
           resolution to about 3A (1A grid) and repeat the refinement. In a 4A
           map, the correlation coefficient after refinement should be about
           0.4 or higher. (Ignore the R factor; it is always very high.)
        
        6) run SKEW (using the submap from 3), to generate a "skewed" map 
           with new "b" axis aligned with noncrystallographic symmetry axis.
         
        7) run MAPVIEW (using "skewed" map) to create a mask (via "trace mask" 
           option) which encompasses only the region to be averaged. This 
           should include the entire dimer, trimer etc. In MAPVIEW, use only a 
           single mask (Mask No. 1). When exiting, save the "skewed" mask file.
        
        8) run TRNMSK (using both the original submap from 3 and the "skewed" 
           mask from 7) to convert the skewed mask to one corresponding to the 
           default (non-skewed) orientation (its grid will have one-to-one 
           correspondence with the original submap). Save this standard mask.
        
        9) run MAPVIEW (using the original submap from 3), and "recover" the
           standard mask file from 8. Then use "Make Asu" option, and possibly
           edit masks until only non redundant density associated with the
           desired dimer, trimer etc is within the mask. When exiting, save the
           ENTIRE mask (no subset). It will be used in all future averaging
           cycles. 
        
           Optionally, run LSQROT again this time using the default mask 
        output from 9 as basis for refinement (you may have to orthogonalize 
        it), instead of a sphere. If you do this, expect a drop in the
        correlation coefficient. If the orientation changes significantly,
        repeat steps 6-9. 
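
        The correlation coefficient quoted in step 5 measures how well
        the density at a set of points agrees with the density at the
        symmetry related points. The exact expression used by LSQROT is
        not given here, so the following Python sketch assumes the
        standard linear (Pearson) form. It is an illustration only; the
        function name is invented.

```python
import math

def correlation_coefficient(rho_a, rho_b):
    """Pearson correlation between two equal-length lists of density values.

    rho_a holds density at the sampled points and rho_b the density at the
    corresponding symmetry related points (both assumed already sampled).
    """
    n = len(rho_a)
    if n != len(rho_b) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_a = sum(rho_a) / n
    mean_b = sum(rho_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    return cov / math.sqrt(var_a * var_b)
```

        Identical density gives 1.0 and unrelated density gives values
        near zero, which is why a value of about 0.4 or higher in a noisy
        4A map is already a useful indication of a correct operator.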
        
        Proceed to AVERAGING STEPS
        
        
        (B) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH ARBITRARY
            ROTATIONAL ANGLE AND/OR TRANSLATION
        
        Steps 1-4 same as in (A)
        
        5) Run LSQROTGEN (using orthogonal map), to refine the 
           noncrystallographic symmetry operators relating molecule 1 
           (arbitrarily selected) to each other molecule. Start with low 
           resolution (~6A map, 2A grid) refining only within spheres of
           suitable radius (typically 15A) centered on points near the centers
           of molecule 1 and the target molecule, respectively. Then gradually
           extend the map resolution to about 3A (1A grid) and repeat the
           refinement. In a 4A map, the correlation coefficient after
           refinement should be about 0.4 or higher. (Ignore the R factor; it
           is always very high.) For N related molecules, there will be N-1 
           operators to refine.
        
        6) Run MAPVIEW (using the submap from 3) to create SEPARATE envelope
           masks for EACH MOLECULE to be averaged. Do this by making use of
           the "set mask no." and "trace mask" options. When exiting, save 
           the mask file, as it now contains separate envelope information
           for each molecule. Also, remember which mask No. you assigned to
           which molecule.
        
        7) Run MAPVIEW (using original submap from 3), and "recover" the
           standard mask file from 6. Then use "Make Asu" option, and possibly
           edit masks until only non redundant density associated with the
           desired dimer, trimer etc is within molecular envelope masks. When
           exiting, save the ENTIRE mask (no subset). It will be used in all
           future averaging cycles. 
        
        Optionally, run LSQROTGEN again this time using the default mask 
        output from 7 as basis for refinement (you may have to orthogonalize 
        it), instead of spheres. If you do this, expect a drop in the
        correlation coefficient. If the operator(s) change significantly,
        repeat steps 6-7, otherwise continue. 
        
          
                                    AVERAGING STEPS
        
           Prior to brute force cycling, run MAPAVG (using the original submap 
        from 3, and the corresponding mask from 9A or 7B) to generate an 
        "averaged" map. If the translation is small (or absent) use "SKEW" to 
        convert it so you can look down the NC symmetry axis. You can then use 
        "MAPVIEW" to view the map, and verify that averaging has indeed been 
        done successfully, that you are in fact looking down the NC symmetry
        direction, and the axis goes through the origin.  If so, proceed to
        averaging cycles. If not, something went wrong earlier. Check program
        inputs, outputs, polar axis conventions, etc.
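
        What an "averaged" map should look like can be seen from a toy
        Python sketch for a twofold axis passing through the centre of a
        small 2D grid. It is an illustration only and not the MAPAVG
        algorithm: here every grid point maps exactly onto another grid
        point, whereas MAPAVG interpolates between grid points. The
        function name and the example values are invented.

```python
def average_twofold(rho, mask):
    """Average a 2D map over a twofold axis through the grid centre.

    rho  : list of rows of density values
    mask : same shape, 1 inside the averaging envelope and 0 outside
    Points outside the envelope, or whose partner lies outside it, are
    copied unchanged.
    """
    rows, cols = len(rho), len(rho[0])
    averaged = [row[:] for row in rho]
    for i in range(rows):
        for j in range(cols):
            pi, pj = rows - 1 - i, cols - 1 - j   # twofold related grid point
            if mask[i][j] and mask[pi][pj]:
                averaged[i][j] = 0.5 * (rho[i][j] + rho[pi][pj])
    return averaged

rho = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
mask = [[0, 1, 1],
        [1, 1, 1],
        [1, 1, 0]]
averaged = average_twofold(rho, mask)
```

        Inside the envelope the result is identical at each pair of
        twofold related points, which is the property to look for when
        inspecting the averaged map in MAPVIEW.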
        
           At this point refined values of the noncrystallographic symmetry 
        operator(s) are available, along with envelope masks isolating the 
        regions to be averaged within the submap. 
        
        
        1) create the file "extrmap.d", which will specify what submap region 
           to extract from the FSFOUR map. It MUST correspond EXACTLY to the 
           same region used when creating the envelope masks. (You can read
           the envelope mask header with RDHEAD if you forgot). Rename the
           final mask file "asu.msk". See EXTRMAP write-up for information.
        
        2) create the file "mapavg.d", to specify the transformation 
           operator(s) for averaging, and the envelope mask file. See MAPAVG 
           write-up for information.
        
        3) create the file "bldcel.d", to specify the file names and options.
           BLDCEL will take the "averaged" asymmetric unit submap from mapavg, 
           and build a complete cell FSFOUR style map from it. See BLDCEL 
           write-up.

        4) Create the file "sloext.d" specifying phase extension information
           and cycles to be performed (see SLOEXT write-up). If no phase
           extension is to be done, make the upper and lower resolution 
           cutoffs identical and specify 16 cycles. Otherwise, specify the
           resolution cutoffs and cycles per resolution increment, and run
           MISSNG to create the "extrfl.d" file.   
       
        5) Create the file "extnd.d" specifying file names, extension options
           and I/O type.
 
        6) Verify that the phase files (phasit.31 and phi16cy.31), solvent
           mask (mask3.14), and data files (bnd2.d, fft.d, minv2.d) from a
           previous "doall" run are available. 
        
        7) Run the procedure "extndavg.sh". It will carry out the cycles of
           NC symmetry averaging/solvent flattening/phase combination/phase 
           extension steps to combine "averaged" phases with the original MIR 
           phases. 
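
        Before starting a long run it can be helpful to confirm that every
        file named in steps 1-6 is present. The following Python sketch
        is an illustration only and not part of PHASES; the function name
        is invented, and the file names are those quoted in the steps
        above.

```python
import os

# File names taken from steps 1-6; "asu.msk" is the renamed final mask.
CONTROL_FILES = ["extrmap.d", "mapavg.d", "bldcel.d", "sloext.d", "extnd.d"]
DOALL_FILES = ["phasit.31", "phi16cy.31", "mask3.14",
               "bnd2.d", "fft.d", "minv2.d"]
MASK_FILES = ["asu.msk"]

def missing_inputs(directory=".", phase_extension=True):
    """Return the names of expected input files not found in directory."""
    expected = CONTROL_FILES + DOALL_FILES + MASK_FILES
    if phase_extension:
        expected = expected + ["extrfl.d"]   # created by MISSNG in step 4
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]
```

        Calling missing_inputs() in the working directory returns an
        empty list when everything is in place; any names returned should
        be created before "extndavg.sh" is run.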
     

             ***** CREATING AVERAGING ENVELOPE MASKS FROM A MODEL *****

           If coordinates from a tentative model are available, they can also 
        be used to create the averaging envelope masks. The procedure is
        essentially that described in the CREATING/EDITING SOLVENT MASKS
        section, with a couple of minor exceptions. First, after the initial
        mask is constructed in MDLMSK and edited in MAPVIEW as described,
        one is finished since unlike solvent masks, there is no need to make
        "full cell" averaging masks with BLDCEL. Second, if the NC symmetry
        operation involves arbitrary rotations and/or post rotation
        translations, then MDLMSK must be run multiple times; once for each
        NC symmetry related molecule. In each run a separate file should be
        written and a different mask number must be used, but each file must
        cover the same range (which is large enough to cover ALL copies). The
        particular mask numbers used must be remembered as they will be needed
        later when specifying which transformation operators are to be used in
        MAPAVG to relate the molecules. The individual mask files should then
        be edited and saved as described in the CREATING/EDITING SOLVENT MASKS
        section. Once edited, the individual mask files must be combined into
        a single mask file with program MRGMSK. The output file from MRGMSK
        then can be used for averaging (i.e. as "asu.msk" in the "extndavg.sh"
        procedure).
          The output masks from MDLMSK or MRGMSK can be used for averaging 
        as long as a corresponding map region is provided. Thus the input used
        to create the submaps ("extrmap.d" in the "extndavg.sh" procedure)
        must specify the same range, and the map must have the same periods.
        If the "extndavg.sh" procedure is to be used, then the solvent mask
        must also be created from a map having the same periods. The output
        masks can be examined/edited in MAPVIEW, again as long as the
        corresponding map region (either explicitly selected from a FSFOUR map
        in MAPVIEW, or previously extracted from a FSFOUR map by EXTRMAP and 
        input to MAPVIEW as a non-FSFOUR map) is provided. Once the output
        masks from MDLMSK or MRGMSK are obtained, they can be used just like
        any other averaging mask file, i.e. used for operator refinement in
        LSQROT or LSQROTGEN, used in MAPAVG, manipulated in SKEW, BLDCEL etc.
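
        The idea of combining separately numbered masks that cover the
        same range can be pictured with a toy Python sketch. It is a
        conceptual illustration only and does not describe how MRGMSK
        works or the PHASES mask file format: masks are represented as
        small 2D lists, the function name is invented, and the decision
        to reject overlapping masks is an assumption made for the example.

```python
def merge_masks(masks):
    """Merge per-molecule masks (equal-shaped 2D lists, 0 = outside) into one.

    Each input mask is expected to use its own nonzero mask number.
    Raises ValueError if shapes differ or two masks claim the same point.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    merged = [[0] * cols for _ in range(rows)]
    for mask in masks:
        if len(mask) != rows or any(len(row) != cols for row in mask):
            raise ValueError("all masks must cover the same range")
        for i in range(rows):
            for j in range(cols):
                if mask[i][j]:
                    if merged[i][j]:
                        raise ValueError(f"overlap at grid point ({i}, {j})")
                    merged[i][j] = mask[i][j]
    return merged

molecule_1 = [[1, 0], [0, 0]]   # mask number 1
molecule_2 = [[0, 2], [0, 2]]   # mask number 2
merged = merge_masks([molecule_1, molecule_2])
```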

        10.01               AVERAGING WITH MULTIPLE CRYSTALS

           All of the submap extraction and mask preparation steps used in
        single crystal averaging as described earlier must be carried out
        independently for each crystal, thus multiple submap and corresponding
        mask files must be created. If a given crystal also contains multiple
        NC symmetry related copies WITHIN IT, then the operators relating
        molecule 1 to each of them must also be refined exactly as described
        in the single crystal case. This will allow both intra and inter
        crystal averaging to be carried out simultaneously. In addition, the
        operators relating MOLECULE 1 in CRYSTAL 1 to MOLECULE 1 in EACH
        OTHER CRYSTAL must also be refined. This can be done in program
        LSQROTGEN by specifying the appropriate input. Once all of the
        required operators and envelope masks are obtained, averaging can
        proceed by specifying the appropriate input to program MAPAVG, and
        by preparing the required input files for EXTRMAP and BLDCEL for
        each crystal. A script file "extndavg_mc.sh" is supplied for multiple
        crystal averaging in the case where there are two crystals. It
        can easily be modified to include more crystals (up to 6), and
        comments are embedded in it explaining where modifications are to
        be made. The main difference in the procedure for multiple crystal
        averaging is that all of the normal input files must be duplicated for
        each crystal, and the standard file names for maps, masks, data files
        etc must be modified to uniquely identify the appropriate crystal.
        During each cycle of multiple crystal averaging the full cell maps are
        created and the submaps are extracted independently for each crystal.
        Then an averaged version (averaged over ALL copies) is created for
        each submap. For each crystal, the averaged submap is then expanded to
        its full cell version, solvent flattened, Fourier inverted and the
        resulting phases and amplitudes combined probabilistically with the
        appropriate MIR or SIR phases. Thus a new improved map can be obtained
        for each crystal. With multiple crystal averaging however, there is
        currently no facility for slow phase extension, thus the file
        "sloext.d" is not needed and the number of cycles to be done is hard
        wired into the "extndavg_mc.sh" script. Phase extension however, still
        can be done. It's just that the appropriate cutoffs are supplied only
        in the "extnd.d" files for each crystal, and are constant for all
        iterations. One can of course, still extend the resolution gradually
        by repeating the process iteratively with different cutoffs and
        input files. 

        10.02             AVERAGING DIFFERENCE OR 2FO-FC MAPS

           One usually does the averaging/solvent flattening iterations on
        normal electron density maps, but in some cases it may be desirable
        to average FO-FC or 2FO-FC maps. Examples might be when trying to
        identify inhibitors, activators etc. soaked into known crystal
        structures, or when trying to build up density for missing sections
        of the macromolecule itself. This can be accomplished by proper
        preparation of the input files, and changing the map type
        specification in the fft.d input file. To do this one must assure
        that both FO and FC are available on the INITIAL file (called
        phi16cy.31 in the "extndavg.sh" or "extndavg_mc.sh" scripts) used to
        create the first map, and on the OUTPUT file ("newphi.ref" or
        "newphi_N.ref," etc.) produced in each iteration. For the output
        files this is done by specifying IOTYP=1 in the BNDRY option 3 input
        ("bnd3.d" or "bnd3_N.d" etc.). For the INITIAL file, one could
        obtain it from a single run of BNDRY, option 3, again specifying
        IOTYP=1, or from a run of PHASIT, structure factor mode specifying
        IHLCF=0 and ISIGA=0, depending on whether the phase information comes
        from MIR type calculations or from atomic coordinates for a model.
        Note however, that the "anchor" phase file (called "phasit.31" or
        "phasit_N.31" etc.) which the map inverted phases will be combined
        with MUST contain FM*FO and FO in the amplitude slots along with
        probability distribution coefficients, as would be the case if the
        file was created with a normal PHASIT run in protein phasing mode (or
        structure factor mode if the "long format" output was requested). As
        long as these files are properly prepared and the appropriate
        coefficients are selected in the fft input, iterations using map
        types involving FC's will be obtainable. One must be aware however,
        that the final output file (phiextndavg.31) will then also have
        FO and FC in the amplitude slots, and thus can only be used in
        FSFOUR for straight or difference type Fouriers, and NOT for 
        figure of merit weighted Fouriers. 
          If one is averaging with molecular replacement derived phase
        information and has already proceeded as described in the DENSITY
        MODIFICATION WITH MOLECULAR REPLACEMENT DERIVED PHASE INFORMATION
        section, i.e. the "doall" procedure has been run using the modified
        inputs, one need only to change the filename "phasit.31" in the
        "extnd.d" input file to "anchor.31". Then averaging can proceed with
        the "extndavg.sh" script using all of the preexisting files, once the
        averaging mask, extrmap.d, mapavg.d, bldcel.d, sloext.d (and possibly
        extrfl.d) files are created.  

        10.03             SAMPLE INPUT FILES FOR AVERAGING 

         ***** SAMPLE INPUT FILES FOR AVERAGING WITHIN ONE CRYSTAL *****
        
           Sample input files for the averaging steps follow, along with a 
        listing of the supplied template command files "extndavg.sh" and
        "extndavg.com". The command files can be used in place of the normal
        "extnd.sh" or "extnd.com" file in a solvent levelling run. They will
        perform additional cycles of averaging/solvent flattening/phase
        combination/extension starting with the phases in file "phi16cy.31",
        combining the "averaged" phases with MIR, SIR phase information in
        file "phasit.31", and extending phases to additional amplitudes on
        file "extrfl.d". They assume that all of the files needed for a 
        normal solvent flattening run (fft.d, bnd2.d, etc) are available, and
        that the third mask from a previous run (mask3.14) is still available
        for solvent flattening. If the template script file is to be used
        unchanged, then all filenames should be EXACTLY as in the examples
        (except for standard parameter file). Only the data relating to
        submap ranges, resolution limits, number of cycles and the NC
        operators should be changed. The final phases will be written to file
        "phiextndavg.31", and printed information to "extndavg.l". The
        procedure is run simply by entering "sh extndavg.sh" (UNIX) or
        "@EXTNDAVG.COM" (VMS). 
       

        -- Sample input file extrmap.d or extrmap.dat, to extract submap ----
        
         pdc.pam
         four.map
         asu.map 
         -.42 .45 -.45 .42 -.08 .56
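
         The range line in this sample can be sanity checked with a short
         Python sketch. It is an illustration only: it ASSUMES that the
         six numbers are minimum and maximum fractional limits along x, y
         and z, in that order, which should be confirmed in the EXTRMAP
         write-up. The function names are invented.

```python
def parse_submap_range(line):
    """Split a six-number range line into three (min, max) pairs.

    Assumes the order xmin xmax ymin ymax zmin zmax in fractional units.
    """
    values = [float(token) for token in line.split()]
    if len(values) != 6:
        raise ValueError("expected six numbers")
    pairs = [(values[0], values[1]), (values[2], values[3]),
             (values[4], values[5])]
    for lo, hi in pairs:
        if hi <= lo:
            raise ValueError("each maximum must exceed its minimum")
    return pairs

def summarize(pairs):
    """Return (extent, spans_cell_edge) for each axis, in fractional units."""
    return [(hi - lo, lo < 0.0 or hi > 1.0) for lo, hi in pairs]

pairs = parse_submap_range("-.42 .45 -.45 .42 -.08 .56")
summary = summarize(pairs)
```

         Under that assumption the sample region spans a cell edge along
         all three axes, which is permitted for submaps.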
        
          
        
         --- Sample input file mapavg.d, for averaging over pure twofold -----
        
         pdc.pam
         1
         asu.map
         asu.msk
         asu.avg
         2 1 1
         -102.16 83.81 180.0 1.082 -.746 .316 0.0  
        
        
        
         --- Sample input file bldcel.d, builds complete cell from averaged 
                                                  submap ---
         pdc.pam
         four.map
         avgcell.map
         asu.avg
         asu.msk
         0

       
         --- Sample input file extnd.d, specifying phase combination 
                                                  data and options ---

         pdc.pam
         3
         1 2.75 1. 0 0
         phasit.31
         minv.ref
         extrfl.d
         newphi.ref

         Note that if one does NOT want to include phase extension, then the
         first "1" should be changed to zero and the line containing
         "extrfl.d" should be omitted (see BNDRY write-up).


 
         --- Sample input file sloext.d, controlling no. of averaging cycles
                                         and phase extension information --- 

         pdc.pam
         3. 2.75 8
         extnd.d

         Note that if one does NOT want to include phase extension, then the
         value of the two resolution limits should be made equal and the 8
         changed to 16 to do a total of 16 averaging cycles (see SLOEXT
         write-up).  

                  ************* procedure extndavg.sh ***************
        
        # MODIFIED TO INCLUDE NONCRYSTALLO