2.02                      ******************
                                  * BNDRY WRITE-UP *
                                  ******************
        
           BNDRY is a program to facillitate solvent flattening, negative 
        density truncation and/or phase combination. If the starting phases 
        come from a single derivative or single wavelength anomalous 
        scattering data, then it can be used to carry out Wang's ISIR/ISAS 
        procedure. If the starting phases come from MIR or any other source, 
        it can be used for "phase refinement" via solvent flattening and/or 
        negative density truncation. The program can also be used for phase 
        extension to higher resolution or to missing reflections within the 
        input resolution, or for combining partial structure information with 
        MIR, SIR etc data. BNDRY can be run in one of four modes, with each 
        mode carrying out a particular function. The mode desired is selected 
        via an input parameter. The input required, output generated and 
        tasks performed for each mode are now described.
        
        INPUT CONTROL DATA (UNIT 5)
                                                                           
        RECORD 1     PAMFIL                           (free format)
                                  
                     PAMFIL = Name of file specifying cell parameters,
                              symmetry information etc.
        
        RECORD 2       IOPT                             (free format)
                         
                       IOPT = 0  Convert input Fourier coefficients to those
                                 appropriate for protein-solvent boundary
                                 determination.
            
                            = 1  Construct mask map identifying protein and
                                 solvent regions from "smeared" map input.
            
                            = 2  Modify electron density map via solvent
                                 leveling and negative density truncation.
            
                            = 3  Combine phase information from new source
                                 with input Hendrickson-Lattman representation 
                                 of phase probability distributions. 
            
           Each run of the program should invoke only one of the options. 
        Depending on the option called, the following data must also be given. 
            
                                ***** for IOPT=0 *****
        
        RECORD 3    RAD   (free format)
                       
                        RAD = sphere radius, for weighted averaging of density
                              in boundary determination (typically 2.5-3 times 
                              minimum d spacing)
                                              
        
        RECORD 4   INPREF (free format)
                                        
                     INPREF = name of input reflection/phase file. This file
                              should have been generated by creating a map with
                              the desired starting phases, setting all density
                              values (omitting the F000 term) < 0 to 0, and
                              inverting it.
        
        
        RECORD 5    OUTREF (free format)
        
                 OUTREF = name of output reflection/phase file. A map generated
                          with this data is "locally averaged", or "smeared",
                          and is appropriate for protein/solvent boundary
                          determination.
                                 
                                    **** FILES ****
        
        INPREF = (binary file) =  Input Fourier coefficients, from inversion of
                                  original electron density after negative 
                                  density truncation, as computed by program 
                                  MAPINV.  One reflection per record 
                                  containing H, K, L, FO, FC, PHI   where  
                                  H, K, L = INTEGERS, FC,PHI=REALS and PHI is 
                                  in degrees. FO not used. 
              
        OUTREF = (binary file) =  Output Fourier coefficients, same form as 
                                  INPREF but modified to correspond to 
                                  "smeared" map appropriate for boundary 
                                  determination.
            
                                     
        
                                *****  for IOPT=1 *****
        
        RECORD 3    MAPFL1     (free format)
        
                    MAPFL1 = Name of input "locally averaged" or "smeared" map
                             file. This map should have been produced by FSFOUR
                             from the coefficients in file OUTREF (option 0).
        
        
        RECORD 4    MSKFIL     (free format)
        
                    MSKFIL = Name of output mask file.
        
        
        RECORD 5    PSOL   (free format)
                    
                    PSOL = bulk solvent fraction (by volume, typically 0.3-0.6)
                           If the bulk solvent fraction is unknown, it can be
                           estimated reasonably well by the formula
                                
                            PSOL= 0.97 - (1.22 * Z * Mw)/V
        
                           where V is the unit cell volume (cubic angstroms),
                           Mw is the molecular weight of the protein (Daltons),
                           and Z is the number of protein molecules of weight
                           Mw in the unit cell. This assumes a standard value
                           for protein volume, and that 3% of the solvent is
                           tightly bound to protein.
        
                                    **** FILES ****
        
        MAPFL1 = (binary file) =  input electron density map ("smeared" map) 
                                  as computed by program FSFOUR from 
                                  coefficients generated in option 0.
            
        MSKFIL = (binary file) =  output mask map file. After one header 
                                  record, each subsequent record corresponds 
                                  to one input record of the map (as type 
                                  BYTE). Values of 2 indicate solvent region, 
                                  anything else indicates protein region. Note 
                                  that the mask can be displayed/edited by 
                                  program MAPVIEW.
            
            
        
                                ***** for IOPT=2 *****
        
        RECORD 3     MAPFL1  (free format)
        
                     MAPFL1 = Name of input electron density map file to be
                              modified.
                                                                    
        
        RECORD 4     MSKFIL  (free format)
                                        
                     MSKFIL = Name of input protein/solvent mask file.
        
                                                
        RECORD 5     MAPFL2  (free format)
        
                     MAPFL2 = Name of output (modified) electron density map
                              file.
                                       
        
        RECORD 6     SVAL    (free format)
                                        
                     SVAL = empiracal constant, used to approximate F000/V
                               (See program desciption that follows)  
            
                               Typical values:  .060  (3.0 Angstrom data)
                                                .086  (3.5 Angstrom data)
                                                .112  (4.0 Angstrom data)
                                                .250  (6.0 Angstrom data)
            
        
                                    **** FILES ****
         
        MAPFL1 = (binary file) =  input original (unmodified) electron density
                                  map as generated by program FSFOUR
            
        MSKFIL = (binary file) =  input mask map (from run with IOPT=1)
            
        MAPFL2 = (binary file) =  output modified electron density map (same 
                                  structure as input map)
                                                             
        
        
                                *****  for IOPT=3 *****
        
        RECORD 3    IEXT, DCUT, DAMP, ICMB, IOTYP  (free format)
        
                    IEXT = 0  For no phase extension.
                         = 1  To extend phases to additional amplitudes given 
                              on file EXTREF (up to DCUT resolution).
                         = 2  Same as 1, but phases AND AMPLITUDES also 
                              generated for any other missing reflections up 
                              to DCUT angstrom resolution.
            
                    DCUT =    d spacing cutoff, in angstroms, for extension.
                              (note that extension only possible to reflection
                              index range (or to symmetry related reflections)
                              specified in input to MAPINV)
        
                    DAMP =    Damping factor, (in range 0-1.) for weighting
                              contribution of input probability distribution to
                              the combined distribution. Usually 1.0, which 
                              applies the true weight. If < 1., will downweight
                              contribution of original distribution, i.e. 
                              increase relative weight of new (map inversion 
                              or partial structure) distribution to the 
                              combined distribution.

                    ICMB = 0  To use Bricogne's modification of Sim's weighting
                              procedure during phase combination.
                         = 1  To use Read's Sigma_a procedure for weighting
                              during phase combination. 

                   IOTYP = 0  For normal output, i.e. phase file to contain
                              FOM*FO and FO in the amplitude slots.
                         = 1  For modified output, i.e. phase file to contain
                              FO and FC in the amplitude slots. (This is
                              used only if one wants to do NC symmetry
                              averaging on difference or 2FO-FC maps, or
                              solvent flattening on 2FO-FC maps).
                               
                                   
            
        RECORD 4  INPPRB      (free format)
        
                  INPPRB = Name of input phase probability distribution file.
                               
                                                    
        RECORD 5  INPFC       (free format)
        
                  INPFC  = Name of input calculated structure factor file.
                           (Obtained from inversion of modified map or computed
                           from a partial structure). 
                                        
                                                  
        RECORD 5A EXTREF      (free format)   
        
                   ****** include this record ONLY IF IEXT > 0 *****
        
        EXTREF = Name of input file containing additional structure
                 factor amplitudes (and possibly phase probability distribution
                 coefficients) for phase extension.
        
        
        RECORD 6  OUTPRB   (free format)
        
                  OUTPRB = Name of output phase probability distribution file,
                           corresponding to combined (and possibly phase 
                           extended) data.
            
                                    **** FILES ****
        
        INPPRB = (binary file) = Input Fourier and probability distribution 
                                 coefficients, one reflection per record 
                                 containing H,K,L,FM*FO,FO,PHI,IPRAB,IPRCD,MK,  
                                 FOM where H, K, L,IPRAB,IPRCD,MK= integers, 
                                 FO,FM*FO,PHI,FOM=real and PHI is in degrees. 
                                 This file usually is prepared by program 
                                 PHASIT or IMPORT. But if it is a previous 
                                 output from BNDRY, then new phase information
                                 can be combined with phases AFTER density 
                                 modification instead of with the original
                                 phases. In the latter case IOTYP should have
                                 been set to 0 when the file was originally
                                 created. 
        
        
        INPFC = (binary file) = Input Fourier coefficients (from inversion of 
                                modified map or from partial structure), with
                                records containing H,K,L,FO,FC,PHI as output 
                                from MAPINV or PHASIT.  FO is not used.
        
                                                     
        EXTREF = (ASCII, free format) = Input reflection file. Each record 
                                        should contain H, K, L, FNAT, A_B and
                                        C_D where H, K, L = INTEGERS, FNAT=REAL
                                        and A_B, C_D are INTEGERS. If phase
                                        probability distribution coefficients
                                        are available they are packed two per
                                        word in A_B and C_D as in a normal 
                                        "phased" file. If they are not 
                                        available A_B and C_D are zero. This
                                        file is needed only if IEXT > 0.
                                        Phases will be extended to these
                                        reflections, subject to DCUT criteria.
                                        This file usually is prepared by
                                        program MISSNG. 
            
        OUTPRB = (binary file) = Output Fourier coefficients, after 
                                 combination of phase information, with same 
                                 structure as on INPPRB, except that the 
                                 Hendrickson-Lattman coefficients, phase and 
                                 figure of merit correspond to the new phase. 
                                 Note that the FO entry will actually contain 
                                 FC for reflections which were "amplitude" 
                                 extended. If IOTYP=1, then the amplitude
                                 slots in each record will contain FO and FC
                                 instead of FM*FO and FO, enablng different
                                 types of Fourier maps to be computed during
                                 density modification runs, if desired.
                                 WARNING! The IOTYP=1 option is fine if the
                                 output file is to be used ONLY for Fourier
                                 calculations, but it must NOT be used later
                                 in BNDRY or PHASIT as an input file, since
                                 they expect FO to be picked up from the 
                                 second amplitude slot!   
        
        
                           **** BNDRY PROGRAM STRUCTURE ****
        
           Depending on the value of IOPT, the following events take place.
        
        For IOPT = 0            
        
        The input sphere radius RAD is read in along with the current 
        reflection data and phase information.  A unique set of reflections
        is selected and the Fourier transform of the weighting function
        W = 1-r/RAD  with W = 0. for r > RAD is then computed for each
        reflection. The transform FS is as follows
        
        A = 4 * pi * RAD * sin(theta)/lambda
        
                         3                                         4
        FS = 4 * pi * RAD * [ 2 * ( 1 - COS(A) ) - A * SIN(A) ] / A
        
        The input structure factor amplitudes are multiplied by FS, and the 
        modified data is written out.  In the original Wang algorithm, the 
        weighting function was applied by convolution with the electron 
        density in direct space, after zeroing out negative densities.  
        Instead, we zero out the negative densities, invert the truncated map 
        to obtain structure factors, multiply the structure factors by the 
        transform of the weighting function, and compute a new modified map 
        from the resulting modified structure factors.  This is identical to 
        the original method except that it is much more efficient, 
        particularly for large maps and/or large RAD. It also does not require 
        the time-saving approximation of repeating the procedure twice with 
        half of the desired RAD as was sometimes needed with the direct space 
        algorithm. The resulting coefficients will generate a map which is 
        equivalent to zeroing out negative densities and then taking a 
        weighted average (with weight W) of all density within RAD angstroms 
        of each grid point in the input map.  
        
        For IOPT = 1
        
        The input fractional volume known (or thought) to represent solvent is
        read in and converted to the corresponding number of grid points in 
        the map. The modified electron density map is then read and a 
        histogram is generated keeping track of how many grid points have 
        associated electron density of a given value. Starting from the lowest 
        density value, an electron density threshold RHOCUT is incremented 
        until the total number of grid points having density less than RHOCUT 
        equals the number of grid points representing solvent. The modified 
        electron density map is then rewound and read again, but this time as 
        each record is read, a "MASK" value is defined for each grid point
        depending on whether the corresponding density exceeds RHOCUT.  Mask 
        values of 2 represent the solvent region, anything else the protein 
        region.  The mask map is then output to a file. Note that the mask can 
        be displayed/edited with program MAPVIEW.
        
        For IOPT = 2
        
        An empiracal constant SVAL is read in and is used to approximate F000/V
        (on scale of input map) from the relationship
        
                            < Rho solvent > + F000/V
                           --------------------------     =  SVAL
                              Rho(Max)    +   F000/V
        
        An electron density map and a mask map are read in.  Using the mask map
        to discriminate protein and solvent regions, the mean density in the 
        solvent region is computed, and the maximum density in the protein 
        region is determined. From these three quantities F000/V (on scale of 
        input map) is then estimated from the relationship above.  A new 
        modified map is then constructed such that the electron density at 
        each grid point is equal to
        
        < Rho solvent > + F000/V     if in solvent region.
        
        Maximum of (Rho input) + F000/V   or zero   if in protein region.
        
        Thus solvent leveling and negative density truncation are enforced. 
        This is identical to the procedure in Wang's ISIR programs.
        
        
        For IOPT = 3
        
        Indices, structure factor amplitudes, Hendrickson-Lattman coefficients
        and restricted phase indicators are read in and stored for the 
        original set of phased reflections. If phase extension is requested
        then a file containing additional reflections for which amplitudes
        (and possibly phase probability distribution coefficients) are
        available is also read in and the data stored. Then new computed
        structure factors (either from inversion of a modified map or from a
        model based calculation) are read in. Indices and phases from the
        computed structure factors are transformed (if needed) to the standard
        asymmetric unit, systematic absences are rejected, phase restrictions
        are determined and the DCUT criteria is imposed (for phase extended
        reflections). The new indices are compared with those stored, and if a
        match is found the new phases and amplitudes are paired up with the
        old. If no match is found and AMPLITUDE extension was requested, the
        unpaired reflections are written to a scratch file.
        
        The FC's are then scaled to the FO's by least squares based on the
        original phased reflections, and the paired data are then sorted on 
        resolution and divided into ranges according to sin(theta)/lambda. If
        Sim weighting or AMPLITUDE extension is requested the mean
        sin(theta)/lambda and mean ABS(FO**2 - FC**2) are computed for each
        range, and a three term polynomial in sin(theta)/lambda is fit to the
        mean ABS(FO**2-FC**2) by least squares. If Sigma_A weighting is
        requested the ranges are then used to determine normalized structure
        factor amplitudes from which the Sigma_A values are derived.
        
        For all paired reflections, new Hendrickson-Lattman coefficients for
        the map inversion/partial structure data are computed according to 
        
        A = W * COS (Phi calc)
        
        B = W * SIN (Phi calc)

        C = 0

        D = 0

     
        where for Sim weighting

        W = 2 * FO * FC / < | FO**2 - FC**2 | >
                                               sin(theta)/lambda
       
        and for Sigma_A weighting

        W = 2. * Sigma_A * EO * EC / (1. - Sigma_A**2) for acentric data 

        W =      Sigma_A * EO * EC / (1. -Sigma_A**2) for centric data  
       
        Test calculations (on a single model structure) indicated that the
        Sigma_A weighting gave slightly worse results when used to combine
        solvent flattened and MIR phases, but slightly better results when
        combining model based phases with MIR phases. Regardless of weighting,
        the new coefficients are combined with any input coefficients.
        Prior to phase combination, the original input coefficients are damped
        (if desired by the user), to increase the relative contribution of the
        newly introduced (map inversion or model based) information. Usually
        the damping factor is 1. (no damping), but in cases where phase
        combination involves a fairly small (percentage wise) partial
        structure fragment, damping the input coefficients can increase the
        impact of the partial structure contribution. The combined phase
        probability distributions are then evaluated and integrated to yield
        "best" (centroid) phases and the associated figure of merit. The
        combined phase information is then written to the output file in one
        of two user selected formats, and summary statistics are listed which
        include R factors, correlation coefficients and mean figures of merit
        for original, extended and all reflections. It is important to note
        that when phasing with only anomalous scattering data at one
        wavelength, phase extension will ALWAYS be required to insure that
        centric reflections get phased. Also, when phase extension is requested
        and input reflection data for extension (generated by program MISSNG) 
        includes probability distributions, then the distributions will
        always be combined with those from the map inversion/partial 
        structure. For extended reflections input without distribution
        coefficients, the output phases will correspond exactly to those from
        the map inversion/partial structure. 
 
        If AMPLITUDE extension was requested, the previously written scratch
        file is rewound, read and the FC's rescaled (using the previously
        determined scale factor). After insuring that only unique reflections
        are selected, these data are also passed to the output file, except
        that the figure of merit and HL coefficients are computed using
        
        W =  FC * FC / < | FO**2 - FC**2 | >
                                               sin(theta)/lambda
        
        This option is generally used to fill in missing (usually low order) 
        reflections within the resolution range of the measured data, NOT for 
        extension to higher resolution. It is usually done only after 
        convergence is obtained with all other data.
        
        Use of the BNDRY program for density modification, control files and
        important considerations are discussed later where sample inputs are 
        given.