2.01                       ******************* 
                                   * PHASIT WRITE-UP *
                                   *******************
                                           
            PHASIT can be run in one of two modes, protein phasing mode or 
        structure factor calculation mode. Some of the input data is common to 
        both modes, but other data is needed only for the particular mode 
        invoked. First, the data that is always needed is described.
             
        INPUT DATA (UNIT 5) 
            
        CARD 1 -  PAMFIL                           (free format)
        
                                       PAMFIL = name of parameter file
                                                 containing cell and symmetry
                                                 information.
            
            
        CARD 2 -  MODE, NXSCAT                     (free format)
            
                                       MODE =   0 for protein phase 
                                                   calculations.
                                             =   1 for structure factor 
                                                   calculations.
                   
                                       NXSCAT = number of additional atomic
                                                 types for which scattering
                                                 factors will be input. Note
                                                 that 20 types are already
                                                 stored in the program (see
                                                 below), thus this is usually
                                                 nonzero only for exotic 
                                                 atoms or wavelengths other
                                                 than CU K alpha.
        
           The following block of cards should be included only if NXSCAT > 0
        
            Up to 5 additional atomic types may be input.  For each additional
        atomic type, include the following 3 records
        
        REC 1     (A(J),J=1,4)                       (free format)
        
                                             A(J) = Coefficients for analytical
                                                    approximation to scattering
                                                    factors, as in Int. Tables,
                                                    Vol IV, pages 99-101.
        
        REC 2     (B(J),J=1,4), C                    (free format)
          
                                         B(J), C = Coefficients for analytical
                                                   approximation to scattering
                                                   factors, as in Int. Tables,
                                                   Vol IV, pages 99-101.
        
        REC 3     DEL f' , DEL f''                   (free format)
        
                                         DEL f'  = real part of anomalous
                                                   scattering correction term.
        
                                         DEL f'' = imaginary part of anomalous
                                                   scattering correction term.
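
           The three records above map onto the usual four-Gaussian form
        f0(s) = SUM A(J)*exp(-B(J)*s**2) + C, with s = sin(theta)/lambda.
        The following is a minimal Python sketch of that evaluation, not
        code taken from PHASIT. The function name is hypothetical, and the
        carbon coefficients are the commonly tabulated values, included only
        as an illustration.

```python
import math


def scattering_factor(a, b, c, s, del_fp=0.0, del_fpp=0.0):
    """Return the complex scattering factor f0(s) + f' + i*f''.

    a, b : the four A(J) and four B(J) coefficients (REC 1 and REC 2)
    c    : the constant term C (REC 2)
    s    : sin(theta)/lambda in reciprocal Angstroms
    del_fp, del_fpp : anomalous correction terms (REC 3)
    """
    f0 = sum(aj * math.exp(-bj * s * s) for aj, bj in zip(a, b)) + c
    return complex(f0 + del_fp, del_fpp)


# Illustrative coefficients for neutral carbon (assumed, not read from PHASIT).
A_C = (2.3100, 1.0200, 1.5886, 0.8650)
B_C = (20.8439, 10.2075, 0.5687, 51.6512)
C_C = 0.2156

f_at_zero = scattering_factor(A_C, B_C, C_C, 0.0)
f_at_half = scattering_factor(A_C, B_C, C_C, 0.5)
```

           At s = 0 the sum of the A(J) and C should be close to the number
        of electrons, which is a quick sanity check on hand-typed records.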
        
            
           The appropriate remaining data should be supplied only for the mode 
        selected.

        
            **** additional input for protein phasing mode  (MODE= 0 )****
            
        CARD 3 + 3*NXSCAT -   NSETS, NOREF, N             (free format)
               
                                      NSETS =  number of data sets 
                                              (derivatives) to use in phasing 
                                              (max = 30)
                                    
                                      NOREF = 0 for protein phase calculation
                                                only.
                                            = 1 for protein phase calculation
                                                plus "phase refinement" of
                                                derivative parameters.
                                     
                                          N = minimum number of contributing
                                              data sets for the phase of an
                                              acentric reflection to be output.
                                                            
        
                                     
         CARD 4 +3*NXSCAT  -  OUTREF                 (free format)
                                     
                              OUTREF = Name of output reflection file to 
                                       contain the final protein phases.
        
        
           The following block of cards 1-6 must then be repeated for each 
        data set.
                                     
             1) TITLE  =  anything  (free format)
             
             2) FILEIN = input merged data filename (free format)
        
             3) FILOUT = output difference Fourier filename (free format)
    
             4) DCUT, SIGCUT, ISOFLG, SCLFPH, BOVFPH, SCLFH, ( EC(I),I=1,4 )
                    (free format)
                 
                                 DCUT = minimum allowed d spacing.
                       
                               SIGCUT = minimum allowed F/sig value.
            
                               ISOFLG = 0 for isomorphous replacement data.
                                      = 1 for native anomalous scattering data.
                                      = 2 for derivative anomalous scattering
                                          data.          
                         
                               SCLFPH = scale factor multiplying FPH (obs)
                                        to scale it to FP (obs). Usually =1.
                                        unless refined in previous run.
                                               
                               BOVFPH = overall thermal factor, applied to
                                        FPH (obs) to scale it to FP (obs).
                                        Applied as exp(BOVFPH*ssthol) * FPH.
                                        Usually = 0. unless refined in
                                        previous run. 
        
                                SCLFH = scale factor multiplying |FH|(calc)
                                        to scale it to the observed data.
                                        If unknown, input 0. and it will be
                                        computed. 
        
                        (EC(I),I=1,4) = coefficients for 3 term polynomial,
                                        used to generate "standard" E (lack
                                        of closure, based on intensity) 
                                        values as function of |FP|, and the
                                        minimum allowed value of E. If
                                        unknown, input 0. for each and they
                                        will be computed.
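
           How SCLFPH and BOVFPH act on a derivative amplitude can be
        sketched as below. This is an illustration only. It assumes that
        "ssthol" means (sin(theta)/lambda)**2 = 1/(4*d**2) and that the scale
        factor and the exponential are applied as a single product; neither
        assumption is confirmed by this write-up. The function name is
        hypothetical.

```python
import math


def scale_fph(fph, d, sclfph=1.0, bovfph=0.0):
    """Scale an observed FPH toward FP(obs).

    fph    : observed derivative amplitude
    d      : d spacing of the reflection in Angstroms
    sclfph : SCLFPH from card 4 (usually 1.)
    bovfph : BOVFPH from card 4 (usually 0.)
    """
    # Assumed meaning of "ssthol": (sin(theta)/lambda)**2 = 1/(4 d^2).
    ssthol = 1.0 / (4.0 * d * d)
    return sclfph * math.exp(bovfph * ssthol) * fph
```

           With the usual defaults (SCLFPH = 1., BOVFPH = 0.) the amplitude
        is returned unchanged.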
        
        
        5) NA  = (number of heavy atoms/anomalous scatterers with known
                 positions, free format)
            
        6 etc) ATNAME, X, Y, Z, B, OCC, ITYPE      FORMAT(7X,A8,5F10.5,I5)
            
          ATNAME = anything
            
           ITYPE = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20
                   for C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4,
                   I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- or Sm+3,
                   respectively. ITYPE = 21 through 20+NXSCAT for the
                   additional types, in the same order as originally input
                   by the user. 
            
            OCC  = Occupancy factor
            
          X,Y,Z  = Fractional atomic coordinates
            
             B   = Thermal factor.
                  Note that if B is > 0., then it is assumed to be an
                   isotropic thermal factor. If B is input as 0., then the
                   temperature factor is assumed to be anisotropic with the
                   B11, B22, B33, B12, B13, B23 elements being supplied on
                   the immediately following record. If B is < 0., then the
                   temperature factor is assumed to be isotropic with
                   magnitude = ABS(B), but it will be converted to anisotropic
                   prior to use in the program. 
             
           The following record should be included ONLY if the supplied B 
        value is less than or equal to 0. for the preceding atom.
        
        6a etc) B11, B22, B33, B12, B13, B23, BRES, SIG     FORMAT(8F10.5)
        
            B11, B22, B33, B12, B13, B23
                  =  Components of anisotropic thermal factor tensor.
                     If B (previous record) is < 0., then these fields
                     are irrelevant as the program will compute them
                     by converting |B| to anisotropic.
                                                                   
           BRES   =  Possible target value for restraining the isotropic
                     equivalent of the anisotropic temperature factor. If
                     BRES > 0., then a restraint term of the form 
                     WT*(BRES-BEQ)**2 is included in the least squares
                     equations.
        
            SIG   =  Sigma for restraint term, used only if BRES is > 0.
                    WT is 1/SIG**2. (Suggested value =0.5) 
                                                                   
                Include cards 6 (and possibly 6a) for each of the NA atoms.
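
           Because the atom records are fixed format, it can help to write
        them with a small script rather than by hand. The Python sketch
        below lays out one atom according to FORMAT(7X,A8,5F10.5,I5) and,
        when B is less than or equal to 0., the following FORMAT(8F10.5)
        record. The function name is hypothetical; writing zeros for the
        tensor fields when B < 0. is a choice made here, based on the
        statement that those fields are then irrelevant.

```python
def atom_records(atname, x, y, z, b, occ, itype,
                 aniso=None, bres=0.0, sig=0.5):
    """Return the one or two input records for a single atom."""

    def f10(values):
        # Each value occupies one F10.5 field.
        return "".join(f"{v:10.5f}" for v in values)

    # Atom record: FORMAT(7X,A8,5F10.5,I5)
    first = " " * 7 + f"{atname:<8.8}" + f10((x, y, z, b, occ)) + f"{itype:5d}"
    records = [first]

    if b <= 0.0:
        if b == 0.0 and aniso is None:
            raise ValueError("B = 0. requires B11, B22, B33, B12, B13, B23")
        # For B < 0. the six tensor fields are ignored by the program,
        # so zeros are written when none are supplied.
        tensor = aniso if aniso is not None else (0.0,) * 6
        # Anisotropic record: FORMAT(8F10.5)
        records.append(f10((*tensor, bres, sig)))
    return records


# Example: a mercury site (ITYPE = 7) with B < 0., so two records result.
for record in atom_records("HG1", 0.1, 0.2, 0.3, -20.0, 0.5, 7):
    print(record)
```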
                                                  
        
           ***** END OF INPUT, UNLESS HEAVY ATOM REFINEMENT WAS REQUESTED *****
          
           If "phase refinement" was requested (NOREF=1), then include the 
           following cards.
        
        CARD A)  NPASS, FMCUT, NHVCYL, IWT, IEXC, NFIXP, MAXLIK   (free format)
        
                                   NPASS = # of times protein phases are
                                           to be recomputed, i.e. # of
                                           refinement passes. (max=10).
                                           Protein phases are held fixed
                                           during each pass, and updated
                                           at the end of each pass.
        
                                   FMCUT = Figure of merit cutoff. 
                                          Reflections will not be used in 
                                          phase refinement if the 
                                          associated figure of merit is
                                          < FMCUT.
                                                
                                  NHVCYL = # of refinement cycles
                                           to be performed in each pass. 
                                           (max=50). Each cycle can refine
                                           heavy atom and/or scaling parameters
                                           for any data set.
                              
                                     IWT = 0 for refinement weights based
                                             on expected lack of closure.
                                         = 1 for refinement weights based
                                             on estimated accuracy of current
                                             protein phase.
                                         = 2 for unit weights.
        
                                    IEXC = 0 to exclude contribution to
                                             protein phase distribution from
                                             each data set when parameters
                                             for that data set are being
                                             refined.
                                         = 1 to include contributions to
                                             protein phase distribution from
                                             all possible data sets during
                                             refinement.
        
                                 NFIXP = 0 for normal operation (uses
                                           protein phases based on current
                                           heavy atom data during refinement).
                                       = 1 to read in externally derived
                                           protein phases, and hold them
                                           fixed during heavy atom refinement.
                                           If NFIXP=1, then IEXC is reset to 1,
                                           and IWT is reset to 0 if it was 1.
        
                                MAXLIK = 0 for conventional parameter
                                           refinement.
                                       = 1 for "Maximum Likelihood" parameter
                                           refinement. 
 
       **** The following card should be included ONLY if NFIXP=1 ****
           
        CARD A'   FXDFIL                       (free format)
        
                               FXDFIL    = name of file containing the
                                           protein phases to be held fixed
                                           and used during refinement.
        
        
           The following card set B,C,D must then be repeated for each of the
        NHVCYL cycles requested.    
        
        CARD B)   IVSET                        (free format)
        
                           IVSET    = data set number (in order as 
                                      originally input) of set for which
                                      derivative parameters are to be
                                      refined.
        
            
        CARDS C)  (IVAR(J),J=1,5 or 10)              (free format)
        
                                      Variable selection information
                            IVAR(1) = 1 to refine x coordinate, 0 to hold fixed
                            IVAR(2) = 1 to refine y coordinate, 0 to hold fixed
                            IVAR(3) = 1 to refine z coordinate, 0 to hold fixed
                            IVAR(4) = 1 to refine occupancy, 0 to hold fixed 
                            IVAR(5) = 1 to refine B (or B11), 0 to hold fixed
                            IVAR(6) = 1 to refine B22, 0 to hold fixed
                            IVAR(7) = 1 to refine B33, 0 to hold fixed
                            IVAR(8) = 1 to refine B12, 0 to hold fixed
                            IVAR(9) = 1 to refine B13, 0 to hold fixed
                            IVAR(10)= 1 to refine B23, 0 to hold fixed
        
        Card C must be repeated for as many atoms as are in the specified data 
        set. Each card refers to a single atom, in the same order as 
        originally input. Note that IVAR(6-10) are appropriate only if the 
        corresponding atom was input with (or converted to) an anisotropic 
        temperature factor. 
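
           As an illustration of the variable selection cards, the Python
        sketch below builds one free-format card C for a single atom: five
        flags for an isotropic atom, ten for an anisotropic one. The
        function name and argument names are hypothetical.

```python
def ivar_card(xyz=(0, 0, 0), occ=0, b=0, aniso=None):
    """Return one card C as a free-format string of 0/1 flags.

    xyz   : flags for IVAR(1)-IVAR(3)
    occ   : flag for IVAR(4)
    b     : flag for IVAR(5), i.e. B or B11
    aniso : optional five flags for IVAR(6)-IVAR(10), anisotropic atoms only
    """
    flags = [*xyz, occ, b]
    if aniso is not None:
        if len(aniso) != 5:
            raise ValueError("aniso must hold five flags, IVAR(6) to IVAR(10)")
        flags.extend(aniso)
    if any(flag not in (0, 1) for flag in flags):
        raise ValueError("every flag must be 0 or 1")
    return " ".join(str(flag) for flag in flags)


# Refine coordinates and occupancy of an isotropic atom, hold B fixed.
card = ivar_card(xyz=(1, 1, 1), occ=1, b=0)
```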
        
        CARD D)   (IVSCL(I),I=1,3)                      (free format)
        
                           IVSCL(1) = 1 to refine SCLFPH, 0 to hold fixed
        
                           IVSCL(2) = 1 to refine BOVFPH, 0 to hold fixed
        
                           IVSCL(3) = 1 to refine SCLFH, 0 to hold fixed
                         
        Note! For native anomalous scattering data sets, IVSCL(1) and
              IVSCL(2) must be 0
         
                                    ****  FILES  ****
            
           The input "scaled/merged" reflection files have already been
        described. The output protein phase file OUTREF is binary and contains
        records with the following: 
         
            H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FOM
                                              
            where
            
            H, K, L  =  Miller indices (integers)
            
            FMFO     =  Figure of merit weighted structure factor amplitude
                        (either FOM * FP   or  FOM * F+)
            
            FO       =  Observed structure factor amplitude (either FP  or F+)
            
            PHIBEST  =  Best (centroid) phase, in degrees.
            
            IPRAB,
            IPRCD    =  Hendrickson-Lattman coefficients A,B,C,D for the phase
                        probability distribution used, packed two per word as 
                        IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384  and
                        IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
            
            MK       =  Restricted phase indicator.  For general reflections
                        MK=1,  for centric reflections MK > 1 and one of the
                        allowed phase values is (MK-1)*15 degrees (the other
                        possibility is 180 degrees away).
            
            FOM      =  Figure of merit associated with PHIBEST and used for
                        weighting.
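
           The packing of two Hendrickson-Lattman coefficients into one
        integer word, and its inverse, can be sketched in Python as follows.
        The function names are hypothetical, and Python's int() stands in
        for IFIX (both truncate toward zero). The sketch follows the formula
        given for IPRAB and IPRCD and is not code taken from PHASIT.

```python
OFFSET = 16384   # added to each scaled coefficient so it is non-negative
BASE = 32768     # multiplier separating the two packed fields


def pack_pair(first, second):
    """Pack two coefficients (A,B or C,D) into a single integer word."""
    hi = int(first * 100) + OFFSET
    lo = int(second * 100) + OFFSET
    if not (0 <= hi < BASE and 0 <= lo < BASE):
        raise ValueError("coefficient outside the packable range")
    return hi * BASE + lo


def unpack_pair(word):
    """Recover the two coefficients, to the stored precision of 0.01."""
    hi, lo = divmod(word, BASE)
    return (hi - OFFSET) / 100.0, (lo - OFFSET) / 100.0


iprab = pack_pair(1.25, -0.5)
a, b = unpack_pair(iprab)
```

           Each coefficient is therefore stored to two decimal places, and
        the largest possible word, 32767*32768 + 32767, still fits in a
        32 bit signed integer.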


           The output files "FILOUT" are "short form" phase files suitable
        for computing difference Fouriers, double difference Fouriers, observed
        difference Pattersons or "calculated" difference Pattersons for each
        data set, via the MAPTYP=1,3,6,7 options, respectively, in FSFOUR. They
        can be used to identify more heavy atom sites, to generate difference
        Pattersons or to generate "calculated difference Pattersons" from the
        input heavy atom model for comparison with the "observed difference
        Pattersons". These files actually contain records with

        IH,IK,IL,FHobs,FHcalc,PHI_Hcalc

        IH,IK,IL,(FP+ - FP-)obs,(FP+ - FP-)calc,(PHI_PRO-90)

        IH,IK,IL,(FPH+ - FPH-)obs,(FPH+ - FPH-)calc,(PHI_PRO-90)

        for isomorphous, native anomalous and derivative anomalous data sets,
        respectively.

                                                                 
           If phase refinement is requested (NOREF=1) and protein phases are to
        be explicitly input (NFIXP=1), then an additional file FXDFIL with the 
        same structure as the output phase file above must also be supplied to 
        provide the protein phase information. If MAXLIK = 0 only the indices,
        PHIBEST and FOM will be used. If MAXLIK = 1 the Hendrickson-Lattman
        coefficients will also be used.
        
            
           In protein phasing mode the program expects to read in one or more  
        "merged" data files, i.e. files with records containing H, K, L, FP, 
        SFP, FD, SFD  for isomorphous replacement data, H, K, L, F+, SF+, F-,
        SF- for native anomalous scattering data or H, K, L, FP, SFP, FPH+,
        SFPH+, FPH-, SFPH- for derivative anomalous scattering data. It is
        assumed that the native and derivative data has already been properly
        scaled together (via CMBISO or CMBANO).  If more than one data set is
        input containing native F values (FP), corresponding FP values are 
        assumed to be identical (on same scale) in each set, as would be the 
        case if each derivative set was scaled to the same native set with 
        CMBISO. It is not necessary for any given reflection to be present in 
        all sets. If more than one data set is supplied, but a reflection is 
        present in only one of them, then the resulting output phase for that 
        reflection will correspond to an SIR (or SAS) calculation rather than 
        MIR. One can, however, request that acentric reflection phases be 
        output only if N or more data sets contributed, where N is an input 
        parameter. Thus an N value of 2 would ensure that output phases are 
        generated only for cases where the phase ambiguity has been resolved 
        (in principle).  For centric reflections there is no phase ambiguity, 
        hence the N criterion is not applied.  If only one data set is input, 
        then N should be 1 to ensure that all computed phases (either SIR or 
        SAS) are output.
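
           The N criterion described above amounts to the following test,
        shown here as a minimal Python sketch with hypothetical names. It
        restates the rule in the text and is not PHASIT's own code.

```python
def phase_is_output(n_contributing, n_min, centric):
    """Decide whether a reflection's phase is written to the output file.

    n_contributing : number of data sets in which the reflection is present
    n_min          : the input parameter N
    centric        : True for a centric reflection
    """
    if n_contributing < 1:
        return False               # no data, so no phase can be computed
    if centric:
        return True                # the N criterion is not applied
    return n_contributing >= n_min


# With N = 2, an acentric reflection seen in only one set is dropped,
# while a centric reflection seen in only one set is kept.
acentric_kept = phase_is_output(1, 2, centric=False)
centric_kept = phase_is_output(1, 2, centric=True)
```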
            
            
           NOTE!!!! If both NATIVE anomalous scattering and other types of
        data sets are input, THE NATIVE ANOMALOUS SCATTERING SETS SHOULD BE
        THE LAST ONES INPUT. If both anomalous and isomorphous data sets are 
        input then the F and SIG values for the anomalous data should be on 
        the same scale as the isomorphous data. This will happen automatically 
        if CMBISO and CMBANO are used to prepare the data files and the same 
        native set was used as input. If NATIVE anomalous scattering data is 
        to be used IN ADDITION TO OTHER DATA TYPES, then it is convenient to 
        also run it through CMBANO to put it on the scale of the other data, 
        and then edit the output file to strip away the extra FP and Sig(FP) 
        fields. This is needed to conform to the file format for native 
        anomalous scattering sets, yet be properly scaled for consistency with 
        the other data sets.
            
           If only multiple anomalous scattering data sets are input, then F 
        values for all sets are assumed to be on the same scale, and the heavy 
        atom parameters should correspond to the same hand, and be consistent 
        with the input indices. 
            
           IT IS ASSUMED THAT WHEN MULTIPLE DATA SETS ARE INPUT, THE ORIGIN
        AND HAND ARE CONSISTENT THROUGHOUT ALL DATA SETS.
            
               
             
               **** additional input for SF calculation mode (MODE=1) ****
            
        
        CARD 3 + 3*NXSCAT -    INPREF       (free format)
                  
                               INPREF =     Name of file containing the
                                            input reflections for which 
                                            structure factors will be computed.
                                                
        
        CARD 4 + 3*NXSCAT -   INPCDS       (free format)    
            
                              INPCDS =     Name of file containing the
                                           input atomic coordinates. 
                                               
        
        CARD 5 + 3*NXSCAT -   OUTSF        (free format)
        
                              OUTSF  =     Name for output file containing
                                           the calculated structure factors.
        
        
        CARD 6 + 3*NXSCAT -    KRES,(KILRES(I),I=1,KRES) (free format)
            
                                KRES =   Number of residues to be omitted
                                         from structure factor calculation.
            
                (KILRES(I),I=1,KRES) =   residue numbers for the KRES
                                         residues to be omitted.
                    
        CARD 7 + 3*NXSCAT -     IMODE, IHLCF, ISIGA       (free format)
        
                               IMODE = 0 if atomic type to be derived from
                                         first character of atom name (see
                                         below)
                                     = 1 if atomic type explicitly input
                                         (see below)
                           
                               IHLCF = 0 "Short" Fourier output. File
                                         contains Fobs, Fcalc, phase.
                                     = 1 "Full" Fourier output. File
                                         contains FM*Fobs, Fobs, phase,
                                         Hendrickson-Lattman coefs etc.

                                         NOTE! IHLCF is meaningful only when
                                         ISIGA is zero. For ISIGA > 0 the
                                         nature of the output file is set by
                                         ISIGA, as described below. 

                               ISIGA = 0 If "full" file output is requested
                                         (IHLCF=1), Bricogne's modification
                                         of Sim's weights are to be used to
                                         construct the phase probability
                                         distributions.
                                     = 1 For "Full" file output but with 
                                         distributions based on Sigma_A
                                         weights. 
                                     = 2 For "short" file output appropriate
                                         for reduced bias difference maps
                                         based on sigma_A weighting (use Fo-FC
                                         option in FSFOUR).
                                     = 3 For "short" file output appropriate
                                         for reduced bias native maps based
                                         on sigma_A weighting (use 2FO-FC
                                         option in FSFOUR).
                                          
        
                                       **** FILES ****
            
        INPREF - Input structure factor file. Several types of files can
        be used here, and the type of file is deduced from the last part
        of the filename. Allowed file types include binary (31 type files,
        either long format or short format), any of the "merged" files,
        "MULISTS", SCALEPACK style files or files in free format.

        If the filename ends with ".31", then a binary style "phased"
        file is assumed, which can be the output from a previous PHASIT
        or BNDRY run. Either long or short format files can be used, and
        the program will figure out which type was input and pick up the
        indices and Fobs values appropriately. The records thus would
        contain either

         h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM     (long format)

                                    or

         h, k, l, FO, FC, PHI                                (short format)

        Note that previous files output from PHASIT, structure factor mode
        with ISIGA > 1 or output in "phasing mode" as a "difference
        coefficient file" are NOT appropriate as they do NOT contain FO
        explicitly. Similarly, long format files output from BNDRY with
        IOTYP=1 are not appropriate as they do not contain FO in the
        second amplitude slot.
 


        If the file name ends with ".MU" or ".mu", then it is assumed to be
        an ASCII "MULIST" i.e. a file generated by program MAKEMU (in the
        XENGEN system) or by program FBSCALE. In that case each record is
        assumed to contain

        H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag

        in format  (3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2 ). Only the indices
        and F values will be used.
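
           For readers preparing or checking MULIST files outside of PHASIT,
        the fixed-width record can be split by column position as in the
        Python sketch below. The function name and dictionary keys are
        hypothetical, and blank fields are not handled.

```python
def parse_mulist_record(line):
    """Split one MULIST record laid out as
    (3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2), 76 columns in all."""
    line = line.rstrip("\n").ljust(76)
    h, k, l = (int(line[start:start + 4]) for start in (0, 4, 8))
    res = float(line[13:19])                      # column 13 is the 1X skip
    # Six F8.2 fields, each preceded by one skipped column.
    values = [float(line[20 + 9 * j:28 + 9 * j]) for j in range(6)]
    iflag = int(line[74:76])
    keys = ("f", "sig_f", "f_plus", "sig_f_plus", "f_minus", "sig_f_minus")
    record = {"hkl": (h, k, l), "res": res, "iflag": iflag}
    record.update(zip(keys, values))
    return record


def indices_and_f(line):
    """Return only what PHASIT is stated to use: the indices and F."""
    record = parse_mulist_record(line)
    return record["hkl"], record["f"]
```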


        If the filename ends with ".SCA" or ".sca", then an ASCII SCALEPACK
        file is assumed. After a variable number of header records (see the
        FILE FORMATS section), reflection records follow and contain

        H, K, L, I+, sig(I+), I-, sig(I-)

        in format (3I4, 4F8.1)

        Note the use of intensities rather than F's. The last two items
        in each record may be omitted. If present, they would be used
        only if I+ was not measured.


        If the filename ends with anything other than ".31", ".MU", ".mu",
        ".SCA" or ".sca", the file is assumed to be ASCII and is read in free
        format. The records are assumed to contain
            
            H, K, L, FO
                                                                         
            where  H, K, L     =  Miller indices (integers)
            
                      FO       =  Observed structure factor amplitude
            
        Note that this is appropriate for any of the "scaled and merged"
        files output by CMBISO or CMBANO, and generic files as well.
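
           The rule for deducing the INPREF file type from the end of the
        filename can be summarized by the short Python sketch below. The
        function name and the returned labels are hypothetical; only the
        suffixes listed above are treated specially.

```python
def inpref_file_type(filename):
    """Return the file type implied by the end of the filename."""
    if filename.endswith(".31"):
        return "binary"        # long or short format phased file
    if filename.endswith((".MU", ".mu")):
        return "mulist"        # ASCII MULIST
    if filename.endswith((".SCA", ".sca")):
        return "scalepack"     # ASCII SCALEPACK, intensities
    return "free"              # ASCII, free format: H, K, L, FO


kinds = [inpref_file_type(name)
         for name in ("native.31", "native.mu", "native.SCA", "native.mrg")]
```

           Note that, following the list above literally, a mixed-case
        ending such as ".Mu" falls through to free format in this sketch.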

            
        INPCDS -  Input atomic coordinate file, ASCII with 
                  format ( 1X, A1, 5X, A1, I3, A4, 5F10.5, I5). Each record
                  should contain
            
            CHN, RT, IRES, ATOM, X, Y, Z, B, OCC, ITYP      
            
        where 

         CHN    = single character chain identifier (not used)

          RT    = single letter amino acid code (not used)
            
        IRES    = sequence number (used only if rejecting residues)
        
        ATOM    = atom name (used only if IMODE=0)
        
        X,Y,Z   = fractional atomic coordinates
            
           B    = Isotropic thermal factor
            
        OCC     = Occupancy factor
        
        ITYP    = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20
                  for C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4,
                  I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- or Sm+3,
                  respectively. ITYP = 21 through 20+NXSCAT for the
                  additional types, in the same order as originally input
                  by the user. Note that if IMODE=0, then atomic types are
                  derived from the first character of the atom name, but
                  only C,N,O,S or Fe will be recognized.

                  Include one record of this type for each atom
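
           The numbering of atomic types, including any additional types
        supplied through NXSCAT, can be expressed as a simple lookup. The
        Python sketch below uses hypothetical names and only restates the
        table above.

```python
BUILT_IN_TYPES = (
    "C", "N", "O", "S", "Fe+3", "Pt+2", "Hg+2", "Au+3", "Pb+2", "Os+4",
    "I-", "Zn+2", "Ca+2", "Mg+2", "Cd+2", "U+6", "P", "Br-", "Cl-", "Sm+3",
)


def type_label(ityp, extra_types=()):
    """Return the atom type for ITYP.

    extra_types holds labels for the NXSCAT additional types, in the
    order in which their scattering factor records were input.
    """
    if 1 <= ityp <= len(BUILT_IN_TYPES):
        return BUILT_IN_TYPES[ityp - 1]
    if 21 <= ityp <= 20 + len(extra_types):
        return extra_types[ityp - 21]
    raise ValueError(f"ITYP {ityp} is not defined")


mercury = type_label(7)
first_extra = type_label(21, extra_types=("Se",))
```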
            
        
        OUTSF -  The output structure factor file differs, depending on the 
                 values of IHLCF and ISIGA.

        If ISIGA = 0 and IHLCF = 0, the file is binary with each record
        containing
            
        H, K, L, FO, FC, PHIcalc
            
        where H, K, L    = Miller indices (integers)
        
                      FO = Observed structure factor amplitude
            
                      FC = Calculated structure factor amplitude (scaled to
                           input set)
            
                PHIcalc  = Calculated phase angle in degrees.
            
        
        If ISIGA =1 (or ISIGA=0 AND IHLCF = 1) the file is binary with each
        record containing
            
        H, K, L, FMFO, FO, PHIcalc, IPRAB, IPRCD, MK, FOM
            
        where
            
        H, K, L  =  Miller indices (integers)
            
        FMFO     =  Figure of merit weighted structure factor amplitude
                    FOM * FO
            
        FO       =  Observed structure factor amplitude FO
            
        PHIcalc  =  Calculated phase, in degrees.
            
        IPRAB,
        IPRCD    =  Hendrickson-Lattman coefficients A,B,C,D for phase
                    probability distribution centered on calculated phase,
                    packed two per word as 
                    IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384  and
                    IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
            
        MK       =  Restricted phase indicator.  For general reflections
                    MK=1,  for centric reflections MK > 1 and one of the
                    allowed phase values is (MK-1)*15 degrees (the other
                    possibility is 180 degrees away).
          
        FOM      =  Figure of merit associated with PHIcalc and used for
                    weighting.
        
            Note that this record structure is identical to that produced in 
        protein phasing mode, although the probability distributions will all 
        be unimodal.
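
            The packing of IPRAB and IPRCD, and the meaning of MK, can be
        illustrated with a short Python sketch. The function names are
        hypothetical and are not part of PHASIT; only the arithmetic is
        taken from the record description above. Python's int() is used in
        place of IFIX since both truncate toward zero, and the sketch
        assumes coefficients small enough in magnitude (below about 163)
        that the scaled value fits in the 15-bit field.

```python
# Sketch only: names are hypothetical, arithmetic follows the write-up.
OFFSET = 16384   # added so each scaled coefficient is non-negative
RADIX = 32768    # multiplier separating the two values in one word


def pack_pair(first, second):
    """(IFIX(first*100)+16384)*32768 + IFIX(second*100)+16384."""
    # int() truncates toward zero, as Fortran IFIX does.
    return (int(first * 100) + OFFSET) * RADIX + int(second * 100) + OFFSET


def unpack_pair(word):
    """Recover the two coefficients (to 0.01) from a packed word."""
    hi, lo = divmod(word, RADIX)
    return (hi - OFFSET) / 100.0, (lo - OFFSET) / 100.0


def allowed_phases(mk):
    """Allowed phases in degrees for a restricted phase indicator MK.

    Returns None for a general reflection (MK = 1); otherwise the pair
    (MK-1)*15 and the value 180 degrees away.
    """
    if mk == 1:
        return None
    first = (mk - 1) * 15
    return first, (first + 180) % 360


iprab = pack_pair(1.25, -4.5)      # A = 1.25, B = -4.5
a, b = unpack_pair(iprab)
```

            Because the coefficients are stored as integers after scaling
        by 100, anything read back from the file is only good to two
        decimal places.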
       
 
        If ISIGA = 2 the file is binary with each record containing

        H, K, L, FOM*FO, D*FC, PHIcalc

        with the parameters as previously described, and D is as defined in
        Read's Sigma_A procedure. This file is appropriate for reduced bias
        DIFFERENCE maps, and should be used in FSFOUR with the FO-FC option.
            
        If ISIGA = 3 the file is binary with each record containing

        H, K, L, FOM*FO, D*FC, PHIcalc      for acentric reflections
                        and
        H, K, L, FOM*FO/2, 0., PHIcalc      for centric reflections

        with the parameters as previously described, and D is as defined in
        Read's Sigma_A procedure. This file is appropriate for reduced bias
        NATIVE maps, and should be used in FSFOUR with the 2FO-FC option.
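
            As an illustration of why centric reflections are stored as
        FOM*FO/2 and 0. when ISIGA = 3, the sketch below forms the map
        amplitude from the two stored columns. It ASSUMES that the FSFOUR
        "FO-FC" option computes (column 1 - column 2) and that the "2FO-FC"
        option computes (2 * column 1 - column 2); that behaviour is not
        documented in this section. All function names are hypothetical.

```python
# Sketch only: FSFOUR behaviour is assumed, names are hypothetical.
def isiga_columns(isiga, fom, fo, d, fc, centric):
    """Return the two amplitude columns written for ISIGA = 2 or 3."""
    if isiga == 2:
        return fom * fo, d * fc
    if isiga == 3:
        if centric:
            return fom * fo / 2.0, 0.0
        return fom * fo, d * fc
    raise ValueError("this sketch handles only ISIGA = 2 or 3")


def map_amplitude(col1, col2, option):
    """Assumed combination of the two columns by the Fourier program."""
    if option == "FO-FC":
        return col1 - col2
    if option == "2FO-FC":
        return 2.0 * col1 - col2
    raise ValueError("unknown option: " + option)


# Acentric reflection: 2*FOM*FO - D*FC
acentric = map_amplitude(*isiga_columns(3, 0.5, 100.0, 0.5, 80.0, False),
                         "2FO-FC")
# Centric reflection: 2*(FOM*FO/2) - 0 = FOM*FO
centric = map_amplitude(*isiga_columns(3, 0.5, 100.0, 0.5, 80.0, True),
                        "2FO-FC")
```

            Under these assumptions the acentric coefficient is
        2*FOM*FO - D*FC, while the centric coefficient reduces to FOM*FO.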


           In structure factor calculation mode, a set of reflection indices 
        and observed F values are read in from one file (which can be the 
        output file generated from a previous run of PHASIT or BNDRY). Atomic
        coordinates, occupancies and thermal parameters are read in from
        another file. Structure factors are then computed for all input
        reflections, and a binary output file is written. Records in the 
        binary file differ depending on which options (IHLCF and ISIGA 
        parameters) were selected. In one case a "short" form of the phase
        file is written, generally containing Fobs, Fcalc and the phase. The
        output structure factor file then is identical (in structure) to that
        produced by MAPINV, thus it can be used in option 3 of the BNDRY
        program to combine phase information from the partial (or complete,
        but tentative) structure with other phase information. If combined
        with an output from PHASIT (protein phasing mode), then SIR, MIR etc
        phases can be combined with those from the model structure. If
        combined with an output from BNDRY, then partial structure phases can
        be combined with MIR, etc phases AFTER density modification. The file
        can also be used directly to compute electron density, difference
        density, "residue deleted" maps etc., based on phases and amplitudes
        computed from the input model. Provisions are available to omit
        various residues from the structure factor calculation, thereby
        facilitating use of the file for computation of "residue deleted"
        electron density maps. 
           If other options are selected, then after calculating the structure
        factors and scaling them to the observed data, Hendrickson-Lattman
        coefficients are also computed, based either on Bricogne's
        modification of Sim's weighting scheme or on Read's Sigma_A procedure.
        The output file then can contain FM*Fobs, Fobs, Phi, HL coefficients,
        restricted phase indicator and figure of merit. In that case the
        output file structure is identical to that produced by BNDRY, or by
        PHASIT in protein phasing mode. The file can then also be used to
        compute Fourier maps, but conventional DIFFERENCE Fouriers can NOT be
        computed since the Fcalcs are not present in the file. The file can,
        however, supply the "anchored" phases to which other phase information
        is "tethered", i.e. it can take the place of the MIR phases. It can
        also be input to MISSNG, so that phase extension can be tethered to
        the partial structure phases in subsequent density modification
        cycles.
           By invoking other options the file can contain coefficients
        appropriate for "reduced bias" native or difference maps, based on
        Randy Read's Sigma_A procedure.
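
            The dependence of the output record layout on IHLCF and ISIGA
        can be summarized in a small lookup, sketched here in Python. The
        function name and the field labels are hypothetical; the mapping
        itself is taken from the OUTSF description above.

```python
# Sketch only: a lookup of the OUTSF record layouts described above.
SHORT = ["H", "K", "L", "FO", "FC", "PHIcalc"]
LONG = ["H", "K", "L", "FMFO", "FO", "PHIcalc",
        "IPRAB", "IPRCD", "MK", "FOM"]
SIGMA_A = ["H", "K", "L", "FOM*FO", "D*FC", "PHIcalc"]


def outsf_fields(ihlcf, isiga):
    """Return the field names of one OUTSF record for the given options."""
    if isiga == 0 and ihlcf == 0:
        return SHORT                 # FO, FC and phase only
    if isiga == 1 or (isiga == 0 and ihlcf == 1):
        return LONG                  # same layout as protein phasing mode
    if isiga == 2:
        return SIGMA_A               # reduced bias difference map
    if isiga == 3:
        # Same columns; centric reflections hold FOM*FO/2 and 0.
        return SIGMA_A               # reduced bias native map
    raise ValueError("unsupported IHLCF/ISIGA combination")
```

            Only the long layout carries Hendrickson-Lattman coefficients,
        so only that form is suitable where a phase probability
        distribution is needed downstream.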
        
                          **** PHASIT PROGRAM STRUCTURE ****
        
           In protein phasing mode the following events take place.
                                  
        For each data set the program will do the following:
            
        1) Read in all reflections and reject those which fail the supplied
        d-spacing and F/SIG cutoffs.
            
        2) The indices of each accepted reflection are transformed (if needed) 
        to correspond to a "standard" asymmetric unit, systematic absences are 
        rejected, and phase restrictions are identified for centric 
        reflections. If the data set contains anomalous scattering data,
        centric reflections are rejected. All other reflections are stored.
            
        3) Heavy atom parameters are read in and structure factors are 
        computed based on the heavy atom positions, using the appropriate 
        scattering factors for isomorphous or anomalous scattering data, 
        respectively.
            
        4) A suitable number of reflections are chosen from which difference 
        magnitudes  ABS(FP - FPH), ABS(F+ - F-) or ABS(FPH+ - FPH-) are used to
        scale the heavy atom structure factors. For isomorphous replacement 
        data all reliable centric reflections are used, if any are present. If 
        there is an insufficient number of centric reflections, the selected 
        list is augmented by the 25% largest differences for acentric data. 
        For anomalous scattering data, the 25% largest differences
        ABS(F+ - F-) etc. are used. If the user supplied a scale factor, it is
        used instead of the computed value. R factors are then reported after
        scaling the heavy atom structure factors.
            
        5) The data is grouped into ranges based on the magnitude of FO or
        (F+ + F-)/2, and rms E values (lack of closure) are computed for each 
        range. All centric data (possibly augmented with acentric data as 
        described above) are used to determine E values in the isomorphous 
        replacement case. In the anomalous scattering case only the 25% 
        strongest differences are used. For centric isomorphous replacement
        data the input sig(FP) and sig(FPH) values are used to remove from
        the E values the components arising from measurement error, and the
        remaining lack of closure value is halved. The components due to
        measurement error are then added back. This enables the E values
        determined from centric data to be applicable to the acentric data.
        A three term polynomial is then fit by least squares to the rms E
        values as functions of FO or (F+ + F-)/2. If the user supplied the
        polynomial coefficients, this step is bypassed.
        
        6) From the scaled heavy atom structure factors, input amplitudes and 
        computed E values, Hendrickson-Lattman coefficients are computed to 
        represent the SIR (or SAS) phase probability distributions. For the
        centric isomorphous replacement data the E values are first adjusted
        to "undo" the downscaling making them appropriate for acentric data.  
            
        7) SIR (or SAS) phases are then computed by integrating over the 
        distributions to yield "best" phases and the associated figure of 
        merit. Figure of merit statistics are then output, along with an 
        estimate of the "phasing power" ( FH(calc)/E  or  2.*FH"(calc)/E ) as 
        a function of resolution. Note that for the purpose of phasing power 
        calculations E values are based on amplitude differences, whereas for 
        the actual probability distributions E values are based on intensity 
        differences.
            
        8) The indices, observed and calculated amplitudes, input standard
        deviations, Hendrickson-Lattman coefficients, calculated phase 
        components and restricted phase indicators are output to a scratch
        file.
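
            The least squares fit of step 5 can be sketched as follows. The
        write-up does not state which three terms are used, so the sketch
        ASSUMES the form E = c0 + c1*F + c2*F**2, where F is the mean FO
        (or (F+ + F-)/2) of each range. The function name is hypothetical,
        and the normal equations are solved by simple elimination rather
        than by whatever method PHASIT itself uses.

```python
# Sketch only: assumes E = c0 + c1*F + c2*F**2; names are hypothetical.
def fit_three_term(x, y):
    """Least squares fit of y = c0 + c1*x + c2*x**2 (normal equations)."""
    # Power sums s[k] = SUM x**k and right-hand sides t[k] = SUM y*x**k.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented 3x3 system of normal equations.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def e_value(coeffs, f):
    """Evaluate the fitted polynomial at amplitude f."""
    c0, c1, c2 = coeffs
    return c0 + c1 * f + c2 * f * f


# Mean amplitude and rms lack of closure per range (made-up numbers).
f_mean = [1.0, 2.0, 3.0, 4.0, 5.0]
e_rms = [2.0 + 0.5 * f + 0.1 * f * f for f in f_mean]
coeffs = fit_three_term(f_mean, e_rms)
```

            With real amplitudes, which can be large, it may be worth
        rescaling F before forming the power sums to keep the normal
        equations well conditioned.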
            
            
           After repeating the procedures 1-8 for each data set, phase 
        information from all sets is combined as follows:
            
        9) The scratch files are rewound and read.  The first time unique 
        indices are encountered, they are stored along with FP (or F+), the 
        restricted phase indicator and the Hendrickson-Lattman coefficients. A 
        counter is also saved to keep track of the number of data sets 
        (probability distributions) contributing to each reflection. If the 
        same reflection is encountered again, the Hendrickson-Lattman 
        coefficients are added to those already saved and the counter is 
        incremented.
            
        10) For each unique reflection, the cumulative Hendrickson-Lattman 
        coefficients are used to generate the combined phase probability
        distribution. The distribution is then integrated to yield the "best"
        (centroid) phase and associated figure of merit. The computed phase is
        then saved, and the number of contributing data sets and restricted
        phase indicator are examined. If the reflection is acentric, the number
        of data sets contributing to that particular distribution is compared 
        to N (input value) to decide whether or not to output the reflection.
            
        11) The indices, figure of merit weighted FP (or F+), FP (or F+), best 
        phase, Hendrickson-Lattman coefficients (for combined distribution), 
        restricted phase indicator and figure of merit are then output for 
        each centric reflection and for those acentric reflections passing the
        "N" criterion. Figure of merit statistics are then output for the final
        phase set. A "difference Fourier coefficient" file is also written
        for each data set enabling one to search for additional sites, or to
        compare Pattersons "calculated" from the input sites with the
        "observed" difference Pattersons. Both difference maps (showing all
        heavy atom sites) and "double difference maps" (after subtracting
        out the input heavy atoms) can be computed with the same file, as
        can the "observed" and "calculated" difference Pattersons.
        
        12) If more than one data set was input, the scratch files are then 
        rewound and read again to recompute the "phasing power" and "bias" for 
        each data set. This time however, the phasing power calculations are 
        based on lack of closure values obtained using the new protein phases.
        In theory, for data sets containing only small errors, the phasing
        power for each data set should increase relative to its initial value
        if the multiple data sets are consistently resolving the phase
        ambiguity. Large decreases indicate an inconsistent derivative or lack
        of isomorphism beyond a given d spacing, and generally result from
        incorrect signs of many isomorphous or anomalous scattering
        differences. Usually there will be small decreases observed when
        more than 2 or 3 data sets are used. This means that some of the signs
        of delta F are inconsistent, which is unavoidable with experimental
        data.
        Also, the phasing power is essentially the "signal to noise ratio" for 
        each data set, thus when it falls below 1.00 the data probably does 
        more harm than good. A good policy is to truncate each data set at the 
        resolution where the phasing power falls to about 1.00. The "mean
        relative error" M.R.E., defined as (1/N) * SUM (e(phi)**2 / (2.*E**2)),
        where e(phi)**2 is the lack of closure weighted over all possible
        protein phases for each reflection, is also output for each data set,
        and should be about 0.5 if the E's are properly determined. In 
        addition, the mean phase "bias" toward heavy atom phases is listed
        both as a function of resolution, and overall for each data set. Since
        there should be no correlation between true protein and heavy atom
        phases, the mean bias should be 90 degrees for each data set. If it
        deviates significantly from 90 degrees, one (or possibly more
        correlated) data set(s) is/are likely to be dominating the phasing
        process, and biasing the results.    
            
        13) If more than one data set was input and derivative parameters are
        NOT being refined (NOREF=0), the program then starts a second cycle by
        updating the E value polynomial coefficients for each set as before,
        but this time using probability weighted averages over all possible
        protein phase values for each reflection. The updated E values are then
        used to recompute Hendrickson-Lattman coefficients for each set. New SIR
        or SAS phases are then computed and Figure of merit statistics are
        listed for each set separately. The results are then written to new
        scratch files. Steps 9-12 are then repeated to produce and evaluate new 
        combined distributions. Statistics are given as before, but this time 
        the mean absolute phase shift (in degrees) from the previous cycle is 
        output as well. Only the results of this final cycle will appear on 
        the output phase and difference coefficients files. This recycling
        procedure generally improves results since phases are based on what are
        normally more accurate E values. This is especially true for the
        anomalous scattering data sets, since the original E's were estimated
        from a small subset of data based on crude (though reasonable)
        statistical arguments. The program then terminates. 
        
        14) If atomic or scaling parameters ARE being refined (NOREF=1), for 
        each data set a check is made to determine whether E value polynomial 
        coefficients have been updated yet for it (as for example, in a previous
        run). If not, new coefficients are determined as in step 13, and new 
        SIR or SAS phases are computed based on them. If the E coefficients 
        are updated for ANY set, then all sets are combined again to determine 
        new protein phases and statistics as before. Once updated polynomial 
        coefficients are available for each set, and protein phase estimates 
        have been obtained based on them, refinement of parameters then       
        proceeds.
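
            The phase combination of steps 9 and 10 can be sketched as
        below: Hendrickson-Lattman coefficients for the same reflection are
        summed over data sets, and the combined distribution
        P(phi) ~ exp(A*cos(phi) + B*sin(phi) + C*cos(2*phi) + D*sin(2*phi))
        is integrated numerically to give the centroid phase and figure of
        merit. The function names are hypothetical, the 5 degree sampling
        is an assumption, and the sketch treats general (acentric)
        reflections only; centric reflections would be restricted to their
        two allowed phases.

```python
# Sketch only: names and the 5 degree sampling are assumptions.
import math


def combine(data_sets):
    """Sum HL coefficients per reflection and count contributing sets.

    data_sets is a list of lists of (hkl, (A, B, C, D)) tuples.
    Returns {hkl: [count, [A, B, C, D]]}.
    """
    combined = {}
    for reflections in data_sets:
        for hkl, coeffs in reflections:
            if hkl not in combined:
                combined[hkl] = [1, list(coeffs)]
            else:
                entry = combined[hkl]
                entry[0] += 1
                entry[1] = [a + b for a, b in zip(entry[1], coeffs)]
    return combined


def best_phase(a, b, c, d, step=5):
    """Centroid phase (degrees) and figure of merit of the distribution."""
    phis = [math.radians(deg) for deg in range(0, 360, step)]
    expo = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2 * p) + d * math.sin(2 * p) for p in phis]
    top = max(expo)                     # subtracted to avoid overflow
    w = [math.exp(e - top) for e in expo]
    total = sum(w)
    sx = sum(wi * math.cos(p) for wi, p in zip(w, phis))
    sy = sum(wi * math.sin(p) for wi, p in zip(w, phis))
    fom = math.hypot(sx, sy) / total
    phase = math.degrees(math.atan2(sy, sx)) % 360.0
    return phase, fom


set1 = [((1, 0, 0), (0.0, 1.0, 0.0, 0.0))]
set2 = [((1, 0, 0), (0.0, 1.0, 0.0, 0.0)),
        ((2, 0, 0), (1.0, 0.0, 0.0, 0.0))]
merged = combine([set1, set2])
count, hl = merged[(1, 0, 0)]
phase, fom = best_phase(*hl)
```

            The stored count is what would be compared with the input
        value N when deciding whether an acentric reflection is output.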
        
           The program loops over each set to be refined as follows:
               
           If externally derived protein phases are to be used (NFIXP=1), the
        indices, phases, FOMs (and distribution coefficients if maximum
        likelihood refinement is requested) are read in and stored. Otherwise,
        protein phases and figures of merit are recomputed using contributions 
        to the combined phase probability distributions from either ALL data 
        sets, or from all EXCLUDING the set currently being refined, as 
        indicated by the user supplied parameter IEXC. For the set being 
        refined, heavy atom structure factors and derivatives are then 
        computed, and FPH(calc) (or FPH+(calc), FPH-(calc)) and its 
        derivatives with respect to the variable parameters are computed, 
        using the selected protein phases. Contributions to the Cullis and 
        Kraut R factors are then accumulated. If the current figure of merit 
        exceeds the input cutoff, the derivatives are included in the buildup 
        of least squares equations minimizing the weighted lack of closure 
        with respect to the selected variable parameters. If MAXLIK=0 the
        quantity minimized is
        SUM [ W*(|FPH|(obs) - |FPH|(calc))**2]      for isomorphous or
        SUM [ W*((|FPH+|(obs)-|FPH-|(obs)) - (|FPH+|(calc)-|FPH-|(calc)) )**2 ]
        for anomalous scattering data sets, respectively, where W is 1./E**2,
        1./E'**2 (E' is the RMS E value (based on amplitudes) only for the
        contributing data sets), or unity as selected by the user via the
        parameter IWT. If MAXLIK=1, instead of computing |FPH|(calc) at the
        single value of phi(Protein)=phi(best), the equations above are 
        modified to include contributions from all possible values of
        phi(Protein), with each suitably weighted by the probability associated
        with phi(Protein). Thus in the isomorphous case the quantity minimized
        becomes
        SUM [ W * SUM [  P(i) * (|FPH|(obs) - |FPH|(calc,i))**2 ] ]
        where P(i) is the probability for the value of phi(Protein) used in
        the calculation of |FPH|(calc,i), and phi(Protein) is stepped over the
        phase circle in 5 degree increments. A similar expression is used in
        the anomalous case.
        The least squares equations are solved by matrix inversion, and the
        parameters are then updated. The following R factors are reported.
            
            R Cullis =  SUM | ||FPH|(obs) +/- |FP|(obs)| - |FH|(calc) |  
                        ----------------------------------------------
                              SUM | |FPH|(obs) +/- |FP|(obs)|      
           
            with the sum taken over all centric reflections.
        
        
            R Kraut =   SUM | |FPH|(obs) - |FPH|(calc) |
                        ---------------------------------
                                SUM |FPH|(obs)
        
        with the sum taken over all acentric reflections (isomorphous case).    
        
            
        
        R Kraut = SUM ||FPH+|(obs)-|FPH+|(calc)| + ||FPH-|(obs)-|FPH-|(calc)|
                  -----------------------------------------------------------
                                SUM |FPH+|(obs) + |FPH-|(obs)
        
        with the sum taken over all acentric reflections (anomalous case).
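
            The isomorphous quantities above can be sketched as follows.
        |FPH|(calc) is taken as the magnitude of the vector sum of the
        protein and heavy atom structure factors, which is the usual
        definition and is ASSUMED here since this section does not spell
        it out. All function names are hypothetical. The MAXLIK=0 target
        is the special case in which the whole probability is placed on
        phi(best).

```python
# Sketch only: names are hypothetical; |FPH|(calc) = |FP + FH| assumed.
import cmath
import math


def fph_calc(fp, phi_p_deg, fh, phi_h_deg):
    """Magnitude of FP*exp(i*phiP) + FH*exp(i*phiH)."""
    total = (cmath.rect(fp, math.radians(phi_p_deg))
             + cmath.rect(fh, math.radians(phi_h_deg)))
    return abs(total)


def weighted_residual(fph_obs, fp, fh, phi_h_deg, probs, w=1.0):
    """W * SUM_i P(i) * (|FPH|(obs) - |FPH|(calc,i))**2, one reflection.

    probs maps each sampled protein phase (degrees) to its normalized
    probability; {phi_best: 1.0} gives the MAXLIK=0 term.
    """
    return w * sum(p * (fph_obs - fph_calc(fp, phi, fh, phi_h_deg)) ** 2
                   for phi, p in probs.items())


def r_kraut(fph_obs, fph_cal):
    """SUM | |FPH|(obs) - |FPH|(calc) | / SUM |FPH|(obs)."""
    num = sum(abs(o - c) for o, c in zip(fph_obs, fph_cal))
    return num / sum(fph_obs)


aligned = fph_calc(100.0, 0.0, 10.0, 0.0)      # FH parallel to FP
opposed = fph_calc(100.0, 180.0, 10.0, 0.0)    # FH antiparallel to FP
single = weighted_residual(110.0, 100.0, 10.0, 0.0, {0.0: 1.0})
bimodal = weighted_residual(110.0, 100.0, 10.0, 0.0,
                            {0.0: 0.5, 180.0: 0.5})
```

            In the bimodal example half of the probability sits on a
        protein phase that fits the observation exactly and half on one
        that misses by 20, so the weighted residual is 0.5 * 20**2.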
        
        
        
           After NHVCYL refinement cycles, the heavy atom structure factors 
        and R factors are recomputed based on the new parameters. Steps 6-12 
        are then repeated to generate new protein phases, and the E values are 
        updated as in step 13. The whole process is repeated for each of the 
        NPASS passes requested. After each pass, the mean absolute phase shift 
        over all reflections is output. After the last pass, the protein phase 
        and difference coefficients files are written, and a new file
        NEWPARAMS.INP is created, which is a copy of the original input deck
        except that the new heavy atom parameters, scale and E coefficients
        replace the original ones. This deck can be used for further refinement
        in a subsequent job. Note that within a pass, protein phases are held
        fixed (except for possible removal of contributions from the derivative
        being refined). They are updated only after the end of each pass, and
        even then, only if externally derived phases are NOT being used.
        
                         ***** NOTES ON PHASE REFINEMENT *****
        
           During phase refinement, one generally excludes contributions to the
        protein phase probability distributions from the data set for which
        parameters are being refined (IEXC = 0). This is because the refinement
        assumes that the protein phases and heavy atom parameters are
        independent, which will not be true if the derivative contributed to
        the protein phases. Indeed, it may not be strictly true even if the
        derivative's contributions to the protein phases are omitted, if it has
        heavy atom sites in common with another derivative that IS
        contributing. On the
        other hand, successful phase refinement of parameters depends on
        REASONABLY ACCURATE protein phases being available. This presents a
        problem when only a few derivatives are to be used. If protein phase
        contributions come from only one derivative (the one not being
        refined), then the protein phases are very poorly determined as they
        are actually SIR phases. Phase refinement then usually results in
        reduction of the FH scale factor and most occupancies. The end result
        is a degradation of almost all statistical indicators, but little or no
        change in the figure of merit. In this case it may be desirable to
        ignore the correlation, and include all contributions to the protein
        phase (IEXC=1), which results in stable, although slow refinement. In
        that case the expected improvement is usually obtained, but the bias
        toward heavy atom phases may be slightly larger than desired. It is
        sometimes useful to do this even with 3 or more derivatives.
        
           Also, note that the R Cullis and R Kraut values are dependent on 
        the current protein phases. Thus if contributions from the set being 
        refined are excluded, these factors will generally increase as they do 
        not reflect the final protein phases, but only the phases in use at 
        the time they were computed. For this reason, it is always desirable 
        to include all possible contributions (IEXC=1) at least in the last 
        cycle, just to get the final Cullis and Kraut R factors which 
        correspond to the MIR phases for publication purposes. The parameter 
        shifts need not be used.
        
           It is often desirable to read in externally derived protein
        phases, and hold them fixed for use in heavy atom parameter refinement.
        This could be the case, for example, if the initial parameters are 
        poorly determined, but a "solvent flattened" and/or "symmetry 
        averaged" map looks reasonable. In that case, protein phases obtained 
        from the map (and possibly combined with the original phases) might be 
        better suited for parameter refinement than the original phases were. 
        These "EXTERNAL" phases can be input and used during parameter 
        refinement (NFIXP=1). In that case, the program still computes new 
        protein phases after each refinement pass for the purpose of updating 
        statistics, E values and final output, but the phases which were input 
        are ALWAYS used UNCHANGED during every refinement cycle. The output 
        phases however, will always correspond to those computed from the 
        current heavy atom parameters, and can be used to start a new round of 
        solvent flattening. IT IS STRONGLY SUGGESTED that one always do at
        least one round of refinement against solvent flattened phases in
        this manner, AND USE THE NEW PARAMETERS TO INITIATE A FINAL ROUND OF
        SOLVENT FLATTENING! 
                 
           An important aspect of phase refinement is that it enables 
        refinement of the derivative to native scaling parameters. These 
        parameters should initially be 1. and 0. for SCLFPH and BOVFPH, as 
        CMBISO or CMBANO has equated the scattering from native and derivative 
        data sets. While this is adequate for initial heavy atom determination,
        it can not be strictly correct as the presence of the additional heavy
        atoms MUST increase the scattering for the derivative crystal relative
        to that from the native. Thus refinement of the FPH scale factor should
        increase it to slightly more than unity, the exact value being limited
        by the composition of the native and derivative crystals. If the FPH
        scale factor falls below unity, it can not correspond to reality.
        There is however, no restriction on the BPH scale factor (which is
        actually a delta B, between the native and derivative data sets). Since
        the data sets have already been "thermally" scaled (in CMBISO or 
        CMBANO), refinement of BPH generally results only in small shifts,
        which can be positive or negative. Also, note that all changes in the
        derivative scaling parameters are TEMPORARILY applied internally in the
        program. The input "merged" data files for each set are NOT modified in
        any manner, and still correspond to the scaling applied in CMBISO or
        CMBANO. The cross-phase Fourier coefficient generating programs MRGDF
        and MRGBDF can apply the additional scaling parameters, if desired,
        for the purpose of generating difference or cross difference Fouriers
        which reflect the new scaling parameters. Also, note that in principle
        one can refine both the derivative FPH and FH scale factors
        simultaneously, but since they are correlated, in practice this
        sometimes leads to poor results. This is particularly the case with
        derivative anomalous scattering data. In that case, it may be best to
        refine only one of these two parameters in any given cycle, and 
        alternate refinement of them between cycles. Refinement of the native-
        derivative scale factors works best when initiated against FIXED
        EXTERNAL PHASES (e.g. solvent flattened and/or NC symmetry averaged).   

           For maximum likelihood phase refinement one has considerably
        more flexibility in the weights and in the figure of merit cutoff.
        Since the contributions will be weighted by their probabilities
        anyway, one can greatly reduce the figure of merit cutoff, perhaps
        even to include all reflections. It might also be useful to then
        refine with the "exterior" weights unity (IWT=2) so that the
        probabilities will be the only weights applied. During maximum
        likelihood refinement there is no need to exclude contributions from
        the derivative being refined. Note that maximum likelihood refinement
        can also be done with external phases (NFIXP=1). In the program,
        although contributions to the matrix (and hence the parameter shifts)
        come from all points on the probability distribution, for statistical
        purposes the R factors are still reported only while assuming
        phi(protein) = phi(best).
         
        
           In structure factor calculation mode the following events take 
        place:
            
        1) All atomic parameters are read and checked to ensure that the atom
        type is recognized, and that enough storage exists to do the 
        calculation. If any residues were targeted for rejection, atoms in the 
        residue have their scattering factors set to zero to effectively 
        eliminate them from the input list. 
            
        2) Each reflection is read in and the corresponding structure factor 
        is computed based on the atomic parameters input. The indices, FO, FC 
        and phase are stored, and sums for the least squares calculation of a 
        scale factor and for computation of the correlation coefficient 
        between observed and calculated amplitudes are incremented.
            
        3) After all structure factors are generated, the scale factor 
        relating FO to FC is computed, all FC's are rescaled and an R factor 
        (based on F) and correlation coefficient are computed.
            
        4) The R factor, correlation coefficient and number of reflections
        processed are listed.
        
        5) If both IHLCF=0 and ISIGA=0, the indices, FO, scaled FC and
        phase are output for each reflection and the program terminates.
        
        6) If IHLCF=1 and ISIGA=0, the data are sorted, and mean values of
        abs(Fo**2 - Fc**2) in various resolution shells are computed. A three
        term polynomial is then fit to the delta data as a function of
        resolution.
        
        For each reflection, the indices (and phase) are converted, if
        needed, to the "standard" asymmetric unit, and the expected value of 
        abs (Fo**2 - Fc**2) is obtained from the polynomial and is used to 
        compute Hendrickson-Lattman coefficients for the reflection using 
        Bricogne's modification of Sim's weighting scheme, i.e. 
        
        W = 2 * FO * FC / < | FO**2 - FC**2 | >

        where < > denotes the expected value at the reflection's
        sin(theta)/lambda, as obtained from the polynomial.
        
        A = W * COS (Phi calc)
        
        B = W * SIN (Phi calc)
        
        C = 0
        
        D = 0
        
        The distributions are evaluated (to get the figures of merit), and
        the indices, Fm*Fo, Fo, Phi, Hendrickson-Lattman coefs, restricted
        phase indicator, and Fm are written to the output file. A sum to 
        compute the mean figure of merit is also updated. The mean figure of
        merit is listed, and the program terminates.  
       
        7) If ISIGA > 0 the indices and phases are transformed to the
        standard asymmetric unit, the data are sorted on resolution, and are
        converted to normalized structure factors. Both sigma_A and "D" 
        values are then computed for each shell as described by Read.
        Distribution coefficients are then computed as described above except
        that

        W = 2. * Sigma_A * Eobs * Ecal / (1. - Sigma_A**2) for acentric data

        W =      Sigma_A * Eobs * Ecal / (1. - Sigma_A**2) for centric data 

        The distributions are then evaluated to get the figure of merit, and
        coefficients appropriate for conventional electron density, reduced 
        bias native or reduced bias difference maps are written to the output
        file as requested. The mean figure of merit is then reported.   
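
            The two weighting schemes of steps 6 and 7 can be sketched in
        Python as shown below. The function names are hypothetical; the
        formulas are those given above. The expected value of
        abs(Fo**2 - Fc**2) and the Sigma_A value for the shell are taken as
        inputs, since their determination is described only in outline
        here.

```python
# Sketch only: names are hypothetical, formulas follow the write-up.
import math


def sim_weight(fo, fc, mean_delta):
    """W = 2*FO*FC / <|FO**2 - FC**2|> (IHLCF=1, ISIGA=0).

    mean_delta is the expected |FO**2 - FC**2| at this resolution.
    """
    return 2.0 * fo * fc / mean_delta


def sigma_a_weight(sigma_a, e_obs, e_cal, centric):
    """Sigma_A weight; the acentric value is twice the centric one."""
    w = sigma_a * e_obs * e_cal / (1.0 - sigma_a ** 2)
    return w if centric else 2.0 * w


def hl_coefficients(w, phi_calc_deg):
    """A = W*cos(phi), B = W*sin(phi), C = D = 0 (unimodal)."""
    phi = math.radians(phi_calc_deg)
    return w * math.cos(phi), w * math.sin(phi), 0.0, 0.0


w_sim = sim_weight(10.0, 10.0, 50.0)
w_acentric = sigma_a_weight(0.5, 1.0, 1.0, centric=False)
w_centric = sigma_a_weight(0.5, 1.0, 1.0, centric=True)
a, b, c, d = hl_coefficients(w_sim, 90.0)
```

            Since C and D are always zero, every distribution produced in
        this mode has a single maximum at the calculated phase.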

           Note that the options (IHLCF=1 and ISIGA=0) or ISIGA=1 are very
        useful if one wishes to "solvent flatten" or "average" a map which is
        obtained from a model, i.e. a molecular replacement solution, since
        it provides an "MIR like" phase file which can be used to "tether"
        subsequent phase information to (via BNDRY, option 3), while the
        other options are useful for direct examination of maps or to provide
        model based phases for phase combination with MIR like information.