Data File Specification for Program GPRLSA
The original version of GPRLSA, called PROLSQ, was written by Wayne
Hendrickson (1978). The current program version, GPRLSA, has been
modified by W. Furey (University of Pittsburgh). The writeup is presented
in cooperation with Star Technologies, Inc.
Input files to GPRLSA are described as follows:
Unit Description
ISYSR = 5 Control Cards
IATMR = 10 Unformatted output file from PROTIN containing atomic
coordinates, distances and codes, planar groups, etc.
This file can be used for several cycles, but it should be
regenerated periodically to update the nonbonded contact list
or when converting from overall to individual thermal factors.
IREFR = 20 Observed reflection data and scattering factors. This
unformatted file was created by HKSCAT.
The file contains records of the form:
H, K, L, Fob, sigma F, sin theta/lambda, ( FII( N ), N = 1,8), Afix, Bfix
H, K, L are packed in one word.
Afix, Bfix are normally zero, but fixed atom contributions to Fcalc
(on an absolute scale) can be added here, i.e. hydrogens, anomalous
dispersion correction, etc.
where:
FII(1) = Scattering factor of Carbon at corresponding sin theta/lambda value.
FII(2) = " " " Nitrogen " " " " " .
FII(3) = " " " Oxygen " " " " " .
FII(4) = " " " Sulphur " " " " " .
FII(5) = " " " Iron (++) " " " " " .
FII(6) = " " " Hydrogen " " " " " .
FII(7) = " " " Zinc (++) " " " " " .
FII(8) = " " " Calcium (++) " " " " " .
This file will not change unless new reflection data is obtained or
fixed atom contributions are to be included or updated by another
program.
ISHFTR = 15 Shifts in refined parameters obtained from previous cycles.
(For each cycle the output file from the previous cycle should
be used as input here). The file contains accumulated shifts
from all cycles since the last run of PROTIN. It is an
unformatted file. Read only if JABN (card 10.1) > 0.
Output files generated by GPRLSA are described as follows.
Unit Description
ISYSW = 6 Messages and results.
IDISK = 3 Scratch file. The program stores the structure factor
data of the accepted reflections in this file. The file
consists of unformatted records, each containing:
H, K, L, 1., Fobs, sigma F, sin theta/lambda,
(FII( I ), I = 1, 8), Afix, Bfix
H, K, and L are stored in a vector, H(N) where N = 1 to 4.
H( 4 ) is always equal to 1., as indicated above.
(Historically the program could accept up to 18 scale
factors and this value, H( 4 ), would indicate which one
to use for the present reflection. The program now accepts
only one scale factor SC = SC (1) thus H(4) = 1. )
JDISK = 4 Scratch file, unformatted and same information as in file
IDISK but for the NSAMPL reflections selected for RTEST
only. Not used if IRTEST = 0 (card 11).
ISHFTW = 16 Contains parameter shifts from current cycle plus shifts from
previous cycles, to be used as input (file ISHFTR) in the next
cycle.
IFOFC = 31 Optional - generated only if REPORT (card 2) > 0. Contains
structure factors calculated from current model. This
file is formatted with records consisting of:
H, K, L, Fob, Fcalc, phi (phi in degrees)
in format (3I4, 2F10.2, F7.2)
IXYZB = 32 Optional - generated only if REPORT (card 2) > 0. Contains
atomic parameters for the current model i.e. all shifts from
previous cycles are first applied. Records contain:
I, ATOM(I), X(I), Y(I), Z(I), BETA(I), Q(I)
in format (I5, 1X, A8, 3F9.4, 2F6.2)
x, y, z in Angstroms along cell edges.
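The two formatted output files can be read back with fixed-column slicing. The following is a minimal Python sketch, assuming each record starts in column 1 with no carriage-control character; the function names and the sample atom name are hypothetical and are not part of GPRLSA.

```python
def parse_ifofc(line):
    """Split one IFOFC record laid out as (3I4, 2F10.2, F7.2)."""
    h = int(line[0:4])
    k = int(line[4:8])
    l = int(line[8:12])
    fob = float(line[12:22])
    fcalc = float(line[22:32])
    phi = float(line[32:39])  # phase in degrees
    return h, k, l, fob, fcalc, phi


def parse_ixyzb(line):
    """Split one IXYZB record laid out as (I5, 1X, A8, 3F9.4, 2F6.2)."""
    index = int(line[0:5])
    name = line[6:14].strip()  # column 6 is the 1X blank
    x, y, z = (float(line[14 + 9 * i: 23 + 9 * i]) for i in range(3))
    beta = float(line[41:47])
    q = float(line[47:53])
    return index, name, x, y, z, beta, q


# Hypothetical sample records built with the same column widths.
sample_fofc = "%4d%4d%4d%10.2f%10.2f%7.2f" % (1, -2, 3, 123.45, 120.0, 45.5)
sample_xyzb = "%5d %-8s%9.4f%9.4f%9.4f%6.2f%6.2f" % (
    12, "CA", 1.2345, -6.789, 10.0, 15.5, 1.0)
print(parse_ifofc(sample_fofc))
print(parse_ixyzb(sample_xyzb))
```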
File ISYSR, unit 5 - Control Information
The following section describes the input control card data read
from file ISYSR. Card numbers are enclosed in parentheses.
(1) Title FORMAT ( A )
Title to be printed in output listing (only first 40
characters are printed).
(2) General Parameters FORMAT ( 16I5 )
1- 5 NCYCCG Number of conjugate gradient iterations performed in
subroutine CGSOLV. If less than or equal to 0 the
program sets NCYCCG = 50.
6-10 LISTF = 0 Do not print calculated structure factors.
= 1 Print the complete list.
= N with 1 < N < number of observations: Print
a list of N structure factors selected at
intervals of ( NOBS / N ).
11-15 LISTA = 0 Do not print the list of parameter shifts.
= 1 Print the complete list of parameter shifts.
> 1 Print parameter shifts for atoms having a total
positional shift exceeding 0.01*LISTA angstroms.
16-20 LGX One flag per axis (x, y, z respectively):
21-25 LGY = 0 Do not constrain origin on corresponding axis.
26-30 LGZ = 1 Constrain origin (for polar space groups).
31-35 LQ Associated with scaling Lagrange multiplier elements
for constraining origin on polar axes. If LQ <= 0
the program sets LQ = 10.
36-40 REPORT = 0 normal refinement run - Do NOT write special
output files (coordinates and structure
factors). Printing of structure factors, as
determined by LISTF, is independent of this
parameter.
= 1 refinement run - Also write current coordinates
and calculated structure factors in files
IXYZB = 32 and IFOFC = 31 respectively. (Note
coordinates and phases are for model INPUT to
this cycle, i.e. apply all previous shifts prior
to calculation).
= 2 skip refinement. Calculate structure factors
based on current model (i.e. apply all previous
shifts prior to calculation), and write coordinate
and structure factor files as with REPORT=1.
41-45 IDALIZ = 0 Idealization plus structure factor refinement.
= 1 Idealization only. WARNING: Selecting this
option causes NOCC and ITEMP (see card (3)) to be
set to zero, even though you may not notice it!
PDEL (card 8.4) should be set at about 0.10 to
assure good conjugate - gradient behavior.
46-50 INCFIX = 0 for normal operation.
= 1 to include fixed atom contributions to structure
factors for each reflection.
51-55 NFCYCL = Number of refinement cycles (if REPORT=2 this is
automatically set to 0, if REPORT=1 it is set to 1),
default=1. See notes at end of writeup if NFCYCL > 1.
56-60 ITABLE = 0 Uses table lookup of trig functions in structure
factor calculation (default).
= 1 Evaluates trig functions explicitly (job takes
twice as much time, but slightly more accurate).
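As an illustration of the column layout of card 2, the sketch below writes the twelve integer fields in consecutive I5 columns. It is only a sketch under stated assumptions: the helper name format_card2 is hypothetical, fields that are not given are written as explicit zeros, and the example values are arbitrary rather than recommended settings.

```python
# Field order follows the column assignments listed for card 2.
CARD2_FIELDS = [
    "NCYCCG", "LISTF", "LISTA", "LGX", "LGY", "LGZ",
    "LQ", "REPORT", "IDALIZ", "INCFIX", "NFCYCL", "ITABLE",
]


def format_card2(**values):
    """Return a card 2 line with each field right-justified in 5 columns."""
    unknown = set(values) - set(CARD2_FIELDS)
    if unknown:
        raise ValueError("unknown card 2 fields: %s" % sorted(unknown))
    return "".join(f"{values.get(name, 0):5d}" for name in CARD2_FIELDS)


# Arbitrary example: 50 conjugate gradient iterations, origin constrained
# along y, one refinement cycle.
card = format_card2(NCYCCG=50, LGY=1, LQ=10, NFCYCL=1)
print(repr(card))
```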
(3) Sequence Information FORMAT ( 16I5 )
NOTE: The values required in the first 9 fields (columns 1-45 below)
were printed by the "PROTIN" run that prepared input file IATMR.
1- 5 NA No. of atoms in asymmetric unit
6-10 NDIS No. of distances to be restrained.
11-15 NPLN No. of planar groups in input file.
16-20 NCHR No. of chiral centers in input file.
21-25 NVDW No. of possible contacts in input file.
26-30 NTOR No. of conformational torsion angles.
31-35 NSYM1 No. of symmetry equivalencies for type 1 symmetries.
36-40 NSYM2 No. of symmetry equivalencies for type 2 symmetries.
41-45 NOCC No. of variable occupancy factors.
46-50 ITEMP = 0 use an overall thermal factor.
= 1 use individual thermal factors for each atom.
(4) Positional Parameters to FREEZE FORMAT ( 16I5 )
NKILL No. of atoms for which coordinates will be held fixed.
(up to 500 values)
(KIAT(I), I = 1,NKILL) Atom number, in input list, of the
atom which is to have its coordinates
"frozen" (example: metal atom of prosthetic group)
Probably its position is well determined and no
further refinement is desired.
(5) Unit Cell Constants FORMAT ( 10F8.3 )
a, b, c (in Angstroms), alpha, beta, gamma (in degrees)
(5a) FORMAT ( 26I3 )
Col 1-3 KILRES = The number of residues to be omitted from
structure factor calculation. Note that this does
not freeze atoms like NKILL does, but it removes
structure factor contributions from the least
squares equations. Since restraint contributions
are still included, it can be used to "idealize"
selected troublesome residues. It can also be used
to generate Fourier coefficients for "residue-deleted"
electron density maps. Maximum = 100.
Col 4-6 KRES( I ), I = 1, KILRES
7-9 KRES( I ) = the residue numbers of residues to be omitted.
etc.
(6) Reflection File Information FORMAT ( I10, 4F10.6, I10 )
1-10 NOBS = Maximum number of reflections in input file IREFR
(from HKSCAT output).
11-20 FMIN = Lower cut-off value for F. Reflections with
F < FMIN will be rejected.
21-30 SMIN = (sin theta/lambda) min reflections outside this
31-40 SMAX = (sin theta/lambda) max range will be ignored.
41-50 SIGMIN Such that if F < SIGMIN * sigma-F the reflection
is rejected.
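The rejection criteria of card 6 can be summarized in a few lines of Python. This is a sketch of the rules as worded above, not the program's own code; the function name is hypothetical, and treating a reflection exactly at SMIN or SMAX as inside the range is an assumption.

```python
def accept_reflection(f, sigma_f, s, fmin, smin, smax, sigmin):
    """Return True if a reflection passes the card 6 cut-offs.

    f       observed structure factor amplitude
    sigma_f its estimated standard deviation
    s       sin theta/lambda
    """
    if f < fmin:
        return False  # weaker than the amplitude cut-off
    if s < smin or s > smax:
        return False  # outside the resolution range (limits assumed inclusive)
    if f < sigmin * sigma_f:
        return False  # not significant relative to its sigma
    return True


print(accept_reflection(100.0, 5.0, 0.2, 10.0, 0.05, 0.25, 2.0))
```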
(6a) FORMAT ( I5 ) Statistical resolution breakdown information.
Col 1-5 NI = 0 if d min default values of 5.0, 3.0, 2.5, 2.0, 1.8,
1.5 and 1.3 are to be used in resolution breakdown
segment of program.
= 1 if defaults are to be overridden.
NOTE! Include card 7 only if NI = 1.
(7) Number and Limits of Shells for Statistics FORMAT ( I5, 15F5.2 )
1-5 N = Number of shells in which to subdivide data for
statistical analysis.
DMIN( I ), I = 1, N Resolution limits (d spacings) of the N shells
for statistical analysis.
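The shell limits are d spacings, whereas the reflection file stores sin theta/lambda. From Bragg's law, d = 1 / (2 * sin theta/lambda). The sketch below assigns a reflection to a shell using the default limits of card 6a. The function names are hypothetical, and the convention that a reflection belongs to the first shell whose limit does not exceed its d spacing is an assumption about how the program bins the data.

```python
DEFAULT_DMIN = [5.0, 3.0, 2.5, 2.0, 1.8, 1.5, 1.3]  # defaults from card 6a


def d_spacing(s):
    """Convert sin theta/lambda to a d spacing in angstroms (Bragg's law)."""
    return 1.0 / (2.0 * s)


def shell_index(s, dmin=DEFAULT_DMIN):
    """Return the 0-based shell containing the reflection, or None.

    Assumes the limits are listed from low to high resolution.
    """
    d = d_spacing(s)
    for i, limit in enumerate(dmin):
        if d >= limit:
            return i
    return None  # beyond the highest-resolution limit


print(d_spacing(0.25), shell_index(0.25))
```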
(8) Weighting Information
8.1) FORMAT (I8, 2F8.3, 8X, 6F8.3)
1-8 KFWGT Determines the weighting scheme applied to the
structure factors during refinement.
= 1 then SIGAPP = SIGDEL
= 2 then SIGAPP = max of (sigma-F, SIGDEL)
= 3 then SIGAPP = sigma-F as given in input file
= 4 then SIGAPP = max of (( SIGDEL * sigma-F / < sigma-F >),
sigma-F )
where:
1/SIGAPP**2 = structure factor weight used by the program
for refinement and statistics.
SIGDEL = AFSIG + BFSIG * (sin theta/lambda -0.1666667)
9-16 AFSIG Independent term and coefficient of (sin theta/lambda)
17-24 BFSIG term for structure factor weighting scheme as specified
by KFWGT.
25-32 Blank.
33-40 WDSKAL = Overall weight for distance restraints (usually 1.)
41-48 SIGD1 = Estimated standard deviations for distances in various
49-56 SIGD2 classes. Note actual weight used is
57-64 SIGD3
65-72 SIGD4 ( WDSKAL/SIGD(i) )**2
73-80 SIGD5
Where SIGD(i) corresponds to SIGD1, SIGD2, . . . SIGD5
according to the kind of distance that a given atom pair
determines in the code established in program PROTIN
(as set by KDWT).
(i) = 1 for bonded distances (1 (1) or 1 (3))
    = 2 for angle distances (2 (2) or 2 (4))
    = 3 for planar 1-4 distances (e.g., carbonyl O in
        residue J-1 to C alpha in residue J).
    = 4 for special input restraint distances, H-bonds, etc. (4 (4))
    = 5 not used.
(8.1a) FORMAT ( 10F8.3 ) NOTE!! Include this card only if KFWGT =4.
Col 1- 8 AVSIGF (1) = The mean ESD's for reflections in each
Col 9-16 AVSIGF (2) resolution range defined by dmin on card
6a or 7. (These can be obtained from a
previous run of GPRLSA).
etc.
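The four structure factor weighting schemes and the distance restraint weight can be written out as follows. This is a sketch of the formulas given for card 8.1, not the program's own code. The function names are hypothetical, and for KFWGT = 4 it is assumed that < sigma-F > is the AVSIGF value of the resolution range that contains the reflection.

```python
def sigdel(s, afsig, bfsig):
    """SIGDEL = AFSIG + BFSIG * (sin theta/lambda - 0.1666667)."""
    return afsig + bfsig * (s - 0.1666667)


def sigapp(kfwgt, sigma_f, s, afsig, bfsig, avsigf=None):
    """Return SIGAPP for one reflection under weighting scheme KFWGT."""
    sd = sigdel(s, afsig, bfsig)
    if kfwgt == 1:
        return sd
    if kfwgt == 2:
        return max(sigma_f, sd)
    if kfwgt == 3:
        return sigma_f
    if kfwgt == 4:
        if avsigf is None:
            raise ValueError("KFWGT = 4 needs the mean sigma-F of the shell")
        return max(sd * sigma_f / avsigf, sigma_f)
    raise ValueError("KFWGT must be 1, 2, 3 or 4")


def structure_factor_weight(sig):
    """Weight used for refinement and statistics: 1 / SIGAPP**2."""
    return 1.0 / sig ** 2


def distance_weight(wdskal, sigd):
    """Weight of a distance restraint: (WDSKAL / SIGD(i))**2."""
    return (wdskal / sigd) ** 2


# Arbitrary example values, not recommended settings.
s_app = sigapp(2, sigma_f=5.0, s=0.1666667, afsig=10.0, bfsig=-20.0)
print(s_app, structure_factor_weight(s_app))
```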
(8.2) FORMAT ( 10F8.3 )
1- 8 WPSKAL Overall weight for planar group restraints (usually 1.)
9-16 SIGP Estimated standard deviation from planarity. Actual weight
used is
( WPSKAL/SIGP )**2
17-24 WCSKAL Overall weight for chiral group restraints (usually 1.)
25-32 SIGC Estimated standard deviation for chiral volume. Actual
weight used is
( WCSKAL/SIGC )**2
33-40 WBSKAL Overall weight for thermal factor restraints (usually 1.),
used only if ITEMP (card 3) =1.
41-48 SIGB1 Estimated standard deviations from equality for thermal
49-56 SIGB2 factors of atom pair related by various distance types.
57-64 SIGB3 Actual weight used is
65-72 SIGB4 ( WBSKAL/SIGB(i) )**2
73-80 SIGB5 Where SIGB(i) corresponds to SIGB1, SIGB2 ... SIGB5,
according to the second distance kind code (KBWT) set
in PROTIN (input 4):
|--------------|----------------|---------------------------------------|
| SIGB(i)      | Distance code  | Atom pair                             |
| with i =     | in PROTIN      |                                       |
|--------------|----------------|---------------------------------------|
| 1            | 1(1)           | bonded atoms of backbone only         |
| 2            | 2(2)           | non-bonded main chain atoms           |
| 3            | 1(3)           | bonded atoms - side chain             |
| 4            | 2(4) or 4(4)   | non-bonded atoms - side chain         |
|              |                | or special input bonds                |
| 5            |                | not used                              |
|--------------|----------------|---------------------------------------|
(8.3) FORMAT ( 10F8.3 )
1- 8 WVSKAL Overall weight for VDW contact restraints (usually 1.)
9-16 SIGV Estimated standard deviation from ideal contact distance.
Actual weight used is
( WVSKAL / SIGV**2 ) ** 2
17-24 DINC(1) These parameters give the possibility of modifying the
minimum "theoretical" van der Waals contact distance as
25-32 DINC(2) computed by the PROTIN program and contained in input
file IATMR according to the value of KTYP (in input (8)
33-40 DINC(3) to PROTIN). Repulsion terms are added only if two
atoms are less than d(ideal VDW) + DINC(I) angstroms apart,
thus DINC values are usually negative to allow some flexibility
in VDW contact distances. Note attractive terms are NEVER
included.
DINC (1): For atoms with relative position determined
by only one torsion angle: single-torsion
contact.
DINC (2): Two or more torsion angles determine the
relative position of the two atoms involved:
multiple-torsion contact.
DINC (3): Possible hydrogen bond. (Contacts between
any nitrogen or oxygen atom with another
nitrogen or oxygen atom but not
N main with N main or O main with O main).
41-48 WTSKAL Overall weight for torsion angle restraints (usually 1.)
Estimated standard deviations (in degrees) from "ideal"
torsion angles for various torsion classes are now supplied.
Actual weight used is
( WTSKAL/SIGT(i) )**2
49-56 SIGT1 Sigma associated with a prespecified angle (usually phi
and psi of a regular secondary structure).
57-64 SIGT2 Sigma associated with a planar angle (e.g. omega)
65-72 SIGT3 Sigma associated with a staggered angle (e.g. chi 1 )
73-80 SIGT4 Sigma associated with an orthonormal angle
(e.g. chi 2 of aromatics).
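The van der Waals rule and the weights of card 8.3 can be sketched as below. The function names and example distances are hypothetical. Note that, as stated above, the contact weight divides by SIGV**2 before squaring, unlike the other restraint weights.

```python
def vdw_restrained(d, d_ideal, dinc):
    """True if a repulsion term is added for a contact of length d.

    d_ideal is the minimum contact distance from PROTIN and dinc is the
    DINC value for the contact type (usually negative).
    """
    return d < d_ideal + dinc


def vdw_weight(wvskal, sigv):
    """Contact restraint weight: (WVSKAL / SIGV**2)**2."""
    return (wvskal / sigv ** 2) ** 2


def torsion_weight(wtskal, sigt):
    """Torsion restraint weight: (WTSKAL / SIGT(i))**2."""
    return (wtskal / sigt) ** 2


# Arbitrary example: ideal contact 3.4 A relaxed by DINC = -0.3 A.
print(vdw_restrained(3.0, 3.4, -0.3), vdw_restrained(3.2, 3.4, -0.3))
print(vdw_weight(1.0, 0.5))
```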
(8.4) FORMAT ( 10F8.3 )
1- 8 PDEL Positional shift magnitude restraint (in angstroms). This
parameter restricts the shift magnitudes and stabilizes
ill-conditioned refinement problems. It should be used
if the problem is severely underdetermined. If zero, no
restrictions are imposed.
9-16 BDEL Shift magnitude restraint on individual thermal factors.
(in angstroms**2), used only if ITEMP (card 3) =1. Functions
similar to PDEL.
17-24 QDEL Shift magnitude restraint on variable occupancy factors.
Used only if NOCC (card 3) > 0. Functions similar to PDEL.
25-32 WSSKAL Overall weight for non-crystallographic symmetry restraints
(usually 1.)
Estimated standard deviations from superposability for
various classes of atomic parameters and restraint types
are now supplied. Actual weight used is
( WSSKAL/SIGSP(i) )**2 or
( WSSKAL/SIGSB(i) )**2
33-40 SIGSP1 Sigmas associated with positional restraints (in angstroms).
41-48 SIGSP2 (1,2,3) are for (loose, medium, tight) restraints,
49-56 SIGSP3 respectively.
57-64 SIGSB1 Sigmas associated with thermal factor restraints (in
65-72 SIGSB2 angstroms**2). (1,2,3) are for (loose, medium, tight)
73-80 SIGSB3 restraints, respectively.
(9) Overall Temperature Factor and Scale. FORMAT ( F8.3, I8, 8F8.3 )
1- 8 TO Overall temperature factor for present cycle. To be
added to individual B's in input file IATMR or used as
is if ITEMP (card 3) = 0.
9-16 NQ Historical, always = 1 (It was used to allow for more
than one overall scale factor. The current version of
the program is not able to use them).
17-24 SC(1) Overall scale factor. Used in subroutine GCALC for
calculated structure factors.
Defined such that Fobs = SC * Fcalc(absolute)
(10) Shift Damping Factors:
(10.1) Positional Damping Factors FORMAT ( I5, 15F5.2 )
JABN Number of damping factors to be read from this card.
(equal to the number of shift records in file ISHFTR).
Usually this is equal to the number of cycles that have been
run previously since the last run of PROTIN. The program
expects at least JABN * NV parameter shifts in file ISHFTR,
where NV is the total number of variables.
NV = 3 * NA + NOCC + 2 if ITEMP = 0
NV = 4 * NA + NOCC + 1 if ITEMP = 1
NA is the number of atoms and NOCC is the number
of variable occupancy factors. One record of NV values
is written per cycle of refinement. Maximum JABN=15.
DAMP( I ), I = 1, JABN. Damping factors to be applied to the
coordinate shifts of cycle I to obtain the current
refined coordinates from the starting coordinates
in file IATMR. DAMP(I) values of 1.0 apply full shift.
When running the first cycle: JABN = 0 and the parameter
shifts (file ISHFTR) are not read in.
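The bookkeeping described for card 10.1 amounts to two small calculations: the number of variables NV per shift record, and the reconstruction of the current parameters from the starting values plus damped shifts. The Python sketch below illustrates both. The function names are hypothetical, and the flat list layout of a shift set is an assumption, not the record structure of file ISHFTR.

```python
def number_of_variables(na, nocc, itemp):
    """NV as defined for card 10.1."""
    if itemp == 0:
        return 3 * na + nocc + 2
    if itemp == 1:
        return 4 * na + nocc + 1
    raise ValueError("ITEMP must be 0 or 1")


def apply_damped_shifts(start, shift_sets, damp):
    """Return start + sum over cycles of DAMP(I) * shift(I).

    start      starting values (e.g. coordinates from file IATMR)
    shift_sets one list of shifts per previous cycle
    damp       one damping factor per shift set (1.0 applies the full shift)
    """
    if len(shift_sets) != len(damp):
        raise ValueError("need exactly one damping factor per shift set")
    current = list(start)
    for factor, shifts in zip(damp, shift_sets):
        for i, delta in enumerate(shifts):
            current[i] += factor * delta
    return current


print(number_of_variables(100, 2, 0), number_of_variables(100, 2, 1))
print(apply_damped_shifts([1.0, 2.0], [[0.5, -0.5], [0.25, 0.25]], [1.0, 0.5]))
```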
(10.2) Individual Temperature Factor Damping Factors
FORMAT ( 5X, 15F5.2 )
Include this card only if ITEMP (card 3) =1
DAMB(I) I = 1, JABN: Similar to DAMP(I) in card 10.1 but for
individual thermal factor shifts. (Note that the program
expects the same value of ITEMP for all cycles. Changing
ITEMP to 1 requires outputting the current coordinates,
rerunning PROTIN and starting a new refinement series).
Thermal factors < 2.0 are reset to 2.0
(10.3) Occupancy Factor Damping Factors FORMAT ( 5X, 15F5.2 )
Include this card only if NOCC (card 3) > 0.
DAMQ(I), I = 1, JABN Similar to DAMP(I) in card 10.1, but
for variable occupancy factor shifts.
If DAMQ(I) > 0, the thermal shifts from the Ith cycle
are not applied to those atoms which
have variable occupancy factors, unless
the occupancy factor is 1. and the shift
indicates it should increase. In that case
the occupancy shift is not applied.
Permissible occupancy factor range is 0.01 < Q < 1.00
(10.4a) FORMAT ( I3 )
NSYMM = Number of equivalent positions in the space group.
(For centered cells include equivalent positions associated
with one lattice point only, i.e. for sp. gp. C2 NSYMM=2)
(10.4b) Free FORMAT
Include NSYMM symmetry cards with one equivalent position per
card, EXACTLY as in the International Tables. First card should
always be x,y,z.
Example: For Space Group P2(1)
CARD 1: X,Y,Z
CARD 2: -X, 1/2+Y, -Z
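To show what a symmetry card encodes, the Python sketch below turns an equivalent position such as the P2(1) example into a 3x3 rotation matrix and a translation vector. It is not the parser used by GPRLSA. The function name is hypothetical, and it handles only terms of the form X, Y, Z (with sign) and simple fractions such as 1/2.

```python
from fractions import Fraction


def parse_symmetry_card(card):
    """Return (rotation rows, translations) for one equivalent position."""
    rows, shifts = [], []
    for part in card.upper().replace(" ", "").split(","):
        row = [0, 0, 0]
        shift = Fraction(0)
        # Put a "+" in front of every "-" so the part splits into signed terms.
        for term in part.replace("-", "+-").split("+"):
            if not term:
                continue
            sign = 1
            if term.startswith("-"):
                sign, term = -1, term[1:]
            if term in ("X", "Y", "Z"):
                row["XYZ".index(term)] += sign
            else:
                shift += sign * Fraction(term)  # e.g. "1/2"
        rows.append(row)
        shifts.append(shift)
    if len(rows) != 3:
        raise ValueError("expected three comma-separated components")
    return rows, shifts


print(parse_symmetry_card("X,Y,Z"))
print(parse_symmetry_card("-X, 1/2+Y, -Z"))
```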
NOTE! Card 11 is needed only if REPORT (card 2) is not = 2
(11) R Test on Sample Set FORMAT ( 4I5, 10F5.2 )
IRTEST = 0 Do not carry out R test on sample set.
> 0 Perform R test on sample set. (on parameter
shifts computed in the current cycle)
NSAMPL Number of reflections that program should select
for R test.
JAPN, JABN, SHFTK(I), I=1, JABN See below
The purpose of the R test is to find out what effect different damping
factors, applied to the newly computed parameter shifts, would have.
NSAMPL reflections are selected from the set of accepted ones. Then, for
each of the first JAPN cycles, a cumulative damping factor equal to
SUM SHFTK(i) is applied to the new coordinate shifts only (SHFTK(1) in
the first cycle, SHFTK(1) + SHFTK(2) in the second, etc.). With the
current temperature factors, the values of
R = SUM |Fob - Fcalc| / SUM Fob
and
"scale" = SUM (Fcalc * Fob) / SUM (Fcalc ** 2)
are computed, both in total and in different ranges of sin theta/lambda.
JABN Total number of damping factor cycles (greater than or equal to JAPN).
This parameter indirectly determines the number of damping factor
cycles applied to the new thermal factor shifts (Number of thermal
damping cycles = JABN-JAPN). The damped thermal shifts are applied
to the coordinate shift set that gave the lowest R value earlier.
Thus
SHFTK(1-->JAPN) incremental damping factors for coordinates
SHFTK(JAPN+1-->JABN) incremental damping factors for thermal parameters.
For each cycle R values and statistics are given as described above.
Maximum allowed value of JABN = 10.
In Short:
JAPN Number of cycles for trying different values of positional parameter
damping factors to explore their effect on the R and scale of a "small"
NSAMPL set of reflections.
JABN Total number of cycles to be performed (or different damping factors
to be tried). First JAPN cycles use initial temperature factors
(changes in coordinates only). JAPN+1 to JABN cycles use
positional parameter shifts that gave minimum R in first JAPN cycles
and apply SHFTK (JAPN+1)...etc. to explore effect of damping
temperature factor shifts.
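The quantities reported by the R test can be sketched in Python as follows. The function names are hypothetical. The scale is taken here as SUM(Fcalc * Fob) divided by the sum of squared Fcalc, and it is assumed that the running sum of SHFTK restarts at zero for the thermal cycles; neither detail is confirmed by the text above.

```python
from itertools import accumulate


def r_factor(fob, fcalc):
    """R = SUM |Fob - Fcalc| / SUM Fob."""
    return sum(abs(o - c) for o, c in zip(fob, fcalc)) / sum(fob)


def scale_factor(fob, fcalc):
    """'scale' = SUM(Fcalc * Fob) / SUM(Fcalc**2)."""
    return sum(c * o for o, c in zip(fob, fcalc)) / sum(c * c for c in fcalc)


def cumulative_damping(shftk, japn):
    """Split SHFTK into running sums for positional and thermal cycles.

    The first JAPN increments apply to coordinate shifts, the rest to
    thermal factor shifts (running sum assumed to restart from zero).
    """
    positional = list(accumulate(shftk[:japn]))
    thermal = list(accumulate(shftk[japn:]))
    return positional, thermal


# Arbitrary example values.
print(r_factor([10.0, 20.0], [9.0, 22.0]))
print(scale_factor([10.0, 20.0], [5.0, 10.0]))
print(cumulative_damping([0.25, 0.25, 0.5, 0.5, 0.5], japn=3))
```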
********** Notes on Multicycle jobs **********
This version of the program has been modified to allow multiple
refinement cycles to be run in the same job (parameter NFCYCL, card 2).
On fast computers (e.g. CRAY, FPS) this is often desirable, but there
are certain ramifications described below:
1) Between cycles the parameters AFSIG and BFSIG (card 8.1) are
adjusted such that the ratio of MODELED to FITTED values found
in the first cycle is preserved. Accordingly, one should not use
the multicycling option without first running a single calculation
to determine what AFSIG and BFSIG values to input. Good results are
usually obtained when the input values are 40-50% of the FITTED values
obtained from the previous output.
2) To minimize printout, lists of bad distances, torsions, etc. are
printed only during the first cycle, but summaries for each restraint
type, R factors and parameter shifts are printed for each cycle.
WARNING! For multiple cycles, specifying printout of structure factors
(LISTF, card 2) and/or complete parameter shifts (LISTA=1, card 2)
can lead to enormous printout.
3) All parameter shift sets created in a multicycle job are combined
and output on file ISHFTW as if they were computed in a single cycle.
Thus when the shift file is read back in (now on file ISHFTR), the
parameter JABN (card 10.1) is no longer equal to the number of
CYCLES performed since the last run of PROTIN, but is the number
of refinement JOBS run in that period (Actually it is the number
of shift sets present on the file).
4) Full parameter shifts are applied during all cycles run in
the job.
5) The scale factor shift is applied between cycles, and the new
scale factor is printed out. To resume where you left off after
a multicycle job, the value given in the last "NEW SCALE FACTOR = "
line should be input for the next job (parameter SC(1), card 9).
6) If an overall thermal factor is used (parameters ITEMP, card 3
and TO, card 9) it will be held fixed during all cycles. Shifts
are computed and listed on the printout, but they are not applied.
(In most cases the calculated shift is unreasonable anyway). If
individual thermal factors are used however, full shifts are applied
exactly as they would be in normal operation.
7) If more than one cycle is computed (NFCYCL (card 2) > 1), then
the R test (card 11) is not meaningful since the damping factors are
applied only to the shifts computed in the last cycle OF THE CURRENT
JOB. Thus any factor determined in this test SHOULD NOT be applied
to the shift set written to file ISHFTW when it is used in the next run.