Description of NUCLIN.DAT


Line 1 : title (20A4)

Line 2 : I1,I2,I3,I4
Print-output requests for the NUCLIN.OUT file: for input (I1=1), for distances (I2=1), for planes and chiral volumes (I3=1), and for non-bonded contacts (I4=1). Setting a flag to 0 suppresses the corresponding output.

Line 3 : Sequence of first strand (5'- to 3'-end) (70a1)
If the strand is longer than 70 residues, insert "-" after the 70th residue (e.g. ...aucgtcgat-)

Line 4 : Sequence of second strand (5'- to 3'-end)

Repeat lines 3 and 4 for any additional strands.

Line 5 : Type(s) of ligands (if any).

End with a blank line.
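
A minimal Python sketch of how the sequence lines (Line 3 and Line 4) might be prepared for long strands. It assumes that the "-" marker is appended after each full block of 70 residues and that the remainder of the strand continues on the following line; the function name split_strand is hypothetical and not part of the program.

```python
def split_strand(sequence, width=70):
    """Split a strand (5' to 3') into NUCLIN.DAT sequence lines.

    Assumption: every block of `width` residues that is followed by more
    residues gets a trailing "-" as a continuation marker.
    """
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    # All chunks except the last are followed by more residues.
    return [chunk + "-" for chunk in chunks[:-1]] + chunks[-1:]


if __name__ == "__main__":
    for line in split_strand("gc" * 40):  # an 80-residue strand
        print(line)
```

A strand of 70 residues or fewer is returned as a single line without any marker.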

-------------------------------------------------------------------------------

The nomenclature for the bases is the following :

A : adenine;
G : guanine;
C : cytosine;
U : uracil;
T : thymine;
D : dihydrouracil;
P : pseudouracil;
Y : Y-base.

The nomenclature for the ligands is the following :

S : spermine; I : magnesium, cobalt hexammine;
O : acridine orange, proflavine; W : water;
N : user's drug (contained in DRUG.DC)

The nomenclature for the atoms is the following (M = CH3) :
for the sugar :
O3P P O1P O2P O5' C5' C4' O4' C1' C2'
O2' C2M C3' O3'
for purines :
N9 C4 C8 N7 C5 C6 N1 C2 N3 O6
N6 N2 M7 M21 M22 M1 M2
for pyrimidines :
N1 C2 C6 N3 C5 C4 O4 N4 O2 M5
for pseudouracil :
C5 C4 C6 N3 N1 C2 O2 O4 O6
for dihydrouracil :
N1 C2 C6H2 N3 C5H2 C4 O4 O2
for the Y-base :
N9 C4 C8 N7 C5 C6 N1 C2 N3 M3
O6 N2 C11 C13 C12 C14 C15 C16 C17 O18
O19 C20 N21 C22 O23 O24 C25
for spermine :
N1 C2 C3 C4 N5 C6 C7 C8 C9 N10
C11 C12 C13 N14
for hydrated magnesium and cobalt hexammine :
MG W1 W2 W3 W4 W5 W6 CO NH31 NH32
NH33 NH34 NH35 NH36
for acridine orange or proflavine :
C1 C2 C3 C4 C5 C6 C7 C8 C9 N10
C11 C12 C13 C14 N15 C16 C17 N18 C19 C20
for water :
W01 W02 W03 W04 W05 W06 W07 W08 W09 W10
W11 W12 W13 W14 W15 W16 W17 W18 W19 W20
W21 W22 W23 W24 W25

Each residue assigned to water can have only 25 oxygen atoms. If more are needed, add a residue.
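
As an illustration of the 25-oxygen limit, the following Python sketch distributes a given number of water oxygens over as many water residues as needed, using the atom names W01 to W25 listed above. The function name water_atom_names is hypothetical.

```python
def water_atom_names(n_waters, per_residue=25):
    """Group n_waters oxygen atoms into water residues of at most 25 atoms.

    Returns one list of atom names (W01, W02, ...) per water residue.
    """
    residues = []
    for start in range(0, n_waters, per_residue):
        size = min(per_residue, n_waters - start)
        residues.append([f"W{k:02d}" for k in range(1, size + 1)])
    return residues


if __name__ == "__main__":
    groups = water_atom_names(60)
    print(len(groups), "water residues;", "last atom:", groups[-1][-1])
```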

For DRUG.DC (N or #11)

DRUG.DC can easily be adapted for other specific purposes. Ligand type 11 (currently distamycin) can have up to 100 atoms. With the program "DICT", it is easy to replace the drug part with some other large RNA/DNA-binding molecule. Do not forget to change or remove the chiral centers and the planar atoms in DRUG.DC. If you need to compute chiral volumes, use the small program CHIRCALC.

-------------------------------------------------------------------------------

If standard restraints on Watson-Crick base pairs are wanted :

Next line :
jcode (i5) : distance code, normally 3 or 4. Base pairs can additionally be specially restrained with the file HBND.DAT (see below). If you do not want this option, set jcode=0.

then :
5(a1,i3,4x,a1,i3,4x) : residue names and numbers of the restrained base pairs, 5 base pairs per line. End with a blank line.
Example for the tetramer d(C-G) (jcode line followed by the base-pair line) :
    3
C  1    G  8    G  2    C  7    C  3    G  6    G  4    C  5
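
The base-pair record format 5(a1,i3,4x,a1,i3,4x) can be sketched in Python as follows. This is only an illustration of the column layout; the function name base_pair_lines is hypothetical, and trailing blanks are stripped on the assumption that the program does not need them.

```python
def base_pair_lines(pairs, per_line=5):
    """Format (name1, num1, name2, num2) tuples as 5(a1,i3,4x,a1,i3,4x).

    A final empty string stands for the blank line that ends the list.
    """
    lines = []
    for start in range(0, len(pairs), per_line):
        fields = [
            f"{a:1s}{na:3d}    {b:1s}{nb:3d}    "
            for a, na, b, nb in pairs[start:start + per_line]
        ]
        lines.append("".join(fields).rstrip())
    lines.append("")  # blank line terminates the base-pair list
    return lines


if __name__ == "__main__":
    duplex = [("C", 1, "G", 8), ("G", 2, "C", 7), ("C", 3, "G", 6), ("G", 4, "C", 5)]
    print("\n".join(base_pair_lines(duplex)))
```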

-------------------------------------------------------------------------------

Next line : XRAY or MODEL (a4)
With MODEL, the LSQ.DAT is set up so that IDALIZ=1 and ITEMP=0.

Next line : IP,IG,IT,IS,IO (5I5)
These flags select the different options: an option is activated if its variable is 1 and not if it is 0. The options are independent of each other, but each activated option requires some input lines, and these must be given in the order described below. If an option is not chosen, no input line is needed for it.

-------------------------------------------------------------------------------

Option IP :
For enforcing the pseudorotation parameters P and Tm (phase and amplitude of pseudorotation of each sugar).
If IP=1, next lines must be :

NUCL (I5)
IRES,IWP,IWTM,TAGP,TAGTM (3I5,2F6.1)
0

with NUCL = 3 for DNA / = 4 for RNA

and for each restrained sugar pucker
IRES = residue number
IWP = 1/2 tight/loose restraint on the phase
IWTM = 3/4 tight/loose restraint on the amplitude
TAGP = target value of the phase
TAGTM= target value of the amplitude
END : IRES=0
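
A short Python sketch of how the IP input lines could be written in the formats given above: NUCL in (I5), one (3I5,2F6.1) record per restrained sugar, and a final record with IRES=0. The function name ip_block is hypothetical, and writing the terminator as a right-justified 0 in an I5 field is an assumption.

```python
def ip_block(nucl, restraints):
    """Build the IP input lines.

    nucl       : 3 for DNA, 4 for RNA
    restraints : iterable of (IRES, IWP, IWTM, TAGP, TAGTM)
    """
    lines = [f"{nucl:5d}"]
    for ires, iwp, iwtm, tagp, tagtm in restraints:
        lines.append(f"{ires:5d}{iwp:5d}{iwtm:5d}{tagp:6.1f}{tagtm:6.1f}")
    lines.append(f"{0:5d}")  # IRES = 0 ends the list (assumed I5 layout)
    return lines


if __name__ == "__main__":
    # Residue 5: tight phase restraint, tight amplitude restraint.
    print("\n".join(ip_block(3, [(5, 1, 3, 162.0, 39.0)])))
```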

This routine will replace the values of the chiral volumes and of the sugar ring endocyclic angles by their correct values depending on the values of P and Tm following empirical relationships. Both soft and strong restraints on the chiral volumes are calculated. Since this procedure is time consuming, this option should be taken only when sugar puckers other than the main C(2')-endo and C(3')-endo puckers are targeted.
By "soft restraints" are meant the usual chiral centers in the sugar ring. By "strong restraints" are meant "pseudo" chiral centers in the sugar ring (for example between C(1') and O(4'), C(4'), C(2')). The soft restraints alone are not very good at producing sugar rings with correct chiral volumes and endocyclic torsion angles (i.e. meaningful pseudorotation parameters).
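
To make the link between P, Tm and the endocyclic torsion angles concrete, here is a Python sketch of the standard pseudorotation relation of Altona and Sundaralingam (cited below), written in the nu0..nu4 numbering in which nu2 is C(1')-C(2')-C(3')-C(4'). It is an illustration only: this write-up does not state which empirical relationships the program itself uses, and the function names are hypothetical. The amplitude is recovered with hypot rather than nu2/cos(P), which is equivalent for an ideal ring and avoids dividing by a small cos(P).

```python
import math


def ring_torsions(p_deg, tm_deg):
    """Endocyclic torsions nu0..nu4 (degrees) from phase P and amplitude Tm."""
    return [
        tm_deg * math.cos(math.radians(p_deg + 144.0 * (j - 2)))
        for j in range(5)
    ]


def pseudorotation(nu):
    """Recover (P, Tm) in degrees from the five torsions nu0..nu4."""
    s = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))
    a = (nu[4] + nu[1]) - (nu[3] + nu[0])  # equals 2*s*Tm*sin(P) for an ideal ring
    b = 2.0 * nu[2] * s                    # equals 2*s*Tm*cos(P)
    p = math.degrees(math.atan2(a, b)) % 360.0
    tm = math.hypot(a, b) / (2.0 * s)
    return p, tm


if __name__ == "__main__":
    nu = ring_torsions(162.0, 39.0)  # a C(2')-endo sugar
    print([round(v, 1) for v in nu], pseudorotation(nu))
```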

For references on the way sugar puckers are treated here, see :
C. Altona and M. Sundaralingam, J. Am. Chem. Soc. 94, 8205 (1972).
E. Westhof and M. Sundaralingam, J. Am. Chem. Soc. 102, 1493 (1980).
S.T. Rao, E. Westhof, and M. Sundaralingam, Acta Cryst. A37, 421 (1981).
E. Westhof and M. Sundaralingam, in "Structure and Dynamics of Nucleic Acids and Proteins" (E. Clementi and R.H. Sarma, eds.), Adenine Press, New York (1983).

-------------------------------------------------------------------------------

Option IG
For enforcing chiralities for the sugar atoms in either the C(2')-endo domain or the C(3')-endo domain. If desired, the chiralities can be left as is (i.e. soft restraints in only one puckering domain : C(3')-endo).

If IG=1, the next lines must be :

NUCL (I5)
NTIM,N1,N2,N3 IF IP=0 (16I5)
or NTIM,N1,N2,N3,N4 IF IP=1 (16I5)

with NUCL = 3 for DNA / = 4 for RNA

NTIM,N1,N2,N3,N4,N1',N2',N3',N4',N1'',N2'',N3'',N4'', etc..., where, following the sequence of residues,
N1 is the number of C(3')-endo sugars
N2 is the number of C(2')-endo sugars
N3 is the number of unchanged sugars
N4 is the number of sugars under pseudorotational restraints (if IP=1)
NTIM is the number of data to read.

This strongly enforces pure C(3')-endo (P=18 deg, Tm=39 deg) or C(2')-endo (P=162 deg, Tm=39 deg) puckers. Note, however, that the sugar geometry is left unchanged. If no unusual pucker is wanted, this is the best option.
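
The following Python sketch shows one possible reading of the N1,N2,N3(,N4) counts: the residues are scanned in sequence order and counted in repeating cycles of C(3')-endo, C(2')-endo, unchanged and (if IP=1) pseudorotation-restrained sugars, with a count of 0 where a class is absent. The labels "C3", "C2", "U" and "P", the function names, and the interpretation of NTIM as the number of N values that follow are all assumptions, not statements about the program.

```python
def ig_counts(puckers, with_ip=False):
    """Run-length encode per-residue pucker labels as [NTIM, N1, N2, N3, ...]."""
    order = ["C3", "C2", "U"] + (["P"] if with_ip else [])
    unknown = set(puckers) - set(order)
    if unknown:
        raise ValueError(f"unknown pucker labels: {sorted(unknown)}")
    counts, i = [], 0
    # Continue until all residues are counted and the last cycle is complete.
    while i < len(puckers) or len(counts) % len(order):
        kind = order[len(counts) % len(order)]
        n = 0
        while i < len(puckers) and puckers[i] == kind:
            n += 1
            i += 1
        counts.append(n)
    return [len(counts)] + counts  # NTIM (assumed meaning) comes first


def as_16i5(values):
    """Write integers in (16I5) records, 16 values per line."""
    return [
        "".join(f"{v:5d}" for v in values[k:k + 16])
        for k in range(0, len(values), 16)
    ]


if __name__ == "__main__":
    labels = ["C3"] * 4 + ["C2"] * 2 + ["U"] * 2
    print("\n".join(as_16i5(ig_counts(labels))))
```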

-------------------------------------------------------------------------------

Option IT
For restraining torsion angles of the sugar-phosphate backbone and of the dihydrouracil ring.

If IT=1, next lines must be :
IRES,IANG,IWT,TAG (3I5,F9.2)
0

where for each restrained torsion angle :
IRES = residue number
IANG = torsion number
IWT = 1/2/3 tight/medium/loose restraint
TAG = target value of the torsion angle

The torsions are numbered in the following way:

Omega=1, Phi=2, Psi=3, Psi'=4, Phi'=5, Omega'=6, Chi=7
where
Omega = O(3')-P-O(5')-C(5')
Phi = P-O(5')-C(5')-C(4')
Psi = O(5')-C(5')-C(4')-C(3')
Psi' = C(5')-C(4')-C(3')-O(3')
Phi' = C(4')-C(3')-O(3')-P
Omega' = C(3')-O(3')-P-O(5')
Chi = O(4')-C(1')-N1-C6 for pyrimidines
Chi = O(4')-C(1')-N9-C8 for purines

For reference on nucleic acid nomenclature, see :

B. Pullman, W. Saenger, V. Sasisekharan, M. Sundaralingam, and H.R. Wilson, Jerusalem Symposium Quantum Chemistry Biology 5, 815-820 (1973).

It might seem dangerous to restrain torsion angles. This should be done judiciously, as for example in helical stems. If IP=1 or IG=1, it is superfluous to restrain Psi'. Phi and Phi' are the least dangerous to restrain (around 171 deg. and 206 deg., respectively, in RNA helical stems).
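
A Python sketch of the backbone torsion records, using the numbering given above and the (3I5,F9.2) format. The names TORSION_CODES and it_block are hypothetical, the dihydrouracil case (torsion number 8, which needs an extra line) is not handled, and the I5 layout of the terminating 0 is an assumption.

```python
TORSION_CODES = {
    "Omega": 1, "Phi": 2, "Psi": 3, "Psi'": 4,
    "Phi'": 5, "Omega'": 6, "Chi": 7,
}


def it_block(restraints):
    """Build the IT input lines.

    restraints : iterable of (IRES, torsion name, IWT, TAG in degrees)
    """
    lines = []
    for ires, name, iwt, tag in restraints:
        lines.append(f"{ires:5d}{TORSION_CODES[name]:5d}{iwt:5d}{tag:9.2f}")
    lines.append(f"{0:5d}")  # IRES = 0 ends the list (assumed I5 layout)
    return lines


if __name__ == "__main__":
    # Phi and Phi' of residue 3, with the RNA helical-stem values quoted above.
    print("\n".join(it_block([(3, "Phi", 1, 171.0), (3, "Phi'", 2, 206.0)])))
```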

Torsion number = 8 restrains the torsion angles around the dihydrouracil ring. In that case the program expects the next line of the file to give the six values of those torsion angles in the order :
N1-C2-N3-C4 e.g. (-/+ 11. deg.)
C2-N3-C4-C5H2 e.g. (0.0 deg.)
N3-C4-C5H2-C6H2 e.g. (+/- 30. deg.)
C4-C5H2-C6H2-N1 e.g. (-/+ 51. deg.)
C5H2-C6H2-N1-C2 e.g. (+/- 46. deg.)
C6H2-N1-C2-N3 e.g. (-/+ 16. deg.)

In the examples, the right sign corresponds to the half-chair C(6)-exo-N(1)-exo conformation and the left one to the half-chair C(6)-endo-N(1)-endo conformation.
See
M. Sundaralingam, S.T. Rao, and J. Abola, J. Am. Chem. Soc. 93, 7055 (1971).
J. Emerson and M. Sundaralingam, Acta Cryst. B36, 537 (1980).

End : IRES = 0

-------------------------------------------------------------------------------

Option IS
For the treatment of chains related by non-crystallographic symmetry. Can be used for restraining contacts between symmetrically-related parts of a molecule (in conjunction with IO=1).

If IS=1, next lines must be :

IGR (I5)
NCHN,KNOWNR,(ID(K),K=1,NCHN) (16I5)
(NBE(K),K=1,NCHN),(NEN(K),K=1,NCHN) (16I7)
If (KNOWNR.NE.0) then
((R(I,J),J=1,3),I=1,3),(T(I),I=1,3) (12F9.4)
(IWG(I),I=1,3) (16I5)

where

IGR = number of groups (max.4)

And for each group :

NCHN,KNOWNR,(ID(K),K=1,NCHN)
NCHN = number of chains (max.8)
KNOWNR = 0/1 non-knowledge/knowledge of rotation-translation matrix
ID = chain-identification

(NBE(K),K=1,NCHN),(NEN(K),K=1,NCHN)
NBE = atom number of chain beginning
NEN = atom number of chain ending (max. 500 atoms per chain)

If (KNOWNR.NE.0), for each of the NCHN chains, give the rotation-translation matrix relating that chain to chain 1 (as in International Tables; thus vector t in Angstroms : X'= Gx'= GRx+Gt) :
((R(I,J),J=1,3),I=1,3),(T(I),I=1,3)

(IWG(I),I=1,3)
IWG = 1,2,3 / tight,medium,loose restraint for positions and B-factors

Presently,
IWG(1) is assigned to phosphate groups
IWG(2) is assigned to sugar groups
IWG(3) is assigned to bases
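
As an illustration of the rotation-translation record, the Python sketch below reads one (12F9.4) line in the order ((R(I,J),J=1,3),I=1,3),(T(I),I=1,3) and applies x' = Rx + t to a coordinate triple. It assumes that every field is filled in (blank fields are not treated as zero) and that R, t and the coordinates are expressed in the same frame; the conversion with G mentioned above is not attempted. The function names are hypothetical.

```python
def parse_12f9_4(line):
    """Split a (12F9.4) record into a 3x3 rotation R and a translation T."""
    vals = [float(line[9 * k:9 * k + 9]) for k in range(12)]
    rot = [vals[0:3], vals[3:6], vals[6:9]]  # R(I,J) with J varying fastest
    return rot, vals[9:12]


def apply_rt(rot, trans, xyz):
    """Return R*x + t for one coordinate triple."""
    return [
        sum(rot[i][j] * xyz[j] for j in range(3)) + trans[i]
        for i in range(3)
    ]


if __name__ == "__main__":
    # A 90-degree rotation about z followed by a shift of (10, 0, 5).
    record = "".join(f"{v:9.4f}" for v in [0, -1, 0, 1, 0, 0, 0, 0, 1, 10, 0, 5])
    rot, trans = parse_12f9_4(record)
    print(apply_rt(rot, trans, [1.0, 2.0, 3.0]))
```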

-------------------------------------------------------------------------------

Option IO
For partial but fixed occupancies (the occupancy is not refined but the temperature factor can be). Naturally, partial but fixed occupancies can be given directly in ATOMS.DAT without the use of this option. For the refinement of occupancies, see the write-up of "NUCLSQ".

If IO=1, next line must be :

NBG1,NED1,NBG2,NED2,QF1 (4I5,F5.2)

where

NBG1,NED1 : first and last atom # in the ATOMS.DAT file with the partial occupancy QF1
NBG2,NED2 : first and last atom # in the ATOMS.DAT file with the partial occupancy (1.-QF1)
QF1 : partial but fixed occupancy of the first group of atoms.
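
A small Python sketch of the IO record in the (4I5,F5.2) format, together with the complementary occupancy assigned to the second group of atoms. The function names and the range check on QF1 are hypothetical additions for illustration.

```python
def io_record(nbg1, ned1, nbg2, ned2, qf1):
    """Format NBG1,NED1,NBG2,NED2,QF1 as (4I5,F5.2)."""
    if not 0.0 <= qf1 <= 1.0:
        raise ValueError("QF1 must lie between 0 and 1")
    return f"{nbg1:5d}{ned1:5d}{nbg2:5d}{ned2:5d}{qf1:5.2f}"


def occupancies(qf1):
    """Occupancies of the first and second atom groups: QF1 and 1 - QF1."""
    return qf1, 1.0 - qf1


if __name__ == "__main__":
    print(io_record(101, 120, 121, 140, 0.6))
    print(occupancies(0.25))
```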