Using Curves+

 

INPUT DATA


The example below shows a simple input file. The compiled program is in the directory /Users/RL/Code and has the name Cur+. To compile it, first check which Fortran compiler you have available (changing the compiler in the Makefile if necessary; gfortran is the default), then type "make" in the directory containing the untarred source code.

The notation <<! allows the input data to be placed immediately after the line which calls the program. The input is then ended by the exclamation mark.

The example analyzes a duplex DNA from the file 1bna.pdb in the directory /Users/RL/Code/str. The extension .pdb need not be given. The output file will be r+bdna.lis (in the Code directory) and the base and backbone reference geometries will be taken from the files standard_b.lib and standard_s.lib (again in the Code directory).

The two lines at the end of the input (1:12 and 24:13 in the example) give the numbers corresponding to the order of the subunits containing the nucleotides which constitute each strand (1=1st, 2=2nd, …). Note that proteins, water and HETATM records are automatically removed at input. Contiguous numbers can be indicated using a colon (I:J). Bases can be excluded from axis calculations by entering them in these input lines as negative numbers (useful for flipped-out bases or abasic nucleotides), and gaps are indicated by zeros. Nucleotide numbers in each strand should be organized so that paired bases occur in identical positions.

rm r+bdna*.*

/Users/RL/Code/Cur+ <<!

&inp file=str/1bna, lis=r+bdna,

lib=/Users/RL/Code/standard,

&end

2 1 -1 0 0

1:12

24:13

!
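As an illustration of the strand notation, the Python sketch below expands strand lines such as 1:12 and 24:13 into explicit subunit numbers and pairs them position by position. The function name expand_strand_line is hypothetical and not part of Curves+. The sketch assumes that a range I:J with I > J counts downwards (as 24:13 suggests) and that negative numbers and zeros are written as separate tokens.

```python
def expand_strand_line(line):
    """Expand a strand line such as '1:3 -4 0 6' into a list of integers.

    Assumption: 'I:J' is an inclusive range that counts downwards when I > J.
    Negative numbers (excluded bases) and zeros (gaps) are kept unchanged.
    """
    numbers = []
    for token in line.split():
        if ":" in token:
            start, end = (int(part) for part in token.split(":"))
            step = 1 if end >= start else -1
            numbers.extend(range(start, end + step, step))
        else:
            numbers.append(int(token))
    return numbers


# The two strand lines of the example input.
strand1 = expand_strand_line("1:12")
strand2 = expand_strand_line("24:13")

# Paired bases occupy identical positions in the two lists.
pairs = list(zip(strand1, strand2))
for first, second in pairs:
    print(first, second)
```

With the example input, the first pair printed is subunit 1 with subunit 24 and the last is subunit 12 with subunit 13, which is the usual numbering of an antiparallel 12 base pair duplex.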


NAMELIST VARIABLES


CHARACTER (strings without quotes):

file:  file name for input structure (.pdb and .mac extensions need not be given in input, use .trj for MD trajectories)

ftop:  name (with extension) of file with topological data (AMBER format) for MD trajectory analysis

lis: root file name for output (.lis, .cda, _X.pdb, _B.pdb)

lib: root file name for base (_b.lib) and backbone (_s.lib) geometry files

back (P): atom used to define the backbone. Different backbones can use different atoms if necessary (e.g. P/C5* - a slash is used to separate input names).


LOGICAL SWITCHES (enter as .t. or .f., the default value of each variable is given):

circ (.f.): .t. implies a closed circular nucleic acid

line (.f.): if .t. find the best linear helical axis

zaxe (.f.): if .t. use the Cartesian Z-axis as the helical axis

fit (.t.): if .t. fit standard bases to the input coordinates (important for MD snapshots to avoid base distortions leading to noisy helical parameters)

test (.f.): if .t. provide additional output in the .lis file on fitting and axis generation

refo (.f.): if .t. use the old Curves convention for defining the reference frame of each base, rather than the more recent Tsukuba convention


REAL DATA (default value of each variable is given):

wback (2.9 Å): radius of backbone around spline through phosphate atoms for groove width calculations

wbase (3.5 Å): half-width of base pairs for groove depth calculations


INTEGER DATA (default value of each variable is given):

isym (1):  symmetry repeat for generating helical axis (1 = mononucleotide, 2 = dinucleotide e.g. ATATAT…, any positive value is allowed)

itst (0): first snapshot to analyze (if reading MD trajectory)

itnd (0): last snapshot to analyze (if reading MD trajectory)

itdel (1): spacing of snapshots to analyze

n.b. if itst=n and itnd ≤ itst, analyze only snapshot number "n"
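The snapshot selection described by itst, itnd and itdel can be sketched in Python as follows. This is only an illustration of the rule stated above, not code from Curves+: the function name selected_snapshots is hypothetical, snapshots are assumed to be numbered from 1, and itnd is assumed to be inclusive. The behavior for the default values (itst=0, itnd=0) is not described here, so the sketch refuses that case rather than guessing.

```python
def selected_snapshots(itst, itnd, itdel=1):
    """Return the snapshot numbers selected by itst, itnd and itdel.

    Assumptions: snapshots are numbered from 1 and itnd is inclusive.
    """
    if itst < 1 or itdel < 1:
        raise ValueError("this sketch only covers itst >= 1 and itdel >= 1")
    if itnd <= itst:
        # n.b. rule: only snapshot number itst is analyzed
        return [itst]
    return list(range(itst, itnd + 1, itdel))


print(selected_snapshots(1, 10, 3))
print(selected_snapshots(5, 0))
```

The first call selects every third snapshot from 1 to 10, the second selects snapshot 5 alone.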


INPUT FILES


Structure

Cur+ can analyze single structures from .pdb or .mac files (the latter is only of interest for users of JUMNA) or a series of snapshots from an AMBER MD simulation (a .trj file). Analyzing a trajectory also requires reading a topology file, which defines the atom types and the order of the atoms that will be found in the .trj file. Note that only ATOM lines are used in .pdb files and that unit numbers are not used - numbers in the Curves+ input refer to the order in which subunits appear in the .pdb file.


Libraries

standard_b.lib contains the standard geometry for the bases which can be analyzed. Below is the data for cytosine:

C Y 7 'Cytosine'

1.94866 -1.45161 -0.19029 'C1*'

0.74385 -0.58352 -0.07229 'N1'

0.93020  0.79520 -0.12929 'C2'

-0.15547  1.60399 -0.02329 'N3'

-1.37969  1.08626  0.13371 'C4'

-1.59254 -0.32937  0.19471 'C5'

-0.49500 -1.12092  0.08671 'C6'

The first line gives the one-letter code for the base, R/Y specifying purine or pyrimidine, the number of ring atoms (plus the sugar C1*) and the full base name (≤ 8 chars). The following lines give the x,y,z coordinates of each atom and the atom name. The first three atoms must be C1*, the base atom bound to C1* and the atom used to define the base normal using the vector product (1-2)x(3-2). This data is used for least squares fitting of standard base geometries. The base atoms need not be in the standard order in the structural input.
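To make the layout of a base entry concrete, the Python sketch below reads the cytosine data shown above and evaluates the vector product (1-2)x(3-2) for its first three atoms. It is a minimal illustration, not the Curves+ implementation: the function names parse_base_entry and base_normal are hypothetical, (1-2) is read as the position of atom 1 minus the position of atom 2, and the result is normalized to unit length for convenience.

```python
import math
import shlex

CYTOSINE = """\
C Y 7 'Cytosine'
1.94866 -1.45161 -0.19029 'C1*'
0.74385 -0.58352 -0.07229 'N1'
0.93020  0.79520 -0.12929 'C2'
-0.15547  1.60399 -0.02329 'N3'
-1.37969  1.08626  0.13371 'C4'
-1.59254 -0.32937  0.19471 'C5'
-0.49500 -1.12092  0.08671 'C6'
"""


def parse_base_entry(text):
    """Parse one base entry: a header line followed by the atom lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    code, ring_type, count, name = shlex.split(lines[0])
    atoms = []
    for ln in lines[1:1 + int(count)]:
        x, y, z, atom_name = shlex.split(ln)
        atoms.append((atom_name, (float(x), float(y), float(z))))
    if len(atoms) != int(count):
        raise ValueError("atom count does not match the header line")
    return {"code": code, "ring_type": ring_type, "name": name, "atoms": atoms}


def base_normal(atoms):
    """Unit vector along (atom1 - atom2) x (atom3 - atom2)."""
    p1, p2, p3 = (xyz for _, xyz in atoms[:3])
    a = [p1[i] - p2[i] for i in range(3)]
    b = [p3[i] - p2[i] for i in range(3)]
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    length = math.sqrt(sum(c * c for c in cross))
    return tuple(c / length for c in cross)


base = parse_base_entry(CYTOSINE)
normal = base_normal(base["atoms"])
print(base["code"], base["ring_type"], base["name"], normal)
```

For the cytosine data the normal lies close to the z-axis, as expected for a base whose atoms are nearly in the xy plane.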

standard_s.lib contains the description of the backbone geometry to be analyzed. Standard lines start with a blank and define a single torsion by giving 4 atom names (≤ 4 chars) followed by the name of the torsion (≤ 6 chars). Lines starting with 'B' define torsions which depend on the type of base: atoms 1-4 are used with purines and atoms 5-8 are used with pyrimidines.

Lines starting with 'S' define sugar rings and give the names of the 5 ring atoms in the order used for pseudorotation calculations. The data below is for standard DNA. Non-standard bases or backbones can be analyzed by modifying the .lib files.


  '-O3*' 'P'   'O5*' 'C5*'  'Alpha'

  'P'    'O5*' 'C5*' 'C4*'  'Beta'

  'O5*'  'C5*' 'C4*' 'C3*'  'Gamma'

  'C5*'  'C4*' 'C3*' 'O3*'  'Delta'

  'C4*'  'C3*' 'O3*' '+P'   'Epsil'

  'C3*'  'O3*' '+P'  '+O5*' 'Zeta'

B 'O4*'  'C1*' 'N9'  'C4'  'O4*'  'C1*' 

                ...  'N1'  'C2'   'Chi'

S 'C1*' 'C2*' 'C3*'  'C4*' 'O4*'
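The three kinds of line in standard_s.lib can be told apart by their first character, as the Python sketch below illustrates. It is not the parser used by Curves+: the function name parse_backbone_line is hypothetical, and the 'B' line is assumed to have been joined into a single line of eight atom names plus the torsion name, so the continuation shown in the listing above is not handled.

```python
import shlex


def parse_backbone_line(line):
    """Classify one backbone library line and extract its quoted fields."""
    flag = line[0] if line[:1] in ("B", "S") else " "
    fields = shlex.split(line[1:] if flag != " " else line)
    if flag == "S":
        if len(fields) != 5:
            raise ValueError("a sugar line needs 5 ring atom names")
        return {"kind": "sugar", "atoms": fields}
    if flag == "B":
        if len(fields) != 9:
            raise ValueError("a 'B' line needs 8 atom names and a torsion name")
        return {"kind": "base_torsion", "purine": fields[:4],
                "pyrimidine": fields[4:8], "name": fields[8]}
    if len(fields) != 5:
        raise ValueError("a torsion line needs 4 atom names and a torsion name")
    return {"kind": "torsion", "atoms": fields[:4], "name": fields[4]}


gamma = parse_backbone_line("  'O5*'  'C5*' 'C4*' 'C3*'  'Gamma'")
# Assumption: the continued 'B' line written out on one line.
chi = parse_backbone_line("B 'O4*' 'C1*' 'N9' 'C4' 'O4*' 'C1*' 'N1' 'C2' 'Chi'")
sugar = parse_backbone_line("S 'C1*' 'C2*' 'C3*'  'C4*' 'O4*'")
print(gamma["name"], chi["purine"], chi["pyrimidine"], sugar["atoms"])
```

Keeping the purine and pyrimidine atom lists separate mirrors the description above, where atoms 1-4 apply to purines and atoms 5-8 to pyrimidines.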


OUTPUT FILES

The .lis output from Cur+ lists the input parameters, the base sequence of each strand, then: (A) base pair-axis parameters; (B) intra-base pair parameters; (C) inter-base pair parameters; (D) backbone parameters and (E) groove parameters. In section (C), the first six parameters, including rise and twist, correspond to the transformation between successive base pairs. H-Ris and H-Twi correspond to the translation and rotation of successive base pairs along and around the helical axis. Together with the base pair-axis parameters, H-Ris and H-Twi are useful for understanding the overall helical structure of the fragment analyzed. Graphic output is a _X.pdb file containing the helical axis and a _B.pdb file showing the backbone splines and the vectors defining groove widths. Analyzing snapshots from an MD trajectory (which requires both trajectory and topology input files) suppresses the parameters in the .lis output. Both single structures and trajectories generate an unformatted data file (.cda) which can be analyzed with the program Canal - see separate documentation.