
Spock has a sophisticated atom selection feature, allowing for the
creation of complex queries. The individual query elements can be nested,
and boolean ``and'' and ``or'' operations applied. Curly braces ``{''
and ``}'' can be used to group elements to change the order of
operations. Various numerical relationships can be indicated by the
symbols ``='' (equals), ``<>'' (not equals), ``>'' (greater than),
``<'' (less than), ``>='' or ``=>'' (greater than or equals) and ``<=''
or ``=<'' (less than or equals). For strings, only the ``='' and ``<>''
relations are valid. The exact syntax used to combine query elements will
be described in §5.4, after the simple query
elements have been introduced. A few definitions are necessary in order
to understand the syntax of the simple queries.
- A RANGE is specified as
something like ``(A,B)'' and stands for all items between
``A'' and ``B'' inclusive.
- A LIST uses square brackets, and
specifies each item by enumeration. For example, [A,D,F,E]
specifies only items A, D, F and E.
- A NUMBER is a simple, single number which may be
either an integer or a real number, depending on where it is used.
For RANGEs and LISTs, only equals (``='') and not
equals (``<>'') are valid relationships, e.g. rn<>[1,2,3] or
an=(1,2).
Note: As a typographic convention, in the manual, a construct
like ``<RANGE|LIST|NUMBER>'' indicates that a RANGE, or
LIST or NUMBER may be used, but the angle brackets and vertical bar
are not a part of the expression. For a LIST, however, the
square brackets are a required part of the expression.
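
The definitions above can be sketched in a few lines of Python. This is only an illustrative model of the RANGE, LIST and NUMBER rules as the manual states them, not spock's own parser; the function names parse_operand and matches are hypothetical.

```python
def parse_operand(text):
    """Classify an operand: '(A,B)' is a RANGE, '[A,B,...]' a LIST, else a NUMBER."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        low, high = (float(t) for t in text[1:-1].split(","))
        return ("range", low, high)
    if text.startswith("[") and text.endswith("]"):
        return ("list", [float(t) for t in text[1:-1].split(",")])
    return ("number", float(text))


def matches(value, relation, operand_text):
    """Test one numeric value against a relation and an operand."""
    operand = parse_operand(operand_text)
    if operand[0] == "number":
        number = operand[1]
        outcomes = {
            "=": value == number, "<>": value != number,
            ">": value > number, "<": value < number,
            ">=": value >= number, "=>": value >= number,
            "<=": value <= number, "=<": value <= number,
        }
        if relation not in outcomes:
            raise ValueError("unknown relation: " + relation)
        return outcomes[relation]
    # RANGEs are inclusive at both ends; LISTs match by enumeration.
    if operand[0] == "range":
        inside = operand[1] <= value <= operand[2]
    else:
        inside = value in operand[1]
    if relation == "=":
        return inside
    if relation == "<>":
        return not inside
    raise ValueError("only = and <> are valid for a RANGE or LIST")
```

Under this model, matches(4, "=", "(3,5)") is true and matches(2, "<>", "[1,2,3]") is false, while an ordering relation applied to a RANGE or LIST is rejected.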
Perhaps the most common types of selections are based on the atom and
residue numbering. The commands take the form of:
- rn=<RANGE|LIST|NUMBER>: select on the residue number.
- an=<RANGE|LIST|NUMBER>: select on the internal
atom number.
- pdban=<RANGE|LIST|NUMBER>: select on the PDB
atom number.
The reason for the difference between pdban and an
is that spock assigns a sequential number for each atom it reads, and each
atom is uniquely identified by this number. If there is more than one
file read in, the first atom of the second file will not have an=1,
but rather, the an will be one greater than the last atom of the
first file. In each case, the pdban is the number assigned to the
atom by the PDB file. Further, if there are missing residues in the PDB
file, PDB atom numbers may be missing. There are no corresponding gaps in
the an field. Several different atoms may have the same pdban.
- bc=1,rn=1: assign bond color 1 to residue 1
- bc=1,rn=(3,5): assign bond color 1 to residues 3-5
inclusive (residues 3,4, and 5).
- bc=1,rn=[1,8,20,42]: assign bond color 1 to residues 1,
8, 20, and 42.
- lc=2,an>3: assign color 2 to labels for all atoms
except atoms 1, 2 and 3.
- ac=3,an<>2: assign color 3 for all atoms except atom 2
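
The difference between an and pdban described above can be modelled with a short Python sketch. It assumes only what the text states (one running counter across all files read); the function assign_atom_numbers and the dictionary layout are hypothetical and are not spock's internal data structures.

```python
def assign_atom_numbers(files):
    """files: one list of PDB atom numbers per file, in the order read."""
    atoms = []
    next_an = 1  # internal numbers run sequentially across all files
    for pdb_numbers in files:
        for pdban in pdb_numbers:
            atoms.append({"an": next_an, "pdban": pdban})
            next_an += 1
    return atoms


# Two files: the first has a gap in its PDB numbering (3 and 4 are missing).
atoms = assign_atom_numbers([[1, 2, 5], [1, 2]])
same_pdban = [a["an"] for a in atoms if a["pdban"] == 1]
```

Here the first atom of the second file receives an=4, the an values have no gaps, and two different atoms (an=1 and an=4) share pdban=1.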
You can also
select groups of atoms by the atom, residue, chain, or subset name. Spock
follows PDB naming conventions in that atoms have names that are four
characters long, residues (and bases) have names of three characters, and
chains have 1 character identifiers. The commands are:
- a=name: select atoms named ``name''.
- a=<ba|sch>: selects backbone or side-chain atoms
respectively. Works for both proteins and nucleic acids. For
nucleic acids, the backbone is defined as the sugar and the
phosphate.
- r=name: select residues named ``name''.
- r=SET:
select residues belonging to the
pre-defined set SET. Possible values of SET are:
substrate, aliphatic, hydroxyl, sulfur, aromatic, charged, amide,
hydrophobic, polar, neutral, acidic, basic, small, medium, large,
cyclic, dna, aa, at, cg, purine, pyrimidine, and wat.
r=helix, r=sheet, r=turn and r=coil may also be used,
although the definition of those sets changes according to the
``structure'' property. See Appendix A for
the definition of the predefined sets. Note that spock is pretty
simple-minded about what it considers a substrate. Basically, if
it's not an amino acid, a nucleotide, or water, it's a substrate.
Non-standard residues will most likely be identified as substrates.
- ch=X: select chains identified via the single letter
``X''.
- altloc=X: select atoms with the alternate location
specifier ``X''. This is used by the PDB to indicate multiple
conformations of a single amino acid.
- icode=X: select atoms with the insertion code
specifier ``X''. This is used by the PDB to indicate residues
inserted into a sequence, and is intended to allow residue
numbering to be consistent between structures.
- sub=<name|number>: select atoms in the previously
defined subset called ``name'', or numbered ``number''. See §
6.8.
- vsub=<name|number>: select vertices in the previously
defined vertex subset called ``name'', or numbered ``number''.
See §6.8.
Spock string comparisons for selections are in all cases
converted to upper case before comparison. Note that spock has two modes
of string comparison, ``strict'' and ``loose''. Loose is the default
which means that spaces are ignored, and the wildcards ``?'' and ``$''
are supported (see below). Further, in loose mode, spock simply checks to
see if the query string is contained in the internal name. Therefore
``C'' would match ``CA'' and ``CB''. If you want to
match only atoms named ``C'' you must use strict comparisons, in
which the strings must match exactly, with the exceptions that spaces are
represented on the command line by the underscore character ``_'', and
all strings are converted to upper case. The correct query string would
then be ``_C__''. An underscore character in the query string
tells spock to use strict comparison, otherwise it uses loose string
comparison.
In order to use the wildcards ``?'' and ``$'' you must use quotes around
the string. The question mark stands for any single character and the
dollar sign stands for any ending. For example, r="AS?" will match
ASP and ASN, while r="A$" will match ALA, ASP, ASN, ARG, and so on.
For a quoted comparison of this type, you may use spaces; underscores are
not necessary. Using quotes does not imply strict comparison. For
instance, r="AL" will match both ALanine and vALine.
- a=c: selects all atoms with ``C'' in the name. (C,
CA, CB...)
- a=CA: selects atoms with ``CA'' in the name.
- a=_O__: selects ``O'' but not ``OE1'' or ``OE2''
- a=sch: selects all side-chain atoms
- ch>B: selects all atoms in chains whose identifier is
lexically greater than B, i.e. (C...Z).
- r="AS?": selects all atoms in ASP and ASN residues (or
any other last letter).
- r<>ala: selects all atoms not in alanine residues.
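
The loose and strict comparison rules can be summarized in a rough Python model. It follows the behavior described above, with one assumption the manual does not spell out: a quoted pattern containing ``?'' or ``$'' is taken to be anchored at the start of the name. The function name_matches is hypothetical and is not part of spock.

```python
import re


def name_matches(query, internal_name):
    """Compare a query string against a PDB-style internal name."""
    name = internal_name.upper()
    quoted = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    text = (query[1:-1] if quoted else query).upper()
    if quoted and ("?" in text or "$" in text):
        # Wildcards: '?' is any single character, '$' is any ending.
        # Anchoring at the start of the name is an assumption of this sketch.
        pattern = "".join(
            "." if ch == "?" else ".*" if ch == "$" else re.escape(ch)
            for ch in text.replace(" ", "")
        )
        return re.fullmatch(pattern, name.replace(" ", "")) is not None
    if "_" in text:
        # Strict: underscores stand for spaces and the names must match exactly.
        return text.replace("_", " ") == name
    # Loose: spaces are ignored and the query need only be contained in the name.
    return text.replace(" ", "") in name.replace(" ", "")
```

With this model, a=c matches both " C  " and " CA ", a=_C__ matches only " C  ", and r="AL" matches both ALA and VAL.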
Spock can also
select groups of atoms based on the atom properties fields, such as:
charge, distance, surface area, radius, etc., and vertices based on vertex
property fields, such as: vertex charge, vertex color, distance, etc. The
commands are shown below:
- q=<RANGE|LIST|NUMBER>: select based on charge
- d=<RANGE|LIST|NUMBER>: select based on the distance
property
- dp=<RANGE|LIST|NUMBER>: select on the atom number of
the distance partner
- sa=<RANGE|LIST|NUMBER>: select based on surface area
- p=<RANGE|LIST|NUMBER>: select based on potential
- ap1=<RANGE|LIST|NUMBER>: select based on general
property 1
- ap2=<RANGE|LIST|NUMBER>: select based on general
property 2
- bcnt=<RANGE|LIST|NUMBER>: select based on the number
of bonds the atom forms
- hbcnt=<RANGE|LIST|NUMBER>: select based on the number
of hydrogen bonds the atom forms
- radius=<RANGE|LIST|NUMBER>: select based on the radius
of the atoms.
- x=<RANGE|LIST|NUMBER>: select based on the
untransformed x coordinate of the atom
- y=<RANGE|LIST|NUMBER>: select based on the
untransformed y coordinate of the atom
- z=<RANGE|LIST|NUMBER>: select based on the
untransformed z coordinate of the atom
- X=<RANGE|LIST|NUMBER>: select based on the
transformed X coordinate of the atom
- Y=<RANGE|LIST|NUMBER>: select based on the
transformed Y coordinate of the atom
- Z=<RANGE|LIST|NUMBER>: select based on the
transformed Z coordinate of the atom
- vx=<RANGE|LIST|NUMBER>: select based on the
untransformed x coordinate of the vertex
- vy=<RANGE|LIST|NUMBER>: select based on the
untransformed y coordinate of the vertex
- vz=<RANGE|LIST|NUMBER>: select based on the
untransformed z coordinate of the vertex
- vX=<RANGE|LIST|NUMBER>: select based on the
transformed X coordinate of the vertex
- vY=<RANGE|LIST|NUMBER>: select based on the
transformed Y coordinate of the vertex
- vZ=<RANGE|LIST|NUMBER>: select based on the
transformed Z coordinate of the vertex
- vp=<RANGE|LIST|NUMBER>: select based on vertex
potential
- vd=<RANGE|LIST|NUMBER>: select based on the vertex
distance property
- vp1=<RANGE|LIST|NUMBER>: select based on vertex
general property 1
- vp2=<RANGE|LIST|NUMBER>: select based on vertex
general property 2
- iv=<RANGE|LIST|NUMBER>: select interactions based on
the interaction intensity value, or select atoms which have an
interaction with the specified intensity.
- it=<RANGE|LIST|NUMBER>: select interactions based on
the interaction type, or select atoms which have an interaction
with the specified type.
- ip=1, or ip=-1: select the most positive or most
negative interaction for each atom.
The ``distance'' and ``distance partner'' properties will be
explained fully in §6.6.1. For now it will
suffice to know that spock is able to calculate the distance between two
sets of atoms and store the minimum distance (on a per-atom basis) in the
distance property. This distance is the distance to some other
atom, called the distance partner. The atom number of the distance
partner is stored in the dp field.
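
As a rough illustration of the idea (not of the actual procedure in §6.6.1), the following Python sketch computes, for each atom in one set, the minimum distance to a second set and the atom number of the partner. The function set_distance_property and the d/dp dictionary keys are hypothetical names chosen to mirror the property names above.

```python
import math


def set_distance_property(coords, group_a, group_b):
    """coords maps atom number -> (x, y, z); returns d and dp for each atom in group_a."""
    result = {}
    for a in group_a:
        # The distance partner is the closest atom of the other set.
        partner = min(group_b, key=lambda b: math.dist(coords[a], coords[b]))
        result[a] = {"d": math.dist(coords[a], coords[partner]), "dp": partner}
    return result


coords = {1: (0.0, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (0.0, 4.0, 0.0), 4: (1.0, 0.0, 0.0)}
props = set_distance_property(coords, [1, 2], [3, 4])
```

In this example atom 1 gets d=1.0 with dp=4, and atom 2 gets d=2.0, also with dp=4. A selection such as d<5 would then be evaluated against the stored values.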
The bcnt and hbcnt options are most
useful with the projection syntax described in §
5.4.1. With these properties, you can, for instance,
hide all residues which don't have any hydrogen bonds. The command
bc=0,bcnt=0 is also useful, in that it will turn off the ``cross''
representation of non-bonded atoms, such as waters or metal ions.
Note that there are two types of ``action commands'' (§5.2) in
spock--those that apply to atoms, worms, and the like, and those that
apply to surfaces. Since each vertex in a surface is associated with an
atom (its ``owner'') it makes sense to apply atom selections to vertices.
For example, the command vc=9,r=ala will color all surface vertices
``owned'' by alanine residues with color 9. To repeat, any atom selection
will be mapped to the appropriate vertices. However, going the other way
does not make sense, and is not allowed. For instance, the command
bc=0,vp>0, an attempt to hide bonds associated with vertices with a
positive potential, is not allowed, and will fail. It's likely that any
given atom has several vertices associated with it, and there's no logical
way to determine which of the several vertices should be used to determine
the outcome of the comparison. Using vertex selections on vertex actions
is fine, of course. To summarize, you can use atom selections with atom
actions, ``promote'' atom selections to apply to vertices, or use vertex
selections on vertices. The valid vertex actions are vc, set vp, set
vp1, set vp2, and set vdistance.
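
The promotion of an atom selection to vertices, and the reason the reverse is ambiguous, can be pictured with a small Python sketch. The vertex records and the function names below are hypothetical; the only assumption taken from the text is that each vertex has exactly one owner atom.

```python
def promote_to_vertices(selected_atoms, vertices):
    """Return the vertex numbers whose owner atom is in the atom selection."""
    selected = set(selected_atoms)
    return [v["vn"] for v in vertices if v["owner"] in selected]


def potentials_by_owner(vertices):
    """Group vertex potentials by owner atom, to show why vp cannot select atoms."""
    grouped = {}
    for v in vertices:
        grouped.setdefault(v["owner"], []).append(v["vp"])
    return grouped


vertices = [
    {"vn": 1, "owner": 10, "vp": 0.2},
    {"vn": 2, "owner": 10, "vp": -0.1},
    {"vn": 3, "owner": 11, "vp": 0.4},
]
promoted = promote_to_vertices([10], vertices)
grouped = potentials_by_owner(vertices)
```

Selecting atom 10 maps cleanly onto vertices 1 and 2. Going the other way, atom 10 owns one vertex with a positive potential and one with a negative potential, so a test such as vp>0 has no single answer for that atom.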
The coordinate selection commands are particularly useful, as they can
quite quickly help to eliminate unwanted objects from the display. The
untransformed selections x,y,z,vx,vy and vz are
straightforward: the selection string selects based on the value of the
given coordinate in the world coordinate frame. The transformed
selections are quite different, however. Imagine the viewing volume on
the screen to be a cube centered at {0,0,0} and ranging from -1 to +1 in
each dimension. It is this coordinate frame that the transformed
selection commands use. For instance bc=0,X>0 would hide all bonds
on the right half of the screen, no matter what the current viewing
transformation. These commands are particularly useful with surfaces,
as they allow you to clip away the front part of a surface with ease. For
instance, with a surface defined, vc=0,vZ>0 will clip away the front
half of the surface. As above, applying an atom coordinate selection to a
vertex action is permissible (the atom selection will be promoted to the
vertices), but the reverse is not possible.
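
The difference between the untransformed and transformed selections can be illustrated with a simplified Python sketch. It assumes the viewing transformation is just a rotation about a center followed by a uniform scale; spock's real transformation is not described in this section, and the names transform and select_right_half are hypothetical.

```python
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
FLIP_Y = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))  # 180 degree rotation about the y axis


def transform(point, rotation, center=(0.0, 0.0, 0.0), scale=1.0):
    """Map a world coordinate into the (assumed) screen-centered frame."""
    shifted = [p - c for p, c in zip(point, center)]
    return tuple(scale * sum(r * s for r, s in zip(row, shifted)) for row in rotation)


def select_right_half(atoms, rotation):
    """Model of an X>0 selection: test the transformed X, not the world x."""
    return [an for an, xyz in atoms.items() if transform(xyz, rotation)[0] > 0]


atoms = {1: (0.5, 0.0, 0.0), 2: (-0.5, 0.0, 0.0)}
before = select_right_half(atoms, IDENTITY)
after = select_right_half(atoms, FLIP_Y)
```

With the identity view, X>0 picks atom 1. After the molecule is turned around, the same selection picks atom 2, whereas a selection on the untransformed x would still pick atom 1.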
Spock can also select by object or molecule number. Each molecule read in
is given a unique number, and the list of molecules is displayed after the
file is read, or one can be generated via a mlist command.
Similarly, each defined helix, sheet and turn is given a number, which can
be used to select.
- m=<RANGE|LIST|NUMBER>: select on molecule number;
selects all atoms in the specified molecule(s).
- model=<RANGE|LIST|NUMBER>: select on model number, for
NMR and theoretical structures with more than one model. Selects
all atoms in the specified model(s).
- vn=<RANGE|LIST|NUMBER>: select on the vertex number
(for surfaces). Vertex numbers are assigned sequentially when
surfaces are constructed.
- s=<RANGE|LIST|NUMBER>: select on surface number. Each
surface that's built is assigned a number sequentially. This
selection may be used to limit commands to a particular surface.
- hn=<RANGE|LIST|NUMBER>: select on helix number;
selects all atoms in the specified helix/helices.
- sn=<RANGE|LIST|NUMBER>: select on sheet number;
selects all atoms in the specified sheet(s).
- tn=<RANGE|LIST|NUMBER>: select on turn number; selects
all atoms in the specified turn(s).
- nn=<RANGE|LIST|NUMBER>: select on annotation (note)
number; useful for setting the color of annotations.
- in=<RANGE|LIST|NUMBER>: select on interaction number;
useful for setting the color of interactions.
- element=<RANGE|LIST|NUMBER>: select on atom type
number; selects atoms with the specified atom type. The atom type is
the element number, unless the atom is a CA, in which case it gets
the special number 0.
The negation of the
helix, sheet and turn selections is not immediately obvious. Since the
hn=1 command selects all atoms in helix 1, should hn<>1 select
all atoms not in helix 1, or should it select all atoms in
helices which are not numbered 1? Spock takes the latter choice, since
the user is selecting on helix, or sheet or whatever. Therefore,
bc=1,hn<>3 will only color helices that aren't numbered 3, not all atoms
that aren't in helix 3. More information about helix and sheet commands
is given in §6.4.8 and 6.4.9.
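
The negation rule can be stated compactly in Python. This is a sketch of the behavior described above, using a hypothetical atom record in which hn is None for atoms that belong to no helix.

```python
def select_helix_not(atoms, number):
    """Model of hn<>number: atoms in some helix, but not in the numbered one."""
    return [a["an"] for a in atoms if a["hn"] is not None and a["hn"] != number]


atoms = [
    {"an": 1, "hn": 1},     # in helix 1
    {"an": 2, "hn": 3},     # in helix 3
    {"an": 3, "hn": None},  # not in any helix
]
not_helix_3 = select_helix_not(atoms, 3)
```

Only atom 1 is selected by hn<>3 in this model. Atom 3, which is in no helix at all, is left out even though it is also ``not in helix 3''.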
The color properties can also be used to select groups of atoms. Here it
is important to remember the ``Action, Selection'' order, because the same
string is used to select on atom color as to set the atom color.
- bc=<RANGE|LIST|NUMBER>: selects atoms with the given
bond color
- ac=<RANGE|LIST|NUMBER>: selects atoms with the given
atom color
- wc=<RANGE|LIST|NUMBER>: selects atoms with the given
worm color
- lc=<RANGE|LIST|NUMBER>: selects atoms with the given
label color
- vc=<RANGE|LIST|NUMBER>: selects atoms with the given
vertex (surface) color
You can, of course, specify a coloring action and a color-based selection
together. In fact, you may often want to do this to change objects colored
with one color to another color, as in the first example below.
- bc=4,bc=2: color bonds with color 4 for atoms whose
bonds are now colored 2
- list,ac=3: list all atoms which are colored with color
3
- lc=d,lc=9: now use the default color for all labels
that were colored with color 9.
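
The ``Action, Selection'' reading of a command such as bc=4,bc=2 can be sketched in Python as follows. The sketch assumes that the selection is evaluated against the current colors before the action is applied; the function recolor_bonds and the atom records are hypothetical.

```python
def recolor_bonds(atoms, new_color, old_color):
    """Model of bc=new,bc=old: select on the current bond color, then set it."""
    selected = [a for a in atoms if a["bc"] == old_color]  # selection (second field)
    for atom in selected:
        atom["bc"] = new_color                             # action (first field)
    return len(selected)


atoms = [{"an": 1, "bc": 2}, {"an": 2, "bc": 5}, {"an": 3, "bc": 2}]
changed = recolor_bonds(atoms, 4, 2)
```

After the call, atoms 1 and 3 have bond color 4 and atom 2 keeps color 5.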

Jon Christopher
Tue Sep 14 16:44:48 CDT 1999