
Introduction to RESOLVE



Why another density modification approach?

Although density modification (solvent flattening, non-crystallographic symmetry, phase extension, histogram matching, etc.) has been a very powerful tool, its potential is much greater than has been achieved so far. There are two reasons for this:


Problems with the phase recombination approach to density modification.

RESOLVE uses a statistical approach to density modification, whereas other methods modify a map to meet expectations and then recombine the new phases with the experimental phases. For the mathematical details, see the references for RESOLVE. You might also wish to see the discussion and extensions in Kevin Cowtan's article "Gaussian Likelihoods in real and reciprocal space" in the CCP4 newsletter.

Principal problems with the phase recombination method

What is the optimal relative weighting of modified and experimental phases? Incorrect relative weighting means that the final results will not be optimal.

Incorrect weighting terms mean that the final figures of merit are almost always inflated.

When do you stop iterating? In some approaches the maps initially get better, then get worse unless you stop.



 
 

A statistical approach to density modification

Density modification can be thought of as a way to adjust crystallographic phases (or amplitudes) to make them simultaneously consistent with the experimental data and with our expectations of what an electron density map should look like. The statistical approach is a mathematical way to formulate this statement. By using this formulation, the weighting factors and problems with convergence are taken care of automatically.

In RESOLVE, any set of structure factor amplitudes and phases has an associated probability composed of two simple parts:

Probability of a set of phases (and amplitudes)

The probability of the experimental phases: this is the probability that you would have observed your experimental data if this set of phases (and amplitudes) were correct.
The probability of the map: this is the probability that the electron density map calculated from this set of phases is drawn from the set of plausible electron density maps for this structure.

RESOLVE adjusts your crystallographic phases so as to maximize the total (posterior) probability of those phases. The mathematics is a little complicated but the idea is very simple. To see the mathematics in detail, have a look at T. C. Terwilliger (2000) "Maximum likelihood density modification," Acta Cryst. D56, 965-972.
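The idea of multiplying the two probabilities can be illustrated for a single reflection. The sketch below is not RESOLVE's algorithm: it assumes, purely for illustration, that both the experimental phase probability and the map probability are simple cosine-shaped (von Mises) functions of the phase, whereas RESOLVE derives the map term from density histograms and its derivatives. The function names and the kappa values are hypothetical.

```python
import cmath
import math


def combine_phase_probabilities(log_p_exp, log_p_map, n_steps=360):
    """Combine two log-probability functions of the phase (radians).

    Returns the centroid phase in degrees and the figure of merit, i.e.
    the length of the probability-weighted mean of exp(i*phi).
    """
    angles = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    # Posterior is the product of the two terms, so the logs add.
    log_post = [log_p_exp(a) + log_p_map(a) for a in angles]
    peak = max(log_post)  # subtract the peak to avoid overflow in exp()
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    centroid = sum(w * cmath.exp(1j * a) for w, a in zip(weights, angles)) / total
    best_phase = math.degrees(cmath.phase(centroid)) % 360.0
    return best_phase, abs(centroid)


def von_mises_log(center_deg, kappa):
    """Illustrative log-probability: kappa * cos(phi - center)."""
    center = math.radians(center_deg)
    return lambda phi: kappa * math.cos(phi - center)


# Assumed inputs: experimental phases favour 40 degrees (weakly),
# the map term favours 60 degrees (more strongly).
best, fom = combine_phase_probabilities(
    von_mises_log(40.0, 1.0), von_mises_log(60.0, 2.0)
)
print(f"centroid phase = {best:.1f} degrees, figure of merit = {fom:.2f}")
```

In this toy case the combined phase lies between the two preferred values, closer to the more sharply peaked term, and no separate weighting factor has to be chosen by hand.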

Note on terminology: The approach used by RESOLVE is now called "statistical density modification," a name suggested by Kevin Cowtan. It used to be called "maximum-likelihood density modification," using the term "likelihood" in a colloquial sense of probability. The old name (as pointed out by Gerard Bricogne and others) is confusing because the maximum-likelihood method is a specific technique with a specific definition of "likelihood" that is not used in this approach. Apologies to all for the confusion; the new name should make things clearer. The mathematics remains exactly the same.


Using all the available information for density modification

Density modification is usually thought of as a process that is carried out on an experimental electron density map prior to model building, but iterative model-building methods such as ARP/wARP can also be thought of as density modification techniques. With the statistical approach, partial model information can be seamlessly incorporated into the total expression for the probability of the phases. This allows a hierarchical approach to incorporating information about phase probability:

Types of information that can be used in statistical density modification

Experimental phases (if available)
Low-resolution structural information (solvent boundary)
Non-crystallographic symmetry
Partial model information (molecular replacement or model building)
Full atomic model information

The current version of RESOLVE can incorporate all of these types of information.


Carrying out density modification with RESOLVE

RESOLVE carries out density modification on several levels:

Each "mask cycle":

RESOLVE estimates the probability that each point in the map is within the protein or solvent region (a probabilistic "mask")
RESOLVE refines NCS symmetry operators, if present
RESOLVE then carries out one or more minor cycles:
Fitting of the histogram of density in the protein and solvent regions to model histograms (yielding beta = quality of this fit, and sigma = the overall error in the map)
Estimation of target density (a probability function) at each point based on these histograms for solvent and protein regions
Estimation of target density and uncertainty at each point from NCS or a model map, if present
Calculations of derivatives of map probability with respect to phases
Estimation of phase probability from experimental phase probabilities and the map probability function
RESOLVE carries out mask cycles (up to 5) until no further changes occur in the phases.
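The outer loop described above can be sketched as follows. This is only an outline of the control flow under stated assumptions: `mask_cycle` is a hypothetical callable standing in for everything RESOLVE does within one mask cycle, and the convergence test (largest phase change below a tolerance in degrees) is an assumed criterion, since the text says only that cycles continue until no further changes occur.

```python
def run_mask_cycles(phases, mask_cycle, max_mask_cycles=5, tolerance_deg=0.5):
    """Repeat mask cycles until the phases stop changing or the cap is hit.

    phases     : non-empty list of phases in degrees
    mask_cycle : hypothetical callable, takes phases and returns new phases
    Returns the final phases and the number of mask cycles carried out.
    """
    for cycle in range(1, max_mask_cycles + 1):
        new_phases = mask_cycle(phases)
        # Phase differences are wrapped into the range -180..180 degrees.
        largest_change = max(
            abs((new - old + 180.0) % 360.0 - 180.0)
            for new, old in zip(new_phases, phases)
        )
        phases = new_phases
        if largest_change < tolerance_deg:
            break
    return phases, cycle


# Toy stand-in for a mask cycle: move each phase halfway to a fixed target.
target = [30.0, 200.0]


def toy_cycle(current):
    return [p + 0.5 * (t - p) for p, t in zip(current, target)]


final_phases, n_cycles = run_mask_cycles([50.0, 180.0], toy_cycle)
print(n_cycles, final_phases)
```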

If NCS is present, then RESOLVE carries out an initial mask cycle, not including any NCS, to estimate uncertainties in density estimated from NCS copies.  Then RESOLVE carries out another initial mask cycle, using NCS but not solvent flattening, to estimate "sigma", the overall error in the map.

If "use_input_solv" is not set and "hklstart" is not specified, then RESOLVE uses the R factor to estimate the solvent content of the crystal. Solvent contents from 0.1 to 0.9 are tested, and the value leading to the minimum R is chosen. This optimal solvent content is written to the file "resolve.solvent". Note: if "use_input_solv" is specified, then RESOLVE assumes that the solvent content is already known. It reads the value from "solvent_content" if specified, or else from "resolve.solvent" if present; otherwise the default (0.40) is used.
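The order in which these sources are consulted can be sketched as a small function. It is an illustration only: the function and argument names are hypothetical, `r_factor_for` stands in for whatever RESOLVE does to obtain an R factor at a trial solvent content, the 0.1 step between trial values is an assumption (the text gives only the range), and the case where "hklstart" is specified is not modelled.

```python
def choose_solvent_content(use_input_solv, solvent_content=None,
                           saved_value=None, r_factor_for=None, default=0.40):
    """Pick a solvent fraction following the precedence described above.

    use_input_solv  : True if the user says the solvent content is known
    solvent_content : value given with the "solvent_content" keyword, if any
    saved_value     : value read from "resolve.solvent", if that file exists
    r_factor_for    : hypothetical callable returning R for a trial value
    """
    if use_input_solv:
        if solvent_content is not None:
            return solvent_content
        if saved_value is not None:
            return saved_value
        return default
    # Otherwise scan trial values 0.1 .. 0.9 (step of 0.1 is assumed here)
    # and keep the one giving the lowest R factor.
    candidates = [round(0.1 * k, 1) for k in range(1, 10)]
    return min(candidates, key=r_factor_for)


# Toy R-factor curve with its minimum near 0.52, for illustration only.
def toy_r(solvent):
    return 0.30 + (solvent - 0.52) ** 2


print(choose_solvent_content(False, r_factor_for=toy_r))
print(choose_solvent_content(True, saved_value=0.47))
```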

RESOLVE also uses the R-factor to identify which histogram of solvent densities and protein densities to use in density modification. The file "rho.list" in $SOLVEDIR/segments/ contains several histogram profiles, all based on model electron density maps. These are at resolutions from 1.2 A to 4 A.  RESOLVE carries out a test of each histogram initially and chooses the one leading to the lowest R factor. The histogram can be set using "database". The optimal database entry is written to "resolve.database".

RESOLVE estimates the optimal smoothing radius using a simple formula. For cycles where no density modification has occurred yet (normally the first cycle, unless "phases_from_resolve" has been set), the smoothing radius R is set with the equation: R = 2.41 (dmin)**0.9 (fom)**-0.26. For all other cycles (after density modification has begun), the smoothing radius is 4 A. These values can also be set with "wang_radius", "wang_radius_cycle", "wang_radius_start", or "wang_radius_finish".
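As a worked example of the smoothing-radius formula, the short function below evaluates it directly. The function name and the boolean flag are hypothetical; the constants are the ones quoted in the text, and the input values (dmin = 2.5 A, fom = 0.5) are chosen only for illustration.

```python
def smoothing_radius(dmin, fom, density_modified):
    """Smoothing radius in A, following the formula quoted above.

    dmin             : high-resolution limit of the data, in A
    fom              : overall figure of merit of the starting phases
    density_modified : True once density modification has begun
    """
    if density_modified:
        return 4.0
    return 2.41 * dmin ** 0.9 * fom ** -0.26


first_cycle = smoothing_radius(2.5, 0.5, density_modified=False)
print(f"first cycle: {first_cycle:.2f} A")   # about 6.58 A
print(f"later cycles: {smoothing_radius(2.5, 0.5, density_modified=True)} A")
```

Poorer starting phases (lower fom) or lower-resolution data (larger dmin) give a larger radius, so the initial mask is smoother when the map is less reliable.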

If "n_restore" is set by the user to be non-zero (default = 0), then after the phases have converged, the whole process is repeated again, starting with the original phases, but using the current probabilistic solvent mask.  This allows an optimized mask to be used in the "first" cycle of density modification.
 


Removing model bias with prime-and-switch phasing

Electron density maps obtained using phases calculated from atomic models often show peaks at the coordinates of atoms in the models, even when those atoms are incorrectly placed.  This effect can be reduced by careful weighting such as can be accomplished by Randy Read's SIGMAA approach, but it cannot be eliminated unless the phases are changed.

Prime-and-switch phasing is a way to remove model bias by using statistical density modification, but without including the phase information coming from the model once an initial map has been calculated.

The basic procedure is simple:


The initial biased phase information from the model is required to get the procedure going.  The final phases are essentially unbiased by the model because they are based on the features of the map, not on the prior phase probabilities.

The final phases are generally improved the most when:

There are some ways that prime-and-switch phasing can have residual bias:

There are some cases where prime-and-switch phasing does not yield a nice-looking map

NCS averaging in RESOLVE

Non-crystallographic symmetry is an important source of information about the probability of an electron density map. RESOLVE can begin with transformation matrices and an estimate of the center-of-mass of molecule 1 that you input. RESOLVE can also figure out the transformations and center-of-mass automatically from the NCS in heavy-atom sites in a PDB file (if the default file "ha.pdb" exists and you don't specify NCS transformations, RESOLVE will try to find the NCS in those sites).
 

RESOLVE uses NCS information in the following way (see Terwilliger, T. C. (2002) "Statistical density modification with non-crystallographic symmetry," Acta Cryst. D58, 2082-2086 and Terwilliger, T. C. (2002) "Rapid automatic NCS identification using heavy-atom substructures," Acta Cryst. D58, 2213-2215.)


Local pattern matching in RESOLVE

RESOLVE can use the local patterns of density in your electron density map in statistical density modification to improve crystallographic phases. The basic idea is that on a local level (within a sphere of radius 2 A) there are patterns of electron density that are associated with high density at the center of the pattern, and other patterns associated with low density at the center. RESOLVE goes through your electron density map, and at each point it compares the nearby density with a set of 20 templates (it does not use the density at the point of interest or right around it in this analysis). RESOLVE_PATTERN uses this analysis to come up with a new estimate of the density at each point in the map. This new estimate of density (the "image") has the remarkable property that errors in the image are almost uncorrelated with errors in the map used to create it. This means that phase information from the "image" can be combined with phase information from other sources in a simple way. You can see the details of all this in Terwilliger, T. C. (2003) "Statistical density modification using local pattern matching," Acta Cryst. D59, 1688-1701.

The resolve_build script below uses image-based phasing. Image-based phasing is the use of an electron density map that typically comes from an atomic model, from pattern-matching, or from NCS, along with observed values of FP, to estimate phases. The process results in phases and figures of merit similar to those obtained with Randy Read's SIGMAA, but the values come directly from map-probability phasing. The electron density map provided is used as a target for statistical density modification: crystallographic phases are found that, when combined with observed amplitudes, give a map that is as close as possible to the target map. The figures of merit reflect how precisely each phase can be determined using this approach. The phases from image-based phasing are not the same as those from an FC calculation and they are not always unimodal like FC, SIGMAA or Sim-weighted phases.


Fragment identification in RESOLVE

RESOLVE can carry out an FFT-based search for fragments of structure (currently helices and strands), refine the locations of these fragments, and use them in density modification even if a complete model cannot be built. The approach to finding fragments ("Maximum-likelihood density modification with pattern recognition of structural motifs", Terwilliger, T., Acta Cryst. D57, 1755-1762; 2001) is very similar to Kevin Cowtan's FFT-based search (Cowtan, K., Acta Cryst. D54, 750-756; 1998). A template consisting of averaged helical density (or strand density) is rotated over a range of orientations designed to cover most possibilities within about 20 degrees, and an FFT convolution is carried out for each orientation to find locations where the template and map match. The best matches are identified and the orientations and positions are refined. Then a pseudo-map is constructed consisting of the original templates, oriented based on the refined positions found in the search, and weighted by the local correlation coefficient. This pseudo-map is used as a source of phase information through map-probability phasing ("Map-likelihood phasing", Terwilliger, T., Acta Cryst. D57, 1763-1775; 2001). This approach is similar to the one described in the original publication (Terwilliger, 2001, cited above) but works much better than the original method.

Fragment identification is normally carried out right after model-building because the same FFT search can be used for both. The resolve_build script includes it.


Automated model-building and iterative model-building in RESOLVE

After the completion of density modification, RESOLVE builds a model of your structure. For versions 2.02 and higher, model-building needs sequence information from you. You specify a file with the keyword "seq_file" and RESOLVE expects a sequence of amino acids in 1-letter format. If there is more than one type of chain, RESOLVE expects them to be separated by a line containing ">>>". Typically RESOLVE can build 70-90% of the residues for a good map at 2-3 A resolution. You can tell if the model is correct by noting how good the match is to the sequence and by noting the NCS correspondence among chains (if NCS exists). The PDB file that RESOLVE writes out will contain the model, followed at the end by HETATM records for the heavy-atom sites from the SOLVE output file ha.pdb.
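To make the sequence-file layout concrete, the sketch below splits such a file into its chain types. It is not RESOLVE's own reader: the function name is hypothetical, the two short sequences are made up for illustration, and details such as how RESOLVE treats blank lines, whitespace, or lower-case letters are not described here, so the handling below is an assumption.

```python
def split_chain_types(seq_text):
    """Split 1-letter sequence text into chain types at ">>>" lines."""
    chains, current = [], []
    for line in seq_text.splitlines():
        stripped = line.strip()
        if ">>>" in stripped:
            # Separator line: close the chain collected so far.
            if current:
                chains.append("".join(current))
                current = []
        elif stripped:
            current.append(stripped.upper())
    if current:
        chains.append("".join(current))
    return chains


# Made-up example with two chain types; the first spans two lines.
example = """MKTAYIAKQR
QISFVKSHFS
>>>
GSHMLEDPVD
"""
chain_types = split_chain_types(example)
print(chain_types)
```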

You can read all the details about RESOLVE automated model-building in Terwilliger, T. C. (2002) "Automated main-chain model-building by template-matching and iterative fragment extension," Acta Cryst. D59, 34-44 and Terwilliger, T. C. (2002) "Automated side-chain model-building and sequence assignment by template-matching," Acta Cryst. D59, 45-49.

RESOLVE now has superquick model building! The standard RESOLVE model-building for version 2.05 and higher is about 3 times faster than earlier versions. This is made possible by a more selective choice of which fragments to consider extending (no need to work on a fragment that covers a region that is already built). Versions 2.05 and higher also have the option of "superquick_build", which is about 10 times faster than previous versions of RESOLVE model-building. For a very good map (one where RESOLVE can build >80% of the model) superquick_build typically gives almost the same model as the standard build. For a moderate-quality map, the standard build or even the "thorough_build" may give up to 10% more model built.

RESOLVE versions 2.05 and higher include cycles of model-building in which the thresholds for fit of the model to the map are sequentially lowered. This allows much more of the model to be built, while keeping the accuracy of most of the model high. You can use "aggressive_build" to try to build as much as possible, or "conservative_build" to build only the best parts.

RESOLVE versions 2.06 and higher include the capability of identifying fragments (helices and strands) in a map and including them in density modification.

RESOLVE builds a model in the following way.

RESOLVE (versions 2.06 and higher) can carry out pattern identification, fragment identification, density modification, and iterative model-building and refinement in combination with refmac5 (versions 5.1.24 and higher only!).

RESOLVE (versions 2.03 and higher) can also carry out iterative model-rebuilding. This is like model-building except that you start with just a model of some kind and measured amplitudes, and RESOLVE does everything from there. This works much more slowly than model-building with experimental phases.

RESOLVE (versions 2.06 and higher) can automatically evaluate a model, given a set of amplitudes FP (and phases PHIB and FOM if available). First RESOLVE will rebuild the model (to reduce any bias due to refinement). Then RESOLVE will calculate a prime-and-switch composite omit map (as used in rebuilding) based on the rebuilt model and any phase information you give it. Then RESOLVE will compare the original model to this map and summarize the fit for you.