MOSFLM 6.0 User Guide

This describes MOSFLM version 6 for processing image plate and CCD data

(29 June 1999)

Andrew G.W. Leslie
MRC Laboratory of Molecular Biology
Hills Road,
Cambridge CB2 2QH
UK

E-mail: andrew@mrc-lmb.cam.ac.uk
Tel (+44) (0) 1223-248011

Any constructive comments on this User Guide would be very welcome.

Index

Major Changes

Help Library

Important notes

1: Overview

1.1 Programs covered in this guide 1.2 Input and Output files 1.3 Allowed detector types 1.3.1 Using the DETECTOR keyword 1.3.2 Using the SITE keyword 1.4 Inspection of images

2: A Quick Guide

2.1 Startup keywords 2.2 Autoindexing 2.3 Estimating mosaic spread 2.4 Running the Strategy option 2.5 Determining oscillation angles with the TESTGEN option 2.6 Integrating the first image to determine if the exposure time is OK 2.7 Interpreting those "WARNING" messages 2.8 Getting accurate cell parameters 2.9 Integrating a block of images 2.10 Integrating the dataset

3: Determination of crystal orientation, cell parameters and spacegroup

3.1 Autoindexing Interactively 3.2 Autoindexing when running the program in background

4: Running the STRATEGY and TESTGEN options

4.1 Overview of the STRATEGY option 4.2 Some Examples of the STRATEGY options 4.3 Determining the oscillation angle for each image (TESTGEN option)

5: Determining Accurate Cell parameters

5.1 Using Post-refinement to refine the cell

6: Collecting data and processing the images

6.1 Overview 6.2 Special MOSFLM features 6.2.1 Accumulating profiles over several images 6.2.2 Addition of partials (ADDPART) 6.2.3 Post-refinement of orientation and cell parameters 6.2.4 Optimisation of measurement box parameters 6.3 Running a processing job 6.3.1 Running MOSFLM interactively 6.3.2 Processing the first block of data (Non-interactively) 6.3.3 Finally, Processing the dataset

7: Interpreting the output

7.1 The log files 7.2 The summary file 7.3 Checking the quality of the data

8: General tips.

8.1 Estimating the GAIN of a detector 8.2 Processing images with no (or very few) fully recorded reflections 8.3 Processing images when the spots are not fully resolved 8.4 Processing data from other detectors, or standard detectors with different rotation axis orientation.

9: Example command files

9.1 Autoindexing an initial image (interactively) 9.2 Determining an accurate cell 9.3 Integrating a series of images

Appendix I

Changes in MOSFLM

Appendix II

Setting the measurement box parameters manually

Appendix III

Overview of the MOSFLM program

Appendix IV

Definition of coordinate systems

MAJOR CHANGES

Changes since 6.00

Cell refinement added to the new DPS autoindexing, plus refinement of direct beam position. Manual spot-finding option. Option to write multiple MTZ files (if processing while collecting data). Fit circles option. Bug fix for non-zero two-theta indexing. Many small bug fixes.

Changes since 5.51

New FFT based autoindexing algorithm from DPS added (Steller, Bolotovsky and Rossmann (1997) J. Appl. Cryst. 30, 1036-1040). New detector types LIPS (large image plate scanner at ESRF) and SBC1 (Westbrook detector at APS) added.

Changes since 5.50

New detector type MARCCD for 135mm circular CCD detector from Mar Research. Allows partials to lie on up to 100 images (previous limit was 10). New keyword TEMPLATE for more general definition of image filenames. Generalise direction of two-theta axis (previously had to be parallel to fast changing direction in image). More changes to STRATEGY algorithm. Standardise FORTRAN so that code will compile under Linux (this needs a new version of the autoindexing code...post 5/5/98).

Changes since 5.40

Improved strategy algorithms. New detector types (Mar345, ADSC CCD). A host of minor bug fixes, and changes to make the program easier to run.

Changes since 5.30

The major change in version 5.40 is that the code for the spot-finding program (IMSTILLS) and autoindexing (REFIX) has been incorporated into MOSFLM. A new menu for the X-window interface has been introduced, which allows the user to find spots, autoindex images, run the strategy option, refine cell parameters (using post-refinement) and integrate images interactively. (All these features are of course still available when running a background job.) The new menu is invoked by using the "IMAGE" keyword to read in an initial image. The data collection strategy option has been available since version 5.30, but has been improved in the current version, particularly by the option to speed up the calculation by any desired factor and optimise anomalous data. Additional image formats have been added. The program can now handle images from Mar Research, Raxis (II or IV), Mac Science, Molecular Dynamics, Fuji and ESRF CCD detectors.

The following keywords are no longer necessary (but can still be given to override program defaults) :

RASTER
SEPARATION
GENFILE
HKLOUT
PIXEL

and for Mar Research, ADSC, Mac Science and RaxisIV detectors:

WAVELENGTH
DISTANCE

and oscillation angles.

(Note however that the header information is not always correct for Mar detectors at synchrotron sites, because the software controlling the spindle axis (and/or distance) does not communicate with the Mar software controlling the detector. Check this with the station manager.)

See Appendix I for a more detailed list of major changes from earlier versions.


Help Library

This User Guide is not exhaustive in describing all options available. However, all possible keywords are described in the help library file (mosflm.hlp) which is part of the distribution. The help library is an ASCII file, and can therefore be read with an editor, or invoked by typing "HELP" at the mosflm prompt (==>) when running the program interactively. Note that the environment variable "CCP4_HELPDIR" should point to the directory containing the help library. The bugs that prevented the online help working correctly have now been fixed.


Important notes

The source code as distributed contains the code for the FFT based autoindexing but NOT for REFIX. I can distribute the REFIX code by E-mail to academic institutions only, or to those who already have the Mar XDS software. Please send an E-mail to the address given above to get the REFIX autoindexing code.



1: Overview

1.1 Programs covered in this guide 1.2 Input and Output files 1.3 Allowed detector types 1.3.1 Using the DETECTOR keyword 1.3.2 Using the SITE keyword 1.4 Inspection of images 1.5 Example input

1.1 Programs covered in this guide

Data processing falls naturally into three sections:

1) Determining the crystal orientation, cell parameters and possible space group.
2) Generating the reflection lists and integrating the images.
3) Scaling and merging the resulting data.

These notes will be restricted to topics (1) and (2), which are now both present in the MOSFLM program alone. The CCP4 program SCALA is strongly recommended for the third step (scaling and merging).

1.2 Input and Output files

There are several input and output files and it is crucial that the output files are given unique filenames when two (or more) processing jobs are being run from the same directory, or the results are very unpredictable!

Input files

1) The image file
2) [The file containing the crystal orientation matrices]

NAMING CONVENTION FOR IMAGES

It is assumed that the images conform to a naming convention where the image name is made up of three parts, a template, a three digit number and an extension. The template can be up to 40 characters long, and should be separated from the three digit number by a hyphen (-) or an underscore (_). The extension can be up to 8 characters long and should be separated from the three digit number by a period (.). Note that the template can contain underscores or hyphens.

Examples of valid image filenames are:

lysozyme_cryst1_021.image
catx1_001.img
f1_tray42_wellb6_001.osc

ALTERNATIVE NAMING CONVENTION

If the image filenames do not conform to the specification given above, the TEMPLATE keyword can be used to define a very general format for the image filenames. If the TEMPLATE keyword is to be used, it MUST precede the IMAGE keyword in the input, and the image NUMBER, not the filename, should be given. See TEMPLATE in the help library for more information.

eg

TEMPLATE fred_###
IMAGE 23 PHI 22 TO 23
will read the file "fred_023" (no filename extension).
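
The two naming schemes above can be illustrated with a short Python sketch. This is not MOSFLM code: the function names and the regular expression are hypothetical, and the zero-padding rule (one digit per "#") is an assumption inferred from the fred_### example.

```python
import re

# Default convention: template, '-' or '_', three digits, '.', extension.
# The length limits (40 and 8 characters) follow the text above.
IMAGE_NAME = re.compile(
    r"^(?P<template>.{1,40})[-_](?P<number>\d{3})\.(?P<ext>[^.]{1,8})$"
)


def parse_image_name(filename):
    """Split a filename into (template, image number, extension), or return None."""
    match = IMAGE_NAME.match(filename)
    if match is None:
        return None
    return match.group("template"), int(match.group("number")), match.group("ext")


def expand_template(template, image_number):
    """Replace the first run of '#' with the zero-padded image number (assumed rule)."""
    return re.sub(
        r"#+",
        lambda m: str(image_number).zfill(len(m.group())),
        template,
        count=1,
    )
```

A name such as "fred_023" has no extension, so parse_image_name returns None for it; that is the kind of case the TEMPLATE keyword is intended to cover.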

Output Files

1) The output MTZ file containing the integrated intensities. Set with keyword HKLOUT or on the command line using HKLOUT. When not given, a default filename made up from the crystal identifier (or TEMPLATE) and the first image number is used. It is now possible to specify that multiple MTZ files are written during one integration run (subkeyword MULTIPLE on HKLOUT keyword). In this case, a separate MTZ file will be written for each block of images processed (see BLOCK subkeyword of PROCESS keyword). The filenames will be distinguished by _001, _002, _003 etc being appended to the specified (or default) filename, eg lysx1_1to50_001.mtz, lysx1_1to50_002.mtz etc. This allows some of the data to be scaled and merged before the data processing has finished. (Binary)

2) If autoindexing or refining cell parameters a file containing the refined crystal orientation matrices is written. Filename set with keyword NEWMAT, defaults to NEWMAT. (Ascii)

3) The summary file. Contains a summary of processing results. Can be assigned on the command line (SUMMARY), defaults to SUMMARY. This file can be input to xloggraph for graphical representation. (Ascii)

4) When running interactively, all output written to the terminal window is also written to the file "mosflm.lp". (Ascii)
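
The numbering of multiple MTZ files described in item 1 can be sketched in Python as follows. The function name is hypothetical, and placing the suffix before the ".mtz" extension is an assumption based on the example filenames lysx1_1to50_001.mtz and lysx1_1to50_002.mtz.

```python
import os


def multiple_mtz_names(hklout, n_blocks):
    """Return one filename per block, with _001, _002, ... before the extension."""
    stem, ext = os.path.splitext(hklout)
    return [f"{stem}_{block:03d}{ext}" for block in range(1, n_blocks + 1)]
```

Listing the expected names in advance makes it easier to check that two jobs run from the same directory will not write to the same output files.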

Temporary Files

1) The "Generate" file. Assigned with keyword GENFILE, defaults to the same as the MTZ filename but with the extension ".gen" instead of ".mtz". (Binary)

2) The measurement boxes file. Assigned on the command line using SPOTOD, defaults to SPOTOD. This file can be very large, and should normally be assigned to a scratch disk and deleted as part of the command procedure.

3) A reflection coordinate list, assigned using COORDS on the command line. This is only produced when the "SEPARATION CLOSE" option is being used to process images with very closely separated spots.

Example of a command line:

ipmosflm HKLOUT lyso_srs.mtz  SUMMARY lyso_srs.sum \
                              SPOTOD /scr0/andrew/lys.spotod \
                              COORDS /scr0/andrew/lys.coords  << eof-ipmos
GENFILE /scr0/andrew/lys.gen 
NEWMAT lyso_srs.mat
....
....
eof-ipmos

1.3 Allowed detector types

The type of detector is specified either by the DETECTOR keyword, or by a SITE keyword, with the latter generally being used for synchrotron sites that use detectors that are not commercially available. At present, the following detectors are allowed.

1.3.1 Using the DETECTOR keyword

Note that no special input is required to distinguish between the different types of Mar Research image plate scanner (18,30 or 34.5cm (Mar345)). The image size is read from the header record and the appropriate limits and pixel size are set up automatically.

Both unpacked and packed image formats are supported for the Mar345 scanners (no DETECTOR keyword required to distinguish these).

For Mar, Raxis and Mac Science scanners it is not necessary to specify the size of the image, as it is determined from the image header.

For offline scanners (FUJI and MD) it will also be necessary to define the orientation of the image relative to the X-ray beam and rotation axis, also using the DETECTOR keyword. See the help library (Subsection Novel detectors of DETECTOR) for details on how to do this.

If the rotation axis is reversed (usually a peculiarity of synchrotron sites) this can be dealt with by specifying DETECTOR REVERSEPHI. Again, see the help library.

1.3.2 Using the SITE keyword

For CHESS, the station (A1, F1 or F2) and the detector must be specified. Possible detectors are the Gruner CCD detector working in 1K, 2K or 2K binned modes, the ADSC single module CCD detector (ADSC), the ADSC 2x2 CCD detector (QUANTUM4) and FUJI image plates. eg

SITE CHESS [A1 F1 F2] [FUJI [CCD [1K 2K 2KBINNED ADSC QUANTUM4]]]

For SSRL and ALS, the 2x2 ADSC detector is allowed: eg

SITE SSRL ADSC,  SITE ALS ADSC

1.4 Inspection of images, invoking the new menu

It cannot be emphasised strongly enough that images must be examined closely to check the following:

1) Does the crystal diffract ?

2) What is the effective resolution limit ? Should the detector be moved further back to take advantage of the full active area of the detector ?

3) Is the crystal twinned, split, disordered etc ?

4) Is the exposure time long enough ?

Images can be displayed using the IMAGE keyword followed by the full filename of the image (including the directory if the image is not in the current directory). The only other keyword required specifies the type of detector (default is Mar Research image plate scanners). This will bring up the new menu interface which allows autoindexing, integration etc.

Note that MOSFLM displays the image viewed from the detector looking towards the source (cameraman's view), and also that the "fast changing" direction in the image is ALWAYS vertical in the display, regardless of whether it is vertical or horizontal in the actual detector. Thus some images will be rotated by 90 degrees.

Example:

IMAGE /fred/images/lysox1_001.image
GO

or

DETECTOR RAXISIV
IMAGE /fred/images/lysox1_001.image
GO

In order to measure the resolution of individual spots on the images or display the resolution circles, the wavelength and crystal to detector distance need to be given. For Mar Research, ADSC, RaxisIV and Mac Science detectors the wavelength and distance will automatically be taken from the header records in the image file; for other types of image the DISTANCE and WAVELENGTH keywords should be given, or the values set interactively using the X-windows interface.

Parameters that may need to be defined (and the appropriate keywords) are:

1) crystal to detector distance (DISTANCE)
2) Wavelength (WAVE)
3) Direct beam coordinates (BEAM)

EXAMPLE INPUT

NOTE: Input within square brackets is optional for Mar, ADSC, RaxisIV and Mac Science images. If a PHI keyword is given, this will override the phi values in the image header, and phi values in the header will be ignored for any subsequent image that is read in. The default DETECTOR type is MAR, so this need not be given for Mar images.

IMAGE catx1_001.img [PHI 0.0 TO 1.0]
BEAM 149.5 151.0
DETECTOR MAR (or SMALLMAR, MARCCD, ADSC, RAXIS, RAXISIV, SBC1, DIP2000,
             DIP2030, ESRF CCD, FUJI, MD)

[NEWMAT test_001.mat]   ! Defines the name of the file in which the results
                        ! of autoindexing or postrefinement will be written.
[WAVELENGTH 1.542]
[DISTANCE 250.0]
GO

This will invoke the X-window display, and a Menu list as shown below:

Read image        Read in another image.
Find spots        Find spots on the current (displayed) image.
Edit spots        Allows manual rejection of spots.
Save spots        Writes all spots to a file for use with old REFIX.
Clear spots       Deletes spots from display or list of stored spots.
Select images     If spots have been found on several images, allows
                  selection of images to be used in autoindexing.
Autoindex         Invokes autoindexing (DPS or REFIX).
Predict           Predicts spot pattern.
Clear prediction  Deletes predicted pattern from display.
Adjust            Adjust the fit between observed and predicted patterns.
Refine cell       Invokes a POSTREF SEGMENT run to refine cell parameters.
Integrate         Allows integration of images.
Strategy          Run the strategy option.
Keyword input     Allows keyworded input.
Find hkl          Allows a specified reflection to be identified.
Pick              Display pixel values
Measure cell      Measure cell parameters.
Circles           Display resolution circles.
Fit circles       Allows fitting to circles in image.

Exit              Close down X-windows display.

The various options in this menu list are discussed in the sections below. Note that there are some on/off and yes/no toggle boxes at the bottom of the "Processing parameters" window. These are described below:

Prompts              On/Off (default on)

When "prompts" is "on", additional information is given when some of the menu options are chosen. For experienced users, this additional information can be suppressed by turning the prompts "off".

Update display:
After refinement     No/Yes
After integration    No/Yes

By default, the display is updated each time a new image is read, and at no other time. By setting the "After refinement" toggle to Yes, the display will be updated after refinement of the detector parameters, so that it is possible to check how well the predicted pattern matches the image. If the "After integration" toggle is set to Yes, each image will be displayed after it has been integrated, with "Bad spots" indicated and residual vectors (between observed and predicted spot positions) for fully recorded spots also shown. It is possible to reject additional reflections, or reclassify Bad spots, at this point.

Note that because images are integrated in "Blocks", during the actual integration of all images in a block the image that is displayed will be that of the last image in the block, unless the "After integration" toggle has been set to Yes.

Timeout mode:        Off/On

If the Timeout mode is set to "On" during an Integration or Refine Cell run, then when each image is displayed the program will wait for 2 seconds for the user to select a menu option (it is best to start by turning the Timeout mode off if you want to do this). After this period (which can be changed with keyword "TIMEOUT") the program will just carry on. With the timeout mode "On" it is therefore possible to integrate a series of images without directly interacting with the program. This can be very useful if one just wants to keep an eye on the processing but does not want to keep hitting the "Continue" menu option.


2: A Quick Guide

2.1 Startup keywords 2.2 Autoindexing 2.3 Estimating mosaic spread 2.4 Running the Strategy option 2.5 Determining oscillation angles with the TESTGEN option 2.6 Integrating the first image to determine if the exposure time is OK 2.7 Interpreting those "WARNING" messages 2.8 Getting accurate cell parameters 2.9 Integrating a block of images 2.10 Integrating the dataset

This is a brief guide on how to process data using the new menu options. For more details on each step in the procedure, see sections 3-6.

2.1 Startup keywords

Use the following keywords to bring up the image display and menu. These should be given at the MOSFLM => prompt:

=> TITLE My lysozyme data		! This title is transferred to the
					! MTZ file

=> IMAGE lyso_001.image [PHI 0 TO 1]	! Filename of first image. For Mar,
					! ADSC, Mac Science and RaxisIV images
					! the phi values will be taken from the
					! image header if not given here.
                                        ! If phi values are specified here,
                                        ! the values in the header will be 
                                        ! ignored for this and all subsequent
                                        ! images read in.

=> BEAM 150.0 149.0			! Direct beam coordinates


    If not a Mar Research IP scanner:

=> DETECTOR RAXISIV			! or RAXIS (for RAXIS II) or DIP2000 etc


    If not processing Mar Research, ADSC, R-axis or Mac Science images:

=> WAVE 0.91				! For Mar, ADSC, RaxisIV and Mac Science
=> DISTANCE 300			! this information is taken from the header
					! but can be overwritten using the 
					! keywords.

=> SYMM p43212			! If known, give cell and symmetry
=> CELL 79 79 38			! otherwise omit completely.


    Not essential for first stages, but needed for integration:

=> DIVERGENCE 0.1 0.03			! If isotropic, the beam divergence
					! can be included in the mosaic spread.
=> SYNCHROTRON POLARIZATION 0.9		! Defaults to 0.86 (SRS, Daresbury UK)
=> GAIN 1.7				! See section 8.1 for a way to
					! estimate the gain if not known.
=> GO

At this point, the image will be displayed with a list of "Processing parameters" on the far left (these can be changed by the user), a "Main menu" and beneath the Main menu a table of "Output" parameters.

2.2 Autoindexing

Select the menu option "Autoindex". The program will locate and display spots on the image. Parameters governing the spot finding are listed under *SPOT SEARCH* in the Processing Parameters table, but the program automatically sets suitable values for these parameters and they will not normally have to be changed. The image will then be autoindexed. The user will be prompted to supply a filename for the output orientation matrix. All queries have a default, which can be selected by simply entering carriage return.

When using the new DPS indexing, if a spacegroup and cell have been given, the cell parameters determined by the autoindexing will be permuted to best match the input values, but the user must still select the solution from the list provided. If using REFIX, the same information will force the image to be autoindexed with this cell and no alternatives will be listed. If no spacegroup information has been given, for both algorithms the user will be presented with a list of choices, sorted on a "PENALTY" parameter (the lower the PENALTY the better). The user must select a spacegroup, and the cell is refined imposing that symmetry.

The success of the autoindexing can be checked by predicting the spots for the current image using the menu option "Predict". If not successful, try adjusting the intensity threshold "Min I/sig(I)" or the maximum cell length (for the FFT based algorithm) or read in another image (Read image menu option), find spots on it (Find spots) and repeat autoindexing (Autoindex). Spots from a satellite crystal can be removed using the Edit spots option.

2.3 Estimate mosaic spread

An approximate estimate of the mosaic spread should be obtained by predicting with different values of mosaic spread (set in Processing parameters) and seeing what value gives the best fit to the observed pattern.

2.4 Run Strategy option

Select the "Strategy" menu option. Input for this option has to be given at the MOSFLM => prompt in the terminal window.

MOSFLM => STRATEGY
MOSFLM => GO

This will generate a reflection list, a unique reflections list, merge them and tell you what rotation range to use to get a maximally complete dataset.

If you then want to reduce the total rotation range (to save time) and still get a maximally complete dataset type the following at the STRATEGY => prompt:

STRATEGY => ROTATE 60 SEGMENTS 2
STRATEGY => GO

This instructs the program to find two 30 degree segments that give maximum completeness. You can try 3 segments (of 20 degrees) if you like, but this rarely (in my experience) gives significantly greater completeness and will take significantly longer. (Also don't forget that the more segments you have, the more unmatched partials you will get.)

For orthorhombic space groups, you should also try STRATEGY ALTERNATE if the predicted completeness is not as high as expected.

2.5 Determine oscillation angles with the TESTGEN option

Having determined what rotation range needs to be collected, you can check what the (maximum) rotation angle is to avoid getting (too many) spatial overlaps on the images. Remember you must have realistic estimates of the mosaic spread and minimum spot separation for this to be meaningful. Also remember that for post-refinement to work, the oscillation angle must be more than half the sum of the mosaic spread and beam divergence.
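
The post-refinement condition stated above can be written as a small Python check. This is only an illustration: the function names are hypothetical, all angles are assumed to be in degrees, and the beam divergence is treated as a single value.

```python
def min_oscillation_for_postrefinement(mosaic_spread, divergence):
    """Lower bound on the oscillation angle: half of (mosaic spread + divergence)."""
    return 0.5 * (mosaic_spread + divergence)


def oscillation_allows_postrefinement(oscillation, mosaic_spread, divergence):
    """True if the oscillation angle is strictly above the lower bound."""
    return oscillation > min_oscillation_for_postrefinement(mosaic_spread, divergence)
```

For example, a mosaic spread of 0.4 degrees and a divergence of 0.1 degrees require an oscillation angle greater than 0.25 degrees.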

At the STRATEGY => prompt type:

STRATEGY => TESTGEN

This will describe the possible keywords. If your data collection was in two segments of -15 to 15 degrees and 45 to 75 degrees and you want no overlaps, type:

STRATEGY => TESTGEN START -15 END 15
STRATEGY => GO

and the program will calculate the MAXIMUM possible rotation angles for this range, at intervals of 5 degrees.

then type:

STRATEGY => TESTGEN START 45 END 75
STRATEGY => GO

for the second segment.

2.6 Integrate the first image to determine if the exposure time is OK

The best way to get an indication of the data quality is to integrate the first image and see how the mean <I>/sigma<I> varies with resolution. These values are always slightly optimistic, so you should aim to have a ratio of at least 3.0 in the outermost resolution bin. If it is lower than this, consider collecting data to a lower resolution (and moving the detector further back) or using a longer exposure time.

To get back to the Menu, type EXIT at the STRATEGY prompt:

STRATEGY => EXIT

Then select the "Integrate" menu option, and answer the questions (most can be answered by entering carriage return). However, now is the time to set the centre and radius of the backstop shadow, using the BACKSTOP keyword, eg:

BACKSTOP CENTRE 88 90 RADIUS 12

An alternative way of dealing with backstop shadows (particularly for an extended backstop) is to use the NULLPIX keyword. This is used to define a minimum pixel value, and any spot that has a pixel within its measurement box with a value lower than this minimum will be rejected.

The program will put up the predicted pattern and then wait for input (unless the timeout mode is set). If the pattern is a good fit, choose menu option "Continue". If the pattern is not aligned with the spots, choose the option "Adjust" and follow the instructions to align the pattern with the spots. (The usual reason for poor alignment is an error in the direct beam coordinates).

The image will then be integrated. Check the <I>/sigma<I> values, and the Rsym if you have symmetry related fully recorded reflections on this image. The value of SDRATIO is a better guide than the actual Rsym, as the latter will depend on the intensity of the reflections. The SDRATIO should lie in the range 1 to 3.
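
The two rules of thumb given in this section (a ratio of at least 3.0 in the outermost resolution bin, and an SDRATIO between 1 and 3) can be expressed as simple Python checks. The function names are hypothetical, and the values must be read by hand from the MOSFLM output; nothing here parses the log file.

```python
def exposure_looks_adequate(outer_bin_i_over_sigma, minimum=3.0):
    """True if the mean I/sigma in the outermost resolution bin reaches the target."""
    return outer_bin_i_over_sigma >= minimum


def sdratio_in_expected_range(sdratio, low=1.0, high=3.0):
    """True if SDRATIO lies in the range quoted in the text (1 to 3)."""
    return low <= sdratio <= high
```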

2.7 Interpreting those "WARNING" messages

After the integration, the program will usually print a list of "WARNING" messages. Don't worry about messages about the standard profiles at this stage, or large positional residuals (because the cell has not yet been accurately determined). However, if the "OVERALL BACKGROUND RATIO (BGRATIO)" message is present, this suggests the detector GAIN may be wrong, and the input value needs to be multiplied by the square of the value of the BGRATIO. Beware, however, that images showing diffuse scatter will give a high BGRATIO even when the GAIN is correct (eg up to 1.5).
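
The GAIN correction described above is a single multiplication, sketched here in Python. The function name is hypothetical; the formula (input GAIN times BGRATIO squared) is taken directly from the text.

```python
def corrected_gain(input_gain, bgratio):
    """Return the input GAIN multiplied by the square of BGRATIO."""
    return input_gain * bgratio ** 2
```

For example, an input GAIN of 1.7 with a reported BGRATIO of 1.2 suggests a new GAIN of about 2.45, provided the high BGRATIO is not simply due to diffuse scatter.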

2.8 Getting accurate cell parameters

Accurate cell parameters are essential for optimal integration of the images. MOSFLM uses a post-refinement procedure to determine accurate cell parameters. For trigonal or higher symmetry, an accurate cell can usually be determined from a single "wedge" of data (typically 3-5 degrees), unless the unique axis is approximately along the X-ray beam direction in which case either a different phi value should be used or two segments of data. For orthorhombic or lower symmetry two "wedges" of data widely separated in phi will give the best results. In the latter case, one can either wait until a large rotation range has been collected before refining the cell, or one can start by collecting a few degrees at (say) phi 85 to 90, then start collecting from phi = 0.

When the appropriate data (images) are available, select the "Refine cell" menu option. Answer the queries. Although the default number of images to use is 2 (in each wedge), this is in fact the minimum number and better results will often be obtained by using 3 or 4 images in each wedge.

It is important to have a realistic estimate of the mosaic spread before refining the cell.

Post-refinement yields very accurate cell parameters, but has a relatively small radius of convergence. If the shift in cell parameters exceeds 2.5 sigma, the integration of the images and the actual refinement will be repeated. This will happen up to a maximum of 5 times. It is not unusual for 2 or 3 complete rounds to be required if the initial cell parameter estimates came from auto-indexing a single image.
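
The repeat logic described above can be sketched as a loop in Python. This is not the MOSFLM implementation: run_cycle is a hypothetical callable standing in for one round of integration plus refinement, and it is assumed to return the largest cell parameter shift in units of sigma. Treating the limit of 5 as the total number of rounds is also an assumption.

```python
def refine_until_converged(run_cycle, max_cycles=5, sigma_limit=2.5):
    """Repeat integration and refinement until the cell shift is small enough.

    Returns (number of cycles run, whether the shift fell within the limit).
    """
    for cycle in range(1, max_cycles + 1):
        shift_in_sigma = run_cycle()
        if shift_in_sigma <= sigma_limit:
            return cycle, True
    return max_cycles, False
```

With shifts of 6.0, 3.1 and 0.8 sigma in successive rounds, the loop stops after the third round, which matches the remark that 2 or 3 complete rounds are not unusual.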

2.9 Integrating a block of images

The next step is normally to integrate a block of between 5 and 10 images. Use the "Integrate" menu option as before. In this case, pay particular attention to the list of warning messages to see if any parameters or options need to be reset. It is also a good idea to check the appearance of the standard profiles (these are output to the terminal window but also to the file "mosflm.lp"). Make sure that adjacent spots are being adequately resolved, and that the peak is not spilling into those pixels marked as background. The PROFILE TOLERANCE parameters are crucial in determining the appearance of the standard profiles. Also try to ensure that NO profiles are being averaged. If necessary, change the minimum rms variation in the background (PROFILE RMSBG) or the number of different profiles (by defining PROFILE XLINES and PROFILE YLINES) to avoid profile averaging. Check that there are not too many reflections being rejected as "BAD SPOTS". If a significant number of strong reflections are being rejected for "Poor profile fit" even though the cell has been accurately determined and the GAIN is correct, consider increasing the rejection value (REJECTION PKRATIO) from its default of 3.5 to 4.0. It should NOT be necessary to increase this above 4.0.

2.10 Integrating the dataset

Once a block of images has been successfully integrated, the complete dataset can be integrated. If data processing is started before data collection is complete, use the WAIT keyword to make the program wait for an image to be completed before it tries to integrate it.

eg WAIT 15 for 15 minute exposures.

You may also wish to specify multiple MTZ files (one for each "block" of images) so that some data can be scaled and merged in SCALA before data collection/processing has completely finished.

Set the "Timeout mode" toggle (at the bottom of the "Processing parameters" window) so that the program will automatically continue 5 seconds after displaying each new image.

Set the "Prompts" toggle to Off.



3: Determination of crystal orientation, cell parameters and spacegroup

3.1 Autoindexing Interactively 3.2 Autoindexing when running the program in background

The crystal orientation, cell parameters and possible spacegroups are normally determined from a single rotation image (although 2, 3 or more can be used also). This will typically be a rotation of between 0.5 and 2 degrees, the value being chosen to avoid generating a significant number of spatially overlapped reflections. Autoindexing can be performed interactively or in batch mode, but spacegroup selection can ONLY be done interactively as the user is required to select a cell and spacegroup from a number of possibilities.

3.1 Autoindexing Interactively

3.1.1 Finding Spots

The first step in autoindexing an image is to locate the positions of diffraction spots. This can be done with the "Find spots" menu option, but if only one image is to be used for autoindexing one can go straight to the "Autoindex" menu option.

3.1.1.1 Parameters used in Finding Spots

Parameters associated with spotfinding are listed in the "Processing parameters" window:

Threshold
Rmin
Rmax
X offset
Y offset
Min X size
Max X size
Min Y size
Max Y size
Min no of pix
X splitting
Y splitting

All of these parameters have "sensible" defaults and normally they do not need to be changed.

Pixels are considered to be part of a spot if the pixel value is more than (Threshold*sigma) above the local background at that radius. The threshold is determined automatically by the program and will normally be appropriate.
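
The pixel test described above can be written as a one-line comparison, shown here in Python. The function and argument names are hypothetical, and how MOSFLM estimates the local background and sigma is not covered here; both are simply passed in.

```python
def is_spot_pixel(pixel_value, local_background, sigma, threshold):
    """True if the pixel is more than threshold*sigma above the local background."""
    return pixel_value > local_background + threshold * sigma
```

With a background of 100, a sigma of 10 and a Threshold of 4, a pixel must exceed 140 counts to be treated as part of a spot.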

The program searches for spots lying within radial limits of Rmin and Rmax (mm) from the direct beam position. The first step is to determine a radial background. The direction of this radial background is chosen to be at right angles to the rotation axis (to avoid any backstop shadow). It is normally centred on the direct beam position, but can be offset to one side by the "X offset" or "Y offset" parameters. If the radial background is along Y (the "fast" changing direction in the stored image), then use the "X offset" to change its position. If the default direction is along Y, and a value is entered for the "Y offset", this automatically changes the direction of the background strip to be along the X axis. Entering a negative value for either Rmin or Rmax will switch the background strip to the opposite side of the direct beam position.

The position of the background strip is shown as a red rectangle on the display. If necessary its position should be changed to avoid any shadows or other unusual features on the image.

Note that the LIMITS EXCLUDE keywords can be used to exclude rectangular regions of the detector from spot finding (and integration).

The minimum and maximum spot sizes (in X and Y) are expressed as a multiple of the median spot size. If the image is very strong and the threshold is too low, then two adjacent strong spots may be treated as a single spot (because the pixel values do not go down to the threshold in between them). This problem can be avoided by either increasing the Threshold, or by decreasing the Max X and Max Y sizes, as these merged spots will be almost twice as large as the average spot.
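A minimal sketch of the size test, assuming each spot is held as an (x_size, y_size) pair and that the limits are inclusive (both are assumptions made for illustration, not the program's internal representation). The limits 0.5 and 1.5 are taken from the FINDSPOTS example given later in this guide.

```python
from statistics import median


def filter_by_size(spots, min_x, max_x, min_y, max_y):
    """Keep spots whose X and Y sizes lie within the given multiples
    of the median spot size. 'spots' is a list of (x_size, y_size)."""
    med_x = median(s[0] for s in spots)
    med_y = median(s[1] for s in spots)
    return [
        s for s in spots
        if min_x * med_x <= s[0] <= max_x * med_x
        and min_y * med_y <= s[1] <= max_y * med_y
    ]


# Hypothetical sizes in pixels; the last entry is two merged spots,
# about twice as wide in X as the others.
spots = [(6, 6), (6, 5), (7, 6), (6, 6), (12, 6)]
kept = filter_by_size(spots, 0.5, 1.5, 0.5, 1.5)
print(kept)
```

With a median size of 6 pixels, the merged spot (12 pixels wide) falls outside the 1.5 x median limit and is rejected.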

"Min no of pix" sets the minimum number of pixels that constitute a proper spot.

Split spots will be treated as a single spot if they are less than "X splitting" and "Y splitting" mm apart in X and Y. Normally this is not a problem with image plate data.

In tricky cases (eg very weak spots on a high background) it may be necessary to add spots manually. This is done by clicking on spots with the mouse. You do not need to be exactly on the spot: the program will search the area in the vicinity of the mouse and find the centre of gravity, which will be used as the spot position.
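The centre of gravity mentioned above is an intensity-weighted mean position. The following Python sketch shows the calculation for a small patch of pixels; the data layout is hypothetical, and the size of the search area and any background subtraction used by the program are not specified in this guide.

```python
def centre_of_gravity(pixels):
    """Intensity-weighted mean position of a list of (x, y, intensity)."""
    total = sum(i for _, _, i in pixels)
    if total <= 0:
        raise ValueError("no positive intensity in the search area")
    cx = sum(x * i for x, _, i in pixels) / total
    cy = sum(y * i for _, y, i in pixels) / total
    return cx, cy


# Hypothetical 2x2 patch: the right-hand column is three times stronger,
# so the centre is pulled towards x = 11.
pixels = [(10, 20, 50.0), (11, 20, 150.0), (10, 21, 50.0), (11, 21, 150.0)]
cx, cy = centre_of_gravity(pixels)
print(cx, cy)
```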

3.1.1.2 Displaying found spots

The positions of the spots found are displayed as red crosses. Note however that ONLY spots which will be used for autoindexing (ie those with I/sig(I) greater than the threshold) are displayed. This is determined by the only parameter associated with autoindexing, the "Min I/sig(I)" parameter which follows the spot finding parameters in the "Processing parameters" window. This value defaults to 20.

It is a good idea to check that the program is correctly locating the spots and, in particular, that if the spots are very close and the image is strong it is not treating two neighbouring strong spots as a single spot (in which case the red cross will come half-way between the two spots). If this is a problem, try increasing the "Threshold" or decreasing the "Max X size" and "Max Y size".

3.1.1.3 Editing found spots

If the crystal is not single, and the program finds spots that do not lie on the major lattice, it is a good idea to remove these spots. If the second lattice is much weaker than the main lattice, it may be possible to do this just by increasing the "Min I/sig(I)" parameter. If this does not work, select the "Edit spots" option from the main menu. Identify spots to be deleted by clicking on them with the mouse; this will result in an "X" being written over that spot. Be careful, as the mouse must be quite close to the spot position in order to reject the spot. When editing is finished, click on "End edit" in the main menu.

The autoindexing algorithm is quite sensitive to the presence of "rogue" spots, so it is usually a good idea to reject them if the autoindexing is not successful.

3.1.2 Finding spots on other images

If you want to use more than a single image for the autoindexing (and this can provide successful autoindexing when using a single image fails) then read in another image using the "Read image" option in the Main Menu. The phi values will be read from the header (if they were not given on the original IMAGE keyword) or set automatically, assuming the image is part of a contiguous series, but the phi limits can be reset.

Then choose the "Find spots" option as described above. Note that there is a limit to the total number of spots that can be stored internally, which may place a limit on how many images can be used. Raising the spot finding "Threshold" will reduce the number of spots found if this causes problems. Note also that the REFIX autoindexing algorithm itself will use a maximum of 2000 spots.

3.1.3 Selecting images for autoindexing

If spots have been found on several images, then by default all of these spots will be used for autoindexing. If, however, you only want to use the spots from selected images, use the "Select images" menu option. The spots found on each image are stored in a separate "slot", and the "slot" numbers (rather than the image numbers) must be given when selecting the images (so that images with the same image number can be used). If you wish to make a fresh start, use the "Clear spots" menu option.

3.1.4 Running the autoindexing

Autoindexing uses either the FFT based algorithm from DPS (Steller, Bolotovsky and Rossmann (1997) J. Appl. Cryst. 30, 1036-1040) or Wolfgang Kabsch's REFIX program (Kabsch, 1988, 1993), both of which have been incorporated into the MOSFLM program.

Autoindexing is performed by selecting the "Autoindexing" menu item. The program will present a list of possible unit cells and space groups, sorted on the PENALTY of each solution, and the user has to select the appropriate choice. (When using REFIX, if a crystal symmetry AND unit cell have been applied then only solutions for this symmetry will be listed).

In this list, the first number is the number for that solution, the second number is a score for that solution (headed "PENALTY"). This is followed by the lattice type, the cell parameters and a list of possible spacegroups. Normally one would choose the solution with the highest possible symmetry, but which still has a reasonably low "PENALTY" (The LOWER the PENALTY the better).

example:

  18 150  cI   103.13   103.36   103.01    62.8  62.8  62.9  I23,I213,I432,
                                                             I4132
  17  64  tP    74.44    74.54    74.56    92.6  92.2  92.4  P4,P41,P42,P43,
                                                             P422,P4212,P4122,
                                                             P41212,P4222,
                                                             P42212,P4322,P43212
  16  63  oP    74.44    74.54    74.56    92.6  92.2  92.4  P222,P2221,P21212,
                                                             P212121
  15  63  tP    74.54    74.56    74.44    92.2  92.4  92.6  P4,P41,P42,P43,
                                                             P422,P4212,P4122,
                                                             P41212,P4222,
                                                             P42212,P4322,P43212
  14  62  hR   107.32   103.13   131.17    92.0  89.8 121.4  R3,R32
  13  41  oC   103.01   107.80    74.44    90.2  93.3  90.0  C222,C2221
  12  41  mP    74.54    74.44    74.56    92.2  92.6  92.4  P2,P21
  11  41  mP    74.56    74.44    74.54    92.4  92.6  92.2  P2,P21
  10  40  oC   103.01   107.80    74.44    89.8  93.3  90.0  C222,C2221
   9  40  mP    74.54    74.44    74.56    92.2  92.6  92.4  P2,P21
   8  26  cP    74.54    74.56    74.44    92.2  92.4  92.6  P23,P213,P432,
                                                             P4232,P4332,P4132
   7  24  mC   103.36   107.32    74.54    89.8  93.6  90.1  C2
   6  22  mC   103.36   107.32    74.54    89.8  93.6  90.1  C2
   5  19  aP    74.44    74.54    74.56    87.4  92.2  87.6  P1
   4   6  hR   107.32   107.52   123.58    89.9  90.0 119.8  R3,R32
   3   4  mC   103.36   107.32    74.54    90.2  93.6  89.9  C2
   2   2  mC   103.01   107.80    74.44    89.8  93.3  90.0  C2
   1   0  aP    74.44    74.54    74.56    92.6  92.2  92.4  P1

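In this case solution 4 (hR, with a PENALTY of 6) is the obvious choice, giving R3 or R32 as possible spacegroups: it has the highest symmetry of the solutions with a low PENALTY, and all solutions of higher symmetry have a much larger PENALTY.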
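The rule of thumb used above (highest symmetry that still has a low PENALTY) can be sketched in Python as follows. This only illustrates the reasoning and is not part of MOSFLM: the cutoff of 10 on the PENALTY is an arbitrary assumption, and lattices are ranked by the number of operations in the lattice point group, which is just one convenient way of expressing "higher symmetry". The choice must still be confirmed by the user.

```python
# Order of the lattice point group (holohedry) for each lattice type.
LATTICE_ORDER = {
    "aP": 2,
    "mP": 4, "mC": 4,
    "oP": 8, "oC": 8, "oI": 8, "oF": 8,
    "hR": 12,
    "tP": 16, "tI": 16,
    "hP": 24,
    "cP": 48, "cI": 48, "cF": 48,
}

# (solution number, PENALTY, lattice type) copied from the example listing.
solutions = [
    (18, 150, "cI"), (17, 64, "tP"), (16, 63, "oP"), (15, 63, "tP"),
    (14, 62, "hR"), (13, 41, "oC"), (12, 41, "mP"), (11, 41, "mP"),
    (10, 40, "oC"), (9, 40, "mP"), (8, 26, "cP"), (7, 24, "mC"),
    (6, 22, "mC"), (5, 19, "aP"), (4, 6, "hR"), (3, 4, "mC"),
    (2, 2, "mC"), (1, 0, "aP"),
]


def suggest_solution(solutions, max_penalty=10):
    """Highest-symmetry solution with PENALTY <= max_penalty;
    ties are broken in favour of the lower PENALTY."""
    candidates = [s for s in solutions if s[1] <= max_penalty]
    return max(candidates, key=lambda s: (LATTICE_ORDER[s[2]], -s[1]))


best = suggest_solution(solutions)
print(best)
```

For the listing above this picks solution 4 (hR), in agreement with the choice made by inspection.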

Once you have made a selection the autoindexing is repeated automatically imposing the appropriate cell constraints. You are then given the choice of accepting that solution or trying another one from the list. REMEMBER that the true spacegroup can only be determined from reflection intensities, NOT from unit cell parameters.

BEWARE of monoclinic spacegroups with beta angles close to 90 being misclassified as orthorhombic etc.

The final orientation (A matrix) and cell parameters are written to a file, which can be defined when the autoindexing procedure is initiated, or with the keyword NEWMAT (defaults to NEWMAT).

The file can be read in (MATRIX keyword) in future processing jobs.

If you want to change (permute) the order of the cell axes, simply include a CELL keyword giving the cell that you would like to fit. For example, in orthorhombic space groups the autoindexing will select the cell with a < b < c. If the cell dimensions are 50, 100, 150 and you want to have a=150, b=50, c=100 give the keyword:

cell 150 50 100
before running the autoindexing.

3.1.4.1 How do you know if it has worked correctly ?

The most obvious test of the success of the autoindexing is to predict the pattern using the "Predict" menu option and see if it matches the observed pattern.

If there was a large error in the input direct beam coordinates, with the REFIX autoindexing this is sometimes apparent in a shift of the predicted pattern relative to the observed spots. This shift can be corrected using the "Adjust" menu option. With the DPS indexing, the direct beam coordinates are automatically updated, so it should not be necessary to "Adjust" the pattern. If the shift is significant, it is probably worth repeating the autoindexing with the updated direct beam parameters (they are updated automatically by using "Adjust") as this will give more accurate cell parameters.

The single most important number by which to judge whether autoindexing has succeeded is the positional residual (standard deviation of spot position). This value should be below 0.2-0.3mm. If it is above 0.3mm the solution is highly suspect, and if above 0.4mm it is almost certainly wrong. Values of 0.08mm to 0.12mm are typical for a correct solution (Note that the positional residual will depend on the size of the diffraction spots. The values given here are for a spot size of about 6x6 pixels with a pixel size of 0.15mm, larger spots will give slightly larger residuals).
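The guideline values above can be collected into a small helper. This is a hypothetical Python sketch using the thresholds quoted in the text; treating the 0.2-0.3mm range as "borderline" is an interpretation, and the limits apply to the spot and pixel sizes mentioned above.

```python
def judge_residual(residual_mm):
    """Classify an autoindexing solution by its positional residual (mm)."""
    if residual_mm > 0.4:
        return "almost certainly wrong"
    if residual_mm > 0.3:
        return "highly suspect"
    if residual_mm > 0.2:
        return "borderline"
    return "acceptable"


for value in (0.10, 0.25, 0.35, 0.45):
    print(value, judge_residual(value))
```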

3.1.4.2 Errors reported by the autoindexing

Possible options if the autoindexing fails are:

1) Make sure the direct beam coordinates are correct ! (The autoindexing is quite sensitive to these). If necessary, record a powder pattern (eg from beeswax or paraffin wax), display this image and work out the coordinates of the centre of the rings.

2) Try changing the intensity threshold "Min I/sig(I)" in the "Processing parameters" window (up or down).

3) For the new DPS indexing, change the maximum cell edge.

4) Include data from other images (this could also give a more accurate cell).

5) Try to avoid images looking down a principal zone.

6) If you know the cell and spacegroup, the original version of REFIX (which requires input of a cell and the spacegroup) sometimes works when the new one does not !! Use menu option "Save spots" to write out a spot list which can be input to the original version of REFIX.

****BEWARE**** The program writes a list of spots when running the autoindexing from within MOSFLM, but this file has a different format to the one written using the "Save spots" menu option, and CANNOT be read by the original REFIX so don't try it !

3.2 Autoindexing when running the program in background

(Note that as far as MOSFLM is concerned, a job that directs output to a file rather than to a terminal is considered to be a background or batch job).

It is only possible to autoindex in a background job if the spacegroup and cell are known and this can only be done with the REFIX autoindexing.

Autoindexing is invoked by including the keyword AUTOINDEX. If no images are specified, the first image to be integrated (specified on the PROCESS keyword) will be used for autoindexing.

Thus :

CELL 107 107 123 90 90 120
SYMM R32
AUTOINDEX
PROCESS 1 TO 30 [ ANGLE 1.0 START 0.0 ]
will autoindex using image 1 and then integrate images 1 to 30.

Note that the cell derived from the autoindexing, rather than that given by the CELL keyword, will be used during integration. If the cell is known accurately it is usually better to override the cell derived from autoindexing by using the KEEP keyword:

CELL KEEP 107.73 107.73 123.59 90 90 120

If you want to include more than one image in the autoindexing they can be specified explicitly:

AUTOINDEX IMAGES 1 2 3

will use the first three images. In this case it is assumed that the phi values are read from the image header, or that these images form part of a contiguous rotation in phi. If this is not the case, the phi values can be specified explicitly:

AUTOINDEX IMAGE 1 PHI 0 1 IMAGE 20 PHI 50 51

If the image identifier (used to form the template for the image filename) is not the same for all images, it can also be specified explicitly:

AUTOINDEX IMAGE 1 PHI 0 1 IMAGE 20 PHI 50 51 IDENT test_2

Note that if PHI or IDENT are given, then only ONE image can be specified on each IMAGE keyword so that:

AUTOINDEX IMAGE 1 PHI 0 1 IMAGE 20 21 IDENT test_2

is NOT allowed.
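When command files are generated by a script, the rule above is easy to enforce by always writing one IMAGE keyword per image whenever PHI or IDENT is needed. The following Python sketch builds the AUTOINDEX line in that way. The function and the dictionary layout are hypothetical; only the keywords shown in the examples above are used.

```python
def autoindex_line(images):
    """Build an AUTOINDEX keyword line.

    'images' is a list of dicts with a 'number' and, optionally,
    'phi' (start, end) and 'ident'. This layout is an assumption.
    """
    if not any("phi" in im or "ident" in im for im in images):
        # Simple case: phi values come from the header or a contiguous series.
        numbers = " ".join(str(im["number"]) for im in images)
        return "AUTOINDEX IMAGES " + numbers
    parts = ["AUTOINDEX"]
    for im in images:
        # One image per IMAGE keyword, as required when PHI or IDENT is given.
        parts += ["IMAGE", str(im["number"])]
        if "phi" in im:
            start, end = im["phi"]
            parts += ["PHI", f"{start:g}", f"{end:g}"]
        if "ident" in im:
            parts += ["IDENT", im["ident"]]
    return " ".join(parts)


print(autoindex_line([{"number": 1}, {"number": 2}, {"number": 3}]))
print(autoindex_line([
    {"number": 1, "phi": (0, 1)},
    {"number": 20, "phi": (50, 51), "ident": "test_2"},
]))
```

The two calls reproduce the "AUTOINDEX IMAGES 1 2 3" and "AUTOINDEX IMAGE 1 PHI 0 1 IMAGE 20 PHI 50 51 IDENT test_2" examples given above.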

The "Min I/sig(I)" threshold can be set also:

AUTOINDEX THRESHOLD 30

Parameters associated with spot finding can be set with the FINDSPOTS keyword:

eg:
FINDSPOTS THRESHOLD 10 RMIN 25 RMAX 75 SPLIT 0.5 0.5
FINDSPOTS MINX 0.5 MAXX 1.5 MINY 0.5 MAXY 1.5 XOFFSET 25



4: Running the STRATEGY option

4.1 Overview of the STRATEGY option 4.2 Some Examples of the STRATEGY options 4.3 Determining the oscillation angle for each image (TESTGEN option)

Having determined the crystal orientation, one then needs to know what rotation range is necessary to collect a complete (or essentially complete) dataset. The STRATEGY option provides a very rapid and convenient means for doing this.

4.1 Overview of the STRATEGY option

The STRATEGY option allows the design of a data collection strategy in a semi-automatic way for a single axis rotation camera. It requires all the parameters normally used to process a set of images (crystal symmetry, orientation, crystal to detector distance, wavelength, detector type, direct beam position).

The rotation range (PHITOT) required to collect a complete dataset is determined from the crystal symmetry and orientation (eg 180 degrees for Laue group P2/m if rotating about the b axis, 90 degrees if rotating about a or c).

The program then determines the phi value (PHIZONE) which, for orthorhombic (or lower) symmetry, places an axis in the XZ plane (containing the X-ray beam and the rotation axis) or, for trigonal or higher symmetries, places the unique symmetry axis in the plane normal to the X-ray beam and containing the rotation axis. A reflection list corresponding to a total rotation of PHITOT starting at phi=PHIZONE is generated.

For orthorhombic spacegroups the algorithm used to calculate PHIZONE is not foolproof ! It works in approximately 90-95% of cases. When it does not work, the predicted completeness may be up to 3-4% less than what could be achieved using a different value of PHIZONE. If the predicted completeness is less than expected, try giving the ALTERNATE keyword as part of the STRATEGY command. This will use a different value for PHIZONE which may (rarely) give a higher completeness. As a rule of thumb, it should be possible to get at least 90% completeness for a total rotation of 60 degrees in two segments.

To save time, the true unit cell will be "shrunk" when generating the reflection lists. This can be controlled by the SPEEDUP subkeyword, but the program will calculate a sensible default SPEEDUP if none is specified.

This reflection list is then compared to a list of all unique reflections for this spacegroup and the completeness and multiplicity is calculated, both as a function of rotation and resolution.

It is assumed that all possible reflections are measured (ie none are lost because of spatial overlaps or because they extend over too many images). However, some reflections may be unobserved because they lie in the cusp region. The percentage of reflections within the cusp will depend on the wavelength, crystal symmetry and crystal orientation, and can be minimised by trying to orient the crystal so that the crystal axis closest to the rotation axis is at least THETAMAX degrees AWAY from the rotation axis, where THETAMAX is the maximum Bragg angle.
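THETAMAX follows from Bragg's law, lambda = 2 d sin(theta), evaluated at the resolution limit. The Python sketch below shows the calculation; the function name is hypothetical, the wavelength is the copper K-alpha value used in the example command file later in this section, and the resolution limit of 2.0 Angstroms is chosen purely for illustration.

```python
import math


def theta_max_degrees(wavelength, d_min):
    """Maximum Bragg angle (degrees) from Bragg's law,
    wavelength = 2 * d_min * sin(theta). Both inputs in Angstroms."""
    s = wavelength / (2.0 * d_min)
    if not 0.0 < s <= 1.0:
        raise ValueError("resolution limit not reachable at this wavelength")
    return math.degrees(math.asin(s))


theta = theta_max_degrees(1.5418, 2.0)
print(round(theta, 1))
```

For these values THETAMAX is a little under 23 degrees, so the crystal axis closest to the rotation axis should be offset from it by at least that amount to keep the cusp losses small.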

It is often possible to collect data with a very high percentage completeness with a total rotation significantly less than PHITOT. This will inevitably result in a lowering of the overall multiplicity, but if data collection time is limited (for example at a synchrotron source) it is preferable to obtain a dataset with high completeness and less than optimal multiplicity rather than an incomplete dataset with higher multiplicity ! Equally, if radiation damage is a serious problem, it is best to get a complete dataset first, and then collect additional images to increase the multiplicity.

If the total rotation angle to be collected is specified, and the number (up to 4) of discontinuous segments to be used, the program will determine the start and end phi values for each segment that will give the highest possible completeness. For example, a total rotation of 60 degrees in 2 segments for an orthorhombic spacegroup will result in the identification of two 30 degree segments which give the highest completeness.

If some data has already been collected from one (or more) previous crystals, the program will determine the starting phi value for the "current" crystal that will give the maximum completeness, with the assumption that the phi rotation for this crystal is such that the TOTAL rotation for ALL the crystals is PHITOT (This assumes that all crystals are mounted about the same axis). The user may also define the total rotation angle for the current crystal.

4.2 Some Examples of the STRATEGY option

Once an image has been autoindexed, select the "Strategy" option from the menu. The input for the STRATEGY option has to be given in the I/O window, initially at the MOSFLM => prompt.

4.2.1 No previous data has been collected

Enter the following keywords:
STRATEGY
GO

The program will determine the phi angle PHIZONE (see above), and generate a reflection list starting at that phi angle, for a total rotation determined by the Laue group. It will then generate a list of all unique reflections and merge the two lists. Finally it will give the completeness of the data for the rotation range generated:

 Optimum rotation gives  98.0% of unique data
 This corresponds to the following rotation ranges for the final run
 From   20.0 to 110.0 degrees
 Type "STATS" for full statistics
 ....
 STRATEGY =>
Typing STATS at the prompt will give a breakdown as a function of rotation angle and resolution, and a breakdown of the anomalous data.

If insufficient time is available to collect the full rotation range required, one can determine the best segments to collect to achieve maximum completeness. Type at the STRATEGY => prompt:

 STRATEGY => ROTATE 50 SEGMENTS 2
 STRATEGY => GO
(*** The ROTATE keyword MUST be given before the SEGMENTS keyword **) The program will then give the best phi ranges to collect for two segments, each of 25 degrees, giving a total rotation of 50 degrees:

 Optimum rotation gives  96.1% of unique data
 This corresponds to the following rotation ranges for the final run
 From   20.0 to  45.0 degrees
 From   65.0 to  90.0 degrees
One can try using 3 segments of data instead of two:

 STRATEGY => rotate 50 segments 3 
 STRATEGY => go
In this case, the result is:

 Optimum rotation gives  98.0% of unique data
 This corresponds to the following rotation ranges for the final run
 From   20.0 to  37.0 degrees
 From   52.0 to  69.0 degrees
 From   74.0 to  90.0 degrees
The effect of using other total rotations and different numbers of segments can also be tested (but using more than 3 segments is very time consuming and in fact there is an absolute limit of 4 segments).
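The segment widths in the examples above (25 + 25 degrees for two segments, 17 + 17 + 16 degrees for three) are consistent with dividing the total rotation as evenly as possible in whole degrees. The Python sketch below reproduces those widths. It is an assumption about how "approximately equal" segments are formed, not a description of the program's actual code, and it says nothing about where the segments are placed in phi.

```python
def segment_sizes(total_rotation, nseg):
    """Split a total rotation (whole degrees) into nseg nearly equal widths."""
    if not 1 <= nseg <= 4:
        # The guide states an absolute limit of 4 segments.
        raise ValueError("number of segments must be between 1 and 4")
    base, extra = divmod(total_rotation, nseg)
    # The first 'extra' segments are one degree wider than the rest.
    return [base + 1 if i < extra else base for i in range(nseg)]


print(segment_sizes(50, 2))
print(segment_sizes(50, 3))
```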

Alternatively, the completeness of specified segments can be tested:

START 0 END 20
START 65 END 90
GO
Note that the phi ranges specified on the START and END keywords MUST lie within the phi range generated by the program when it first starts. Thus if, for example, the program has generated reflections from phi=10 to phi=100 then it is not possible to try:

START 0 END 30
GO
or ROTATE 100 at the STRATEGY prompt (The program will complain).
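The restriction can be checked before typing the keywords. The following Python sketch (hypothetical helper functions, not part of MOSFLM) tests a START/END pair and a ROTATE value against the phi range for which reflections were generated.

```python
def segment_allowed(start, end, gen_start, gen_end):
    """True if START..END lies inside the generated phi range."""
    return gen_start <= start < end <= gen_end


def rotate_allowed(rotation, gen_start, gen_end):
    """True if the requested total rotation fits in the generated range."""
    return 0 < rotation <= gen_end - gen_start


# Reflections generated from phi=10 to phi=100, as in the example above.
print(segment_allowed(0, 30, 10, 100))   # START 0 END 30 is outside the range
print(segment_allowed(65, 90, 10, 100))
print(rotate_allowed(100, 10, 100))      # ROTATE 100 exceeds the 90 degrees available
```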

4.2.2 When some data have already been collected

It is also possible to deal with the case where some data have already been collected (from the same or from other crystals).

a) Data from the same crystal

Consider the case where 30 degrees of data have been collected (from phi = -10 to phi = 20 say), and we want to determine how best to complete the dataset with an additional rotation of 40 degrees.

Select the "Strategy" menu option, and enter the following keywords:

STRATEGY START -10 END 20 PARTS 2 SPEEDUP 10 (Note use of SPEEDUP keyword)
GO
STRATEGY ROTATE 40 SEGMENTS 2
GO
The program will then find the phi values for the two segments (each of 20 degrees) which when combined with the 30 degrees of data already obtained will give the maximum completeness.

b) Data from different crystals

Imagine that data have been collected from phi = -20 to 15 from an orthorhombic crystal with an orientation matrix "xtal_1.mat"

A second crystal is mounted, an image collected and it is autoindexed to give an orientation matrix "xtal_2.mat". The STRATEGY option can now determine the best phi range for this second crystal to complete the data.

First, specify the orientation of the first crystal using "Keyword Input" and the keyword MATRIX:

MATRIX xtal_1.mat

Then autoindex the first image of the second crystal. This MUST be done AFTER the orientation matrix for the first crystal has been specified, because the orientation of the second crystal has to be referred to that of the first.

Then select the "Strategy" menu option, and enter the following keywords:

MATRIX xtal_1.mat
STRATEGY  start -20 end 15 PARTS 2 SPEEDUP 10
GO
MATRIX xtal_2.mat
STRATEGY AUTO
GO
Normally the first crystal will have been collected starting at a zone. If this is NOT the case, it will probably be necessary to collect two segments of data from the second crystal to get complete data. This can be done by specifying "STRATEGY AUTO SEGMENTS 2" for the second crystal, and it may be advantageous to specify the sizes of the two segments. Thus if the first crystal was collected starting at 15 degrees away from a zone, for a total of 35 degrees, then the second crystal will need one segment of 15 degrees and another of 40 degrees (90-35-15) to get the best completeness.
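The arithmetic in this example can be written out as a short Python sketch. The function is hypothetical and assumes, as in the text, that a total rotation of PHITOT starting at a zone is required (90 degrees for the orthorhombic case here).

```python
def remaining_segments(phitot, offset_from_zone, collected):
    """Sizes of the two gaps left on either side of the data already
    collected, measured from the zone over a total rotation of phitot."""
    before = offset_from_zone
    after = phitot - collected - offset_from_zone
    if before < 0 or after < 0:
        raise ValueError("collected range extends beyond the required rotation")
    return before, after


# First crystal: started 15 degrees from a zone and covered 35 degrees.
print(remaining_segments(90, 15, 35))
```

The two values returned are the segment sizes that could be given with the SIZES subkeyword for the second crystal.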

It is also possible to automatically find the best rotation(s) for a smaller total rotation. Once the program has come up with the STRATEGY => prompt (ie after it has found the best solution for a single 55 degree rotation in the above case) one can then type:

STRATEGY => PART 1                    ! Include all data from first crystal
STRATEGY => AUTO ROTATE 40 SEGMENTS 2 ! Use 2 segments (each 20 degrees) for
                                      ! second crystal
This means include ALL data that have already been collected (from -20 to +15 in the above example) and then determine the best phi values giving a total rotation of 40 degrees (in two 20 degree segments) from the second crystal.

4.2.3 Optimising anomalous data collection

To optimise the number of anomalous pairs rather than the completeness of the unique data simply include the subkeyword ANOMALOUS:

STRATEGY ROTATE 60 SEGMENTS 2 ANOMALOUS SPEEDUP 10

This will not necessarily give the same phi range(s) as those which maximise the overall completeness.

4.2.4 A complete list of the STRATEGY subkeywords

STRATEGY subkeywords

subkeywords: AUTO ROTATE SEGMENTS SIZES START END PARTS SPEEDUP ANOMALOUS

AUTO Determine the starting phi angle and the phi rotation required to give a complete dataset (if possible from a single crystal setting), and give statistics on completeness and multiplicity. Do NOT use START or END with the AUTO keyword. This is the default mode of running STRATEGY.

ROTATE <phirot> Only for use with the AUTO option. Restrict the total rotation to "phirot" degrees.

SEGMENTS <nseg> Only for use with the AUTO option. Allow "nseg" discontinuous segments of data to give a total rotation of "phirot" degrees. Unless specified explicitly with the SIZES keyword (see below) the segments will have approximately equal widths in phi.

SIZES <size1,size2,size3...> The sizes for the "nseg" segments. If SIZES are given, then the "phirot" value given on the ROTATE keyword is ignored, and the total rotation is the sum of the SIZES. Default: Use approximately equal sizes with total "phirot"

START <phistart> END <phiend> As an alternative to AUTO mode, specify the start and end phi values to be used in generating the reflection list. Up to 10 different sets of START and END can be given on successive STRATEGY keywords, eg:

STRATEGY START 0 END 30
STRATEGY START 35 END 60
STRATEGY START 70 END 90

PARTS <nparts> If some data have already been collected (from the same or other crystals), set "nparts" to the total number of segments of data already collected plus one (which is the current crystal or segment whose phi range is to be determined). This need only be given on the first STRATEGY keyword.

SPEEDUP <n> Speed up the calculation by a factor "n".

ANOMALOUS Optimise anomalous pairs rather than completeness of data.

4.3 Determining the oscillation angle for each image (TESTGEN option)

The completeness analysis assumes that NO reflections are spatially overlapped. Providing that spots within a lune (ie in the same plane in reciprocal space) are not overlapping, spatial overlaps can usually be reduced to an acceptable level by an appropriate choice of oscillation angle for each image.

The TESTGEN option will calculate the maximum allowed oscillation angles as a function of the phi value for a given maximum acceptable percentage of overlapped reflections (which can be zero).

The determination of whether or not a reflection is spatially overlapped depends crucially on the mosaic spread, beam divergence parameters and the minimum allowed spot separation. The mosaic spread and the minimum spot separation can be reset at the STRATEGY prompt to test how critical these values are, using keywords MOSAIC and SEPARATION respectively.

Remember that for post-refinement to work, the oscillation angle must be more than half the sum of the mosaic spread and beam divergence.

Note that the TESTGEN keyword can be given at the STRATEGY prompt or at the normal MOSFLM prompt (without running the STRATEGY option).

Keywords:

TESTGEN subkeywords

subkeywords: START END STEP OVERLAP MINOSC MAXOSC ANGLE

START <phstart> Define the starting phi. This keyword MUST be given.

END <phend> Define the ending phi. This keyword MUST be given.

STEP <phstep> The optimum rotation angle will be calculated every "phstep" degrees between "phstart" and "phend". Default: 5 degrees

OVERLAP <x> The maximum rotation angle giving less than x% overlapped reflections will be calculated. Note that x is in PERCENT. Default 0%

MINOSC <rotmin> MAXOSC <rotmax> Only rotation angles between "rotmin" and "rotmax" will be considered. Default: rotmin 0.2, rotmax 5.0

ANGLE <oscang> If the ANGLE keyword is given, then the overlap for a fixed oscillation angle "oscang" is calculated between phi=phstart and phi=phend. No attempt to find the "best" oscillation angle is made.

example:

TESTGEN START 0 END 90 OVERLAP 3 MINOSC 0.5

Exiting the STRATEGY option

Use keyword EXIT to end the strategy option.

An Example command file when not using the X-window menu

In this case, the type of detector (MAR, SMALLMAR, RAXIS etc) has to be specified, together with the crystal to detector distance and the wavelength, as these cannot be read from the image header.

STRATEGY AUTO
DISTANCE 80
DETECTOR SMALLMAR
MATRIX lyso_1.mat
SYMM 19
BEAM  90 90
DIVERGENCE 0.35 0.3
MOSAIC 0.2
SEPARATION 1.5 1.5
POLARISATION MONOCHROMATOR
WAVELENGTH  1.5418
RUN



5: Determining Accurate Cell parameters

5.1 Using Post-refinement to refine the cell 5.1.1 Doing post-refinement interactively 5.1.2 Doing post refinement in background 5.1.3 What the program actually does 5.1.4 Using several segments or different crystals 5.1.5 Tips on post-refinement 5.1.5.1 Processing images showing strong diffuse scatter

The unit cell parameters are refined as part of the autoindexing, but in general not all the parameters will be well defined (in particular, the cell parameter along the X-ray beam direction is ill-determined). Improved values can be obtained by using two or more images widely separated in phi for the autoindexing. However, accurate cell parameters are best determined by post-refinement, for which it is necessary to have a number (at least two) of abutting oscillation images. To obtain accurate cell parameters for orthorhombic or lower symmetry spacegroups, it is essential to have data from two orientations widely separated in phi, but for trigonal or higher symmetry only one "block" of data is normally required.

A pragmatic procedure is as follows:

If more than about 15 degrees of data are available from a single crystal, or several crystals in approximately the same orientation (within 20 degrees) use the "Refine cell" menu option (or the POSTREF SEGMENT option if running in background) to get an accurate cell and then do NOT refine it during integration.

If less than 15 degrees is available, use the refined cell from the autoindexing in processing and try post-refinement using an angular wedge of data, but if this is unstable (large sd's or shifts from cycle to cycle) then fix the cell parameters, as the values from the autoindexing, while they may be in error, will be sufficiently accurate to process a "local region" in reciprocal space, ie up to 10-15 degrees from the starting phi value.

5.1 Using Post-refinement to refine the cell

Post-refinement uses the distribution of the intensity of partially recorded reflections over the two images on which the partial is recorded to refine cell parameters, orientation and mosaic spread. It has the distinct advantage that the derived cell parameters are entirely independent of all detector parameters (crystal to detector distance and detector orientation) and distortions (ROFF and TOFF) which, if inaccurate, can lead to significant errors in the cell parameters derived from autoindexing.

**** IMPORTANT ****

Post-refinement can ONLY use partially recorded reflections which extend over two images. Those which extend over three or more images CANNOT be used. Thus if the mosaic spread (plus beam divergence) is more than twice the oscillation angle, post-refinement will NOT be possible. Under these circumstances, a larger oscillation angle should be used, or alternatively the data can be processed in "blocks" of images, determining a new orientation for each block by autoindexing.
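The condition above can be checked before data collection. The Python sketch below is a hypothetical helper: it treats the beam divergence as a single number (the DIVERGENCE keyword takes two values, and which of them to use here is left as an assumption), and requires the oscillation angle to be more than half the sum of the mosaic spread and beam divergence.

```python
def post_refinement_possible(oscillation, mosaic, divergence):
    """True if partials can be confined to two images, ie the oscillation
    angle exceeds half of (mosaic spread + beam divergence). Degrees."""
    return oscillation > (mosaic + divergence) / 2.0


# Hypothetical values: mosaic spread 0.2 and divergence 0.35 degrees.
print(post_refinement_possible(1.0, 0.2, 0.35))
# A very mosaic crystal (0.5 degrees) with fine 0.2 degree oscillations.
print(post_refinement_possible(0.2, 0.5, 0.35))
```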

5.1.1 Doing post-refinement interactively

Having obtained the crystal orientation by autoindexing, choose the "Refine cell" menu option. You can then select the number of "segments" of data to use in the refinement, the first image and the number of images to be used in each segment. Note that there must be at least two images in each segment, but there is generally little to be gained from using a total of more than 5-10 images in ALL segments (unless there are only a few partials on each image).

Note that when using data from two segments widely separated in phi, it is possible that the crystal orientation will have changed sufficiently that the orientation matrix for the first segment of data does not accurately predict the first image of the second segment. This can be quickly checked by reading in this image ("Read image" menu option) and then predicting the pattern ("Predict"). If the prediction is poor, there are two things that can be done. The first is to find spots on this second image and use them (together with the spots from the first image of the first segment) to repeat the autoindexing. This may give a matrix that predicts both images successfully, and should work unless the crystal orientation has genuinely changed between the two images (or the rotation axis is not normal to the X-ray beam). If this does not work, the second is to derive a new orientation matrix for the image from the second segment. Remember to change the name of the file that the orientation matrix will be written to. REMEMBER to delete all the spots used to autoindex the first image if you have not already done so. Then use the "Autoindex" option to get an orientation matrix for this image. Under these circumstances, it is best to FIX the cell parameters for the autoindexing to those determined for the first segment of data (not possible for DPS indexing). This is because only a single set of cell parameters is allowed (for all segments) when doing the post-refinement. The "Refine cell" procedure allows you to define a separate orientation matrix for each segment of images.

Because the post-refinement uses partially recorded reflections, it is important to have a realistic estimate of the mosaic spread BEFORE starting post-refinement. In particular, if no value has been supplied (ie the mosaic spread is zero) the program will issue a warning message because it is unlikely that the post-refinement will work. The simplest way to obtain an initial estimate of the mosaic spread is to "Predict" the pattern for several values (eg 0.1,0.2,0.3,0.5,0.75,1.0 etc) and see which value gives the best fit to the observed pattern. The postrefinement will give a refined estimate of the mosaic spread.

5.1.2 Doing post refinement in background

To use this option, the keyword :

POSTREFINEMENT SEGMENT <number of segments>

should be used, followed by PROCESS keywords (see 6.3.1) defining the images to be included in each segment, with each PROCESS keyword followed by a RUN keyword.

Example:

NEWMAT postref_3seg.mat      ! Defines the filename for the new matrix
POSTREF SEGMENT 3 
PROCESS 1 3 [ ANGLE 1.0 START 0.0 ] 
RUN 
PROCESS 43 45 [ ANGLE 1.0 START 42.0 ] 
RUN 
MATRIX test_88.mat 
PROCESS 86 88 [ ANGLE 1.0 START 85.0 ] 
RUN

This would use 3 segments of data (with phi values 0-3, 42-45 and 85-88). Note that a new MATRIX keyword has been given for the last segment, which could be necessary if the crystal has slipped during data collection. See section 5.1.1 for the best procedure to use when deriving an orientation matrix for the second or subsequent segments.

Note that the procedure uses only partially recorded reflections, and so in this case it would use partials that span images 1 and 2, then 2 and 3, for the first segment etc. For this reason the PROCESS keyword MUST specify at LEAST 2 images.

PROCESS 1 1 ANGLE 1.0 START 0.0 would provide NO data for post-refinement.
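To make the point concrete, the image pairs that can contribute partials for a given PROCESS range can be listed as follows. This is only an illustrative Python sketch; the function name is hypothetical and it simply enumerates adjacent images, which is all the rule above requires.

```python
def partial_pairs(first_image, last_image):
    """List the adjacent image pairs (n, n+1) inside one PROCESS range.

    Only partials split across two adjacent images are usable, so a
    range containing a single image yields no pairs at all.
    (Hypothetical helper, not a MOSFLM routine.)
    """
    return [(n, n + 1) for n in range(first_image, last_image)]


print(partial_pairs(1, 3))    # pairs for PROCESS 1 3
print(partial_pairs(1, 1))    # PROCESS 1 1 gives an empty list
```

For the three-segment example above this gives two pairs per segment, six pairs of images in total.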

5.1.3 What the program actually does

During postrefinement, the images are not fully integrated (only the intensities of partially recorded reflections are measured, and by summation integration rather than profile fitting) so there is no output generate file or MTZ file. The crystal orientation will be refined for every image independently, but the cell parameters will only be refined once the final segment of data has been processed.

(Note that the very last image (88 in the example above) is apparently (from the logfile) not measured at all...this is NOT an error, since the intensities of the partials at the start of image 88 are obtained while processing image 87.)

If the cell parameters change by more than 2.5 standard deviations from the input values, all images will be remeasured using the updated cell and another round of cell parameter post-refinement will be carried out. This will happen up to a maximum of 4 repeats. It is quite common that two or even three complete rounds of integration are required for convergence. For this reason it is not a good idea to include too many images in the refinement. A target of between 500 and 2000 reflections in the refinement is perfectly adequate.
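The remeasurement test can be pictured as a simple comparison of each shift against a multiple of its standard deviation. The Python sketch below is an assumption about the logic, not the program's code: the function name is hypothetical, the standard deviations are taken to be those of the refined parameters, and the cell values are invented purely for illustration.

```python
def needs_remeasure(input_cell, refined_cell, sigmas, shiftfac=2.5):
    """True if any cell parameter moved by more than shiftfac sigmas.

    input_cell, refined_cell and sigmas are sequences of equal length
    (for example a, b, c). Hypothetical helper, not a MOSFLM routine.
    """
    return any(
        abs(new - old) > shiftfac * sd
        for old, new, sd in zip(input_cell, refined_cell, sigmas)
    )


# Illustrative numbers only (not from a real refinement).
start = (78.10, 78.10, 37.20)
sds = (0.04, 0.04, 0.02)

# a has shifted by 0.15, more than 2.5 * 0.04 = 0.10: remeasure
print(needs_remeasure(start, (78.25, 78.24, 37.21), sds))
# all shifts are within 2.5 sigma: converged
print(needs_remeasure(start, (78.15, 78.14, 37.21), sds))
```

In the program this test would be repeated after each round, up to the limit of 4 repeats mentioned above.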

It is recommended that the final cell parameters are then used to integrate all the images in the dataset, fixing the cell parameters in the post-refinement:

POSTREF FIX ALL

5.1.4 Using several segments or different crystals

Note that if the crystal has been slipping during data collection, it is possible to provide different MATRIX keywords for each segment of data, and supply a new orientation (eg derived by autoindexing the first image of the segment). When doing this, the orientation matrices for all segments (including the first) SHOULD BE OBTAINED FROM THE SAME INTERACTIVE RUN OF MOSFLM. This ensures that the matrices for the second and subsequent segments are all referred relative to the orientation matrix for the first segment. It is also a good idea to FIX the cell parameters when autoindexing the images from the second and subsequent segments, as only one set of cell parameters is allowed when refining the cell by post-refinement. It is also possible to provide new crystal identifiers for each segment (eg if the crystal has been translated and the images given a different identifier). It is also possible to use data from different crystals, but in this case there is the restriction that the orientation of the crystals must be the same (to within 20 degrees) and the relative phi values must be correct. Providing the different crystals are all indexed in the same run of MOSFLM, the relative phi values are taken care of automatically.

A possible complete example is then:

TITLE  Refine cell with 3 segments
DIVERGENCE 0.35 0.2
SYMMETRY 96
[DISTANCE 124.1]
[WAVELENGTH 1.542]
DIRECTORY /scr0/andrew/
BEAM 89.33 90.10
GAIN 1.2
NEWMAT postref_3seg.mat
POSTREF SEGMENT 3
IDENT oval1
MATRIX oval1.mat
PROCESS 1 3 [ANGLE 1.0 START 0.0]
RUN
IDENT oval2
MATRIX oval43.mat
PROCESS 43 45 [ANGLE 1.0 START 42.0]
RUN
IDENT oval3
MATRIX oval86.mat
PROCESS 86 88 [ANGLE 1.0 START 85.0]
RUN

When doing post-refinement, the crystal orientation around the X-ray beam direction (the X axis) is not defined (the refinement is based solely on the observed degree of partiality and not on the positions of the spots) and this parameter is therefore not refined, but missetting angles around Y and Z axes are refined (see Appendix IV for a definition of coordinate frames). The refinement of the detector parameter CCOMEGA allows for crystal slippage around the X-ray beam direction.

If only a narrow angular wedge of data is available for a low symmetry spacegroup (orthorhombic or lower) it is possible to FIX cell parameters that are not well defined (those closest to the direction of the X-ray beam).

eg POSTREF FIX A

5.1.5 Tips on post-refinement

In the great majority of cases the post-refinement will provide accurate cell parameters without any user intervention (providing the mosaic spread estimate is realistic). There are, however, some special cases where additional input is required to get the best results.

5.1.5.1 Processing images showing strong diffuse scatter.

It is not uncommon to observe diffuse scatter on the images, particularly for data collected at a synchrotron source. Sometimes this takes the appearance of a "halo" around the Bragg spot, because the intensity of some types of diffuse scatter peaks at the positions of the Bragg reflections. This can cause difficulties in post-refinement, because it has the same effect as a crystal with a very large mosaic spread. Under these circumstances, it is best to refine the cell parameters using spots that are close to half-recorded, as the refinement is then less sensitive to the model for the "rocking curve". The minimum and maximum fraction recorded can be specified as shown below:

POSTREF FRMIN 0.4 FRMAX 0.6

will only use reflections that are between 0.4 and 0.6 recorded. (Default is 0.1 to 0.9).



6: Collecting data and processing the images

6.1 Overview
6.2 Special MOSFLM features
6.2.1 Accumulating profiles over several images
6.2.2 Addition of partials (ADDPART)
6.2.3 Post-refinement of orientation and cell parameters
6.2.4 Optimisation of measurement box parameters
6.3 Running a processing job
6.3.1 Running MOSFLM interactively
6.3.2 Processing the first block of data (Non-interactively)
6.3.3 Finally, Processing the dataset

6.1 Overview

Before starting the serious data collection, integration of one or more images should be carried out to determine:

a) Is the crystal single?
b) Is the exposure time correct?
c) Is the crystal to detector distance correct (ie the whole of the detector is being used)?
d) Can the images be processed...are the spots separated and is the number of spatial overlaps small?

6.2 Special MOSFLM features

There are 4 features of MOSFLM which are unusual and require explanation. These are:

  1. Accumulation of standard profiles over several images.
  2. Addition of partially recorded reflections over adjacent images.
  3. Post-refinement of cell parameters and crystal orientation.
  4. Optimisation of the measurement box parameters.

6.2.1 Accumulation of profiles

In order to form well defined standard profiles (which are then used to evaluate the profile fitted intensities) fully recorded (or partially recorded) reflections over several images are added together. This improves the signal to noise and results in a better determined profile. The number of images used to form the profiles (usually between 5 and 10) is determined automatically by the program (in a way that avoids having just a few images in the final block). It can also be set manually by the BLOCK subkeyword on the PROCESS keyword line.

The positional refinement for all images in a block is carried out prior to forming the standard profiles and integrating the images. Thus each image is processed in two passes, the first pass for the positional refinement and writing all the "measurement boxes" for the spots to the SPOTOD file, and the second for actually evaluating the reflection intensities.

6.2.2 Addition of partials (ADDPART option)

The program has the option to add together the measurement boxes of the two halves of partially recorded reflections on adjacent images, thus giving the equivalent fully recorded reflection which can then be used to form standard profiles or for positional refinement of the detector parameters.

To make use of this option, the keyword:

ADDPARTials

should be given. (The default is now NOT to add partials).

Note that this procedure, which involves adding pixel values on two adjacent images, involves two assumptions:

Assumption 1

That the images have the same effective exposure time (ie total incident flux). If the rate of rotation of the spindle axis is determined by the ionisation chamber reading (as it may be on the MAR detector) then this assumption should be met. If not, then there may be an error introduced by this procedure especially if the incident beam is rapidly decaying (eg on an unstable synchrotron source). If post-refinement is being used (and by default it is used) then the program will print a warning message (to the summary file and the end of the logfile) if the exposure varies by more than 5% from one image to the next (as judged by the X-ray background).

Assumption 2

That the detector origin, orientation etc is identical for successive images, and that the images are exactly abutting (ie no overlap in rotation angle). These conditions will normally be met by the Mar, R-axis and Mac Science scanners, but mechanical wear can lead to the scanner not locking into the correct "home" position after a scan (it does one more or one too few rotations). This will show up as a variation in the ROFF distortion parameters in units of one pixel (0.15mm). The program keeps track of variations in ROFF,TOFF and CCOMEGA and will give a warning message if undue variation is detected.

**** IMPORTANT *****

If either of these assumptions is not met (this will be indicated by warning messages) then the ADDPART option should not be used.

With ADDPART, what are actually partially recorded reflections over 2 images are reclassified as fully recorded when stored in the MTZ file and they will therefore be used in scaling (SCALA or ROTAVATA). However, summed partials do carry a special flag, so that they are still classified as partials in the statistical analysis in SCALA or AGROVATA. Thus information on partial bias, for example, is still available.

Because of the ability of SCALA to scale data when there are no fully recorded reflections, the use of this option is less important than it once was. Because its use depends on the assumptions listed above, which may not always be met, the DEFAULT is now NOT to add partials.

6.2.3 Post-refinement of cell parameters and crystal orientation

By default the program will refine both cell parameters and crystal orientation using post-refinement during integration of the images. However, it is in fact preferable to determine accurate cell parameters prior to integration using the Refine cell menu option for interactive work or the POSTREF SEGMENT option in a background job. The resulting cell parameters are then input using a CELL keyword and the cell is NOT refined during integration (by using keywords POSTREF FIX ALL). This will refine the crystal orientation (and mosaic spread) but not cell parameters.

If cell parameters are refined in a processing job, the way in which the refinement is carried out depends on the crystal spacegroup. For crystals of trigonal or higher symmetry, data from each pair of images in turn is used in the refinement. (This is equivalent to the POSTREF SINGLE mode.) Thus cell parameters, crystal orientation and mosaic spread are refined after every image using intensities on that image and the next one in the series. (For off-line scanners, reflections on the current image and the preceding image are used).

For lower symmetry this is not recommended, because not all the cell parameters will be well defined using data from only one pair of images. Thus for orthorhombic and lower symmetries data is accumulated from a number of images and only then will cell parameter refinement be carried out (the crystal orientation is still refined after every image as this is well defined). By default, the number of images required for the cell parameter refinement (NADD) is set to correspond to a rotation of 10 degrees. However, this can be changed using the WIDTH subkeyword. Thus:

POSTREF WIDTH 15

specifies that 15 degrees of data must be accumulated for post-refinement of cell parameters. The actual WIDTH of data required for a satisfactory refinement will depend on the resolution (the higher the resolution, the fewer images are required) and the strength of the data (the weaker the data, the more images are required). Some experimentation may be required to find a WIDTH that gives a stable refinement. If the refinement appears unstable (ie large shifts in cell parameters) the WIDTH should be increased. If this is not possible (eg only a limited number of images have been obtained from the crystal before radiation damage set in) then the refinement of some or all cell parameters should be turned off. Thus

POSTREF FIX ALL

will fix all cell parameters.

POSTREF FIX C

will fix the "c" cell parameter etc. Normally one would fix the cell parameter that is closest to being parallel to the X-ray beam as this will be the least well defined. Alternatively, look at the standard deviations of the cell parameters to see which one(s) are least well defined. Normally the cell parameters obtained from autoindexing are quite adequate to measure 10-20 degrees of data from the image on which the autoindexing was run.

Once the appropriate number of images (NADD) have been processed, and the cell parameters have been refined for the first time, if there is a large shift in any cell parameter the program will start processing from the first image again, using the updated cell parameters. The maximum shift allowed is determined by the subkeyword SHIFTFAC. Thus

POSTREF SHIFTFAC 5

sets the maximum shift to 5 standard deviations; if the shift is larger than this the images will be reprocessed. The default value is 2.5.

From this point on, the cell parameters will be re-refined after every image, using data from the previous NADD images. For example, with 1 degree oscillation images and a width of 10 degrees, the first cell refinement will be carried out after processing image 10, using data from images 1 to 10. After processing image 11, cell parameters will be refined using data from images 2-11 etc. etc.
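The sliding window of images described above can be written out as a short calculation. This Python sketch is illustrative only: the function names are hypothetical, and rounding NADD up to a whole number of images is an assumption, since the text does not say how a WIDTH that is not a multiple of the oscillation angle is handled.

```python
import math


def images_per_refinement(width_deg, oscillation_deg):
    """Number of images (NADD) needed to cover width_deg of rotation.

    Rounds up to a whole image (an assumption). The small epsilon
    guards against floating point noise in the division.
    """
    return math.ceil(width_deg / oscillation_deg - 1e-9)


def refinement_window(current_image, nadd, first_image=1):
    """First and last image used for cell refinement after processing
    current_image, or None if fewer than nadd images are available."""
    if current_image - first_image + 1 < nadd:
        return None
    return (current_image - nadd + 1, current_image)


nadd = images_per_refinement(10.0, 1.0)    # default WIDTH, 1 degree images
print(nadd)
for image in (9, 10, 11):
    print(image, refinement_window(image, nadd))
```

With POSTREF WIDTH 15 and 0.5 degree images the same calculation gives a window of 30 images.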

The missetting angles should ALWAYS be refined by post-refinement, but it may be necessary in some cases to suppress or limit refinement of cell parameters if the refinement is not stable.

The crystal mosaic spread is also refined by default, but the refined value IS NOT USED BY DEFAULT. This is because if the refinement is unstable, this can have rather drastic effects on the processing. If the refinement is stable, and there is evidence for a change in mosaic spread during the run (this often results from radiation damage), the refined values should be used by including the subkeyword USEBEAM:

POSTREF USEBEAM

If you wish to refine the horizontal and vertical beam divergence independently (good data is required to do this) use BEAM 2 :

POSTREF BEAM 2

Again, you need to include USEBEAM to actually make use of the refined values.

6.2.4 Optimisation of measurement box parameters

By default the program will automatically determine the best measurement box parameters. It will first determine the spot size from spots in the centre of the image (parameters for this search are set by keyword SPOT). This information is used to set initial sizes for the overall dimensions (NXS,NYS in figure below) and the corner and rim parameters (NC,NRX,NRY). Following detector parameter refinement using spots from the centre of the first image, the program will then optimise the rim and corner parameters NRX,NRY and NC.


            <------------------ NXS = 23 --------------->

       ^    - - - - - - - - - - - - - - - - - - - - - - -  ^
       !    - - - - - - - - - - - - - - - - - - - - - - -    NRY =2
       !    - - - - - -                       - - - - - -  ^
       !    - - - - -                           - - - - -
       !    - - - -                               - - - -
       !    - - -                                   - - -
       !    - - -                                   - - -
       !    - - -                                   - - -
    NYS =17 - - -                                   - - -
       !    - - -                                   - - -  ^
       !    - - -                                   - - -  !
       !    - - -                                   - - -  !
       !    - - - -                               - - - -  !
       !    - - - - -                           - - - - -  NC =8
       !    - - - - - -                       - - - - - -  !
       !    - - - - - - - - - - - - - - - - - - - - - - -  !
       ^    - - - - - - - - - - - - - - - - - - - - - - -  ^
            <NRX> = 3

Figure 1. The measurement box used in MOSFLM. NXS and NYS (odd integers) define the overall size of the measurement box in pixels. NRX and NRY define the widths of the background rim and NC defines the corner cutoff. In the figure a "-" denotes a background pixel, all other pixels belong to the peak. There is no "safety rim" between peak and background.

The algorithm employed is as follows. The parameters NRX, NRY, NC are varied in turn and the value giving the highest ratio of the integrated intensity I to the standard deviation in the intensity sigma(I) is found. This is the notional optimum value for that parameter. The total intensity for this value is checked, and if it is less than the maximum intensity (for any value of the parameter) by more than a factor 0.01 (TOLERANCE), then that parameter is decreased by up to IBOUND pixels.

For example, the max I/sigma(I) might be found for an X rim value of 4, but the intensity might be only 97% of the maximum intensity found for any value of NRX (eg 1). NRX will therefore be decreased, one pixel at a time, and for each value the integrated intensity tested against the maximum value. If for NRX=2, the intensity is within 0.01 (ie 1%) of the maximum value, then 2 will be taken as the optimal value for NRX.
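The search for one parameter can be sketched in Python as below. This is an interpretation of the description, not the program's code: the function name, the dictionaries of trial results and the default ibound of 2 are all hypothetical, the intensities and sigmas are invented to reproduce the NRX example, and stopping at the IBOUND limit when the tolerance is still not met is an assumption.

```python
def optimise_parameter(intensity, sigma, tolerance=0.01, ibound=2):
    """Choose a rim or corner parameter value.

    intensity and sigma map each trial parameter value to the integrated
    intensity I and its standard deviation sigma(I).
    Step 1: take the value with the highest I/sigma(I).
    Step 2: while I at that value is more than `tolerance` (as a
            fraction) below the largest I seen for any value, decrease
            the parameter by one pixel, by at most `ibound` pixels.
    """
    best = max(intensity, key=lambda v: intensity[v] / sigma[v])
    i_max = max(intensity.values())
    lowest_allowed = best - ibound
    value = best
    while (intensity[value] < (1.0 - tolerance) * i_max
           and value - 1 >= lowest_allowed
           and (value - 1) in intensity):
        value -= 1
    return value


# Invented numbers mirroring the NRX example: I/sigma(I) peaks at
# NRX=4, where I is 97% of the maximum (found at NRX=1).
intensity = {1: 1000.0, 2: 995.0, 3: 985.0, 4: 970.0}
sigma = {1: 50.0, 2: 45.0, 3: 42.0, 4: 40.0}

print(optimise_parameter(intensity, sigma))                  # settles on 2
print(optimise_parameter(intensity, sigma, tolerance=0.04))  # stays at 4
```

The second call shows the effect of a larger TOLERANCE: the rim stays wide, so the peak area is smaller.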

Thus it can be seen that the higher the TOLERANCE parameter, the SMALLER the optimised peak area will be. It is difficult to define a "correct" value for TOLERANCE, because this will depend on the degree of diffuse scatter associated with the Bragg peaks and how well adjacent spots are resolved. Values between 0.01 and 0.04 are typical, and it should NOT be necessary to use values above 0.04. Note that two values may be supplied to the TOLERANCE keyword. In this case the first value is used for profiles in the centre of the image (closest to the direct beam) and the second value for the outermost profiles. An interpolated value is used for other profiles. The default value for the innermost profile is 0.01. For very close spots this can be increased to 0.02 (rarely larger).

For the first round of this iteration, when optimising NRX and NRY, NC is set to zero. NRX and NRY are varied from 1 to a maximum value which would give a peak dimension of 5 pixels. NC is varied from the smaller of NRX and NRY up to the smaller of (NX-2) and (NY-2).

Two rounds of optimisation are performed, the second round using the results of the first.

The optimisation is first carried out on the average spot profile for the central region of the detector. The optimisation of the overall dimensions (NXS,NYS) is only carried out at this stage. The background rim parameters are optimised for ALL the standard profiles.

The background rim parameters are reoptimised for each new BLOCK of images.

If the optimisation of the standard profiles causes problems (because of unusual spot shapes with long tails or other features) it can be suppressed using keywords:

PROFILE NOOPTIMISE

In this case the measurement box parameters will still be optimised for the average spot profile using spots from the centre of the detector (this profile is NOT used for integration). This box will then be expanded automatically to allow for the increase in spot size due to obliquity of incidence on the detector, but the measurement box parameters will NOT be optimised for the standard profiles (one for each area of the detector) that are used for integration. To suppress the optimisation of the measurement box parameters altogether (so that the program will use the parameters supplied on the RASTER keyword), give the keywords:

PROFILE NOOPTIMISE ATALL FIXBOX

To just suppress the optimisation of the overall size of the measurement box (parameters NXS,NYS) include keywords:

PROFILE FIXBOX

6.3 Running a processing job

Normally at this stage one would go straight on to process the first BLOCK of images (See "6.3.2 Processing the first block of data" below).

6.3.1 Running MOSFLM interactively

The following keywords might be used for an interactive run. Before starting MOSFLM, it is convenient to edit them into a file (eg comm) to save typing them several times. Then at the mosflm prompt simply type:

@comm

and it will read the commands from the file:

TITLE Processing test data
! Crystal parameters
MATRIX oval_2_3.mat
SYMMETRY 96
MOSAIC 0.1
! image parameters
IDENT oval
PROCESS 1 TO 1 [ ANGLE 1.0 START 0.0 ]
DIRECTORY /scr0/andrew/
EXTENSION image
! beam parameters
DIVERGENCE 0.35 0.2
POLARISATION MIRRORS
! detector parameters
BEAM 89.33 90.10
BACKSTOP CENTRE 88.5 91 RADIUS 14
GAIN 1.2
! The following are read from the image header if not supplied on
! keywords for Mar, ADSC, RaxisIV and Mac Science. The phi values (on
! PROCESS keyword) will also be read from the header if not supplied.
DISTANCE 124.1
WAVELENGTH 1.542

The following are optional; if not given, the program will set suitable defaults.

!HKLOUT oval.mtz
!SEPARATION 1.3 1.3
!RASTER 17 17 9 3 3
!GENFILE oval_1to1.gen
!RESOLUTION 3.0
PLOT
RUN

*** NOTE WELL ***

If the data has been collected at a synchrotron source, the polarisation of the beam and the horizontal and vertical divergences (horizontal here means in the plane of the X-ray beam and the rotation axis) should be given. Values default to those for the SRS at Daresbury, UK.

The TITLE is written to the mtz and generate files.

MATRIX is the filename for the orientation matrix derived from a previous autoindexing or cell parameter post-refinement run. If you want to override the cell parameters or missetting angles in this matrix use the CELL and MISSETTS keywords respectively.

SYMMETRY gives the space group of the crystal, either as the name or the number in International Tables. Note that axial systematic absences (eg 0k0 with k odd in spacegroup P21) are measured by MOSFLM, so that the symmetry can be checked. Lattice absences however (due to face or body centring) will NOT be measured.

The IDENTifier (oval) is used as a template for the image file names, which have the form:

oval_003.image

for image number 3 for example. There is an absolute limit of 40 characters for IDENT. The extension (image in this case) is set using the EXTENSION keyword.

The template keyword is an alternative to the IDENT keyword. In this case:

TEMPLATE oval_###.image

would work identically.
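The relationship between IDENT, EXTENSION and TEMPLATE can be illustrated with a short Python sketch. The function names are hypothetical and this is not how MOSFLM itself builds filenames; it assumes the "#" characters form a single contiguous run and that the IDENT form always uses three digits, as in the example above.

```python
def ident_to_template(ident, extension):
    """Build the TEMPLATE equivalent of an IDENT/EXTENSION pair
    (three-digit image number assumed, as in oval_003.image)."""
    return f"{ident}_###.{extension}"


def image_filename(template, image_number):
    """Replace the run of '#' characters with the zero-padded number."""
    width = template.count("#")
    start = template.index("#")
    padded = str(image_number).zfill(width)
    return template[:start] + padded + template[start + width:]


template = ident_to_template("oval", "image")
print(template)
print(image_filename(template, 3))
```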

PROCESS gives the images to be generated, in this case from image 1 to image 1 (ie only one image is going to be examined), with a rotation angle of 1.0 degrees, starting at phi=0 (relative to the phi values given in autoindexing). Note that for Mar Research, ADSC, RaxisIV and Mac Science scanners, the phi values need not be given, they will be taken from the image header.

DIRECTORY gives the directory name where the images are stored (up to 10 different directories can be given on one or more DIRECTORY keywords).

EXTENSION defines the extension of the image filenames (default is "image" for Mar Research scanners, "osc" for R-axis, "ipf" for Mac Science, "cor" for ESRF CCD, and "image" for unrecognised detectors).

DIVERGENCE... if two values are given, they are the horizontal and vertical beam divergences (which will differ if a monochromator is used). Only a single number need be given for isotropic divergence. "Horizontal" in this context actually means in the plane containing the rotation axis and the X-ray beam, which is horizontal in the case of the Enraf Nonius oscillation camera and the Mar Research IP scanners, but vertical for standard Raxis machines.

POLARISATION specifies the polarisation of the X-ray beam, and can be given as PINHOLE or MIRRORS (both specify an unpolarised beam), MONOCHROMATOR (for a GRAPHITE monochromator) or SYNCHROTRON followed by the degree of polarisation (0.86 for SRS).

BEAM defines the direct beam position, in mm, relative to the position of the first pixel in the image. This can be determined by taking a wax image (or plasticine) and measuring the centre of the circles using the X-windows display.

Special note for Raxis II scanners: The definition of the detector coordinates in the R-axis software is different to that adopted in MOSFLM. Thus if the direct beam coordinates have been obtained from the R-axis software, then the X and Y coordinates must be interchanged.

BACKSTOP Defines the centre and radius of the backstop shadow. Reflections lying within this circle will not be integrated. The position and size of the shadow are best determined using the X-windows display.

GAIN defines the gain (adc units per X-ray photon) of the detector. This should be constant for a given detector (and a fixed wavelength). It is CRUCIAL to have a reasonable estimate of the gain, as many aspects of the program use standard deviations derived from counting statistics to determine acceptable spots for refinement, profiles etc. The correct value of the gain should give a BGRATIO of unity (the BGRATIO is printed as part of the MOSFLM output, as a function of intensity). The actual value of BGRATIO obtained can be used to get an improved estimate of the gain using the relationship:

true gain = estimated gain * (bgratio)**2

The gain is typically between 1 and 2, but can be as high as 5 for Raxis II detectors. See section 8.1 for a way of estimating the GAIN of your detector.
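As a worked example of the relationship above, the correction can be applied as follows. This is a minimal Python sketch with a hypothetical function name; the BGRATIO of 1.1 is an invented value for illustration, and which BGRATIO to take from the output (it is printed as a function of intensity) is left to the user.

```python
def improved_gain(estimated_gain, bgratio):
    """true gain = estimated gain * (bgratio)**2, as given in the text."""
    return estimated_gain * bgratio ** 2


# GAIN 1.2 was supplied and the output reported a BGRATIO of 1.1
# (illustrative value): the improved estimate is about 1.45.
print(round(improved_gain(1.2, 1.1), 3))

# A BGRATIO of exactly 1.0 leaves the gain unchanged.
print(improved_gain(1.2, 1.0))
```

The improved value would then be supplied on the GAIN keyword for the next run.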

DISTANCE is the crystal to detector distance in mm.

WAVELENGTH is the radiation wavelength (defaults to 1.5418).

HKLOUT gives the name of the output mtz file. This contains the Lp corrected intensities and standard deviations, with the reflection indices reduced to the asymmetric unit. This file can be used (after sorting) as input to the CCP4 programs SCALA (or ROTAVATA and AGROVATA). HKLOUT can also be given on the command line, but if both are given that specified by the keyword takes precedence. If not given, the filename is made up from the image identifier and the first image number, so in this case would be "oval_001.mtz".

SEPARATION (in mm in detector directions X and Y, ie horizontal and vertical for the Mar IP scanner) gives the minimum allowed separation of two spots before they are flagged as spatially overlapped (spots are treated as being ellipsoidal in shape, the numbers given being the full axial lengths in the X and Y directions). No attempt is made to integrate spatially overlapped reflections. If not given, the program will work out suitable values based on the spot size in the centre of the image. It will also determine if the "CLOSE" option needs to be used (see SEPARATION in the helpfile for more details). For spots that are not, in fact, completely resolved, the values determined by the program may be too conservative and lead to a very large number of spots being rejected as overlapped. In such cases, the SEPARATION should be defined explicitly.

RASTER gives the parameters of the measurement box (see Fig. 1 above). As explained in section 6.2.4 above, these values will be determined automatically if not supplied.

GENFILE specifies the generate filename. If not given, it will default to the MTZ filename, but with the suffix ".gen".

RESOLUTION: If not given, the resolution is set by the physical size of the detector. Both high and low resolution limits can be given. A "dynamic" high resolution limit, which depends on the mean I/sig(I), can be set using the CUTOFF keyword. A specific resolution range can also be excluded (eg to eliminate an ice ring) with the EXCLUDE keyword.

PLOT invokes the X-window interactive graphics option.

RUN will start the processing.

The image will be displayed with the predicted pattern overlaid. Fully recorded reflections are displayed as blue boxes, partials as yellow boxes, spatial overlaps as red boxes and reflections rejected as being too wide in phi as green boxes. Clicking (left mouse button) on the centre of a box will result in the reflection indices, phi value and phi width being displayed in the "Output" window. If examining the image after integration, the profile fitted intensity and standard deviation will also be given. See below for more details on examining the image after integration.

The image should be examined carefully to ensure that the predicted pattern actually matches what is on the image. If it does not, then any of the relevant parameters (cell dimensions, missetting angles, beam divergence, etc etc) can be adjusted and the pattern re-predicted (use the predict menu item). A brief description of the various menu options is given below:

Predict If the cell dimensions or any other parameter in the "Display Parameters" window is changed, selecting this menu item will re-calculate the predicted reflections and display them.

Clear prediction This will delete the predicted pattern. It can be restored by choosing the "Predict" menu item.

Adjust If there is an error in the beam coordinates or camera constants the calculated pattern will be displaced relative to the observed pattern. The "Adjust" option allows this to be corrected. The mouse is used to input the calculated and the observed positions of two spots. From this the program calculates the shift, rotation and scale factor required to superimpose these two spots. The values of the shifts required are given and the user given the choice of accepting the transformation or not.

Auto-refine This will refine the crystal missetting angles using the AUTOMATCH option (see help library for more details), which essentially adjusts the missetting angles to try to optimise the fit of the predicted to the observed pattern. This can converge from initial errors of 1-2 degrees, but the final parameters are NOT as accurate as those obtained from post-refinement. With the use of REFIX, this option should not normally be necessary. Note that it will use the default parameters for the refinement (eg only using data to 6A). You may wish to modify these using the appropriate keywords...see help library documentation. After refinement it will go on to measure the image. **** THIS OPTION IS NO LONGER SUPPORTED ****

Integrate This will close down the display and measure the image in the normal way. You are given the option of displaying the image again after positional refinement has been carried out, or after the integration has been carried out. (See below)

Find hkl If this menu item is activated, the user is prompted for the hkl indices of the spot to be found. A blue cross is drawn at the position of the spot. A warning is given if the spot does not lie in the displayed part of the image. This can be useful in identifying bad spots.

Read spot list This allows a list of spots produced by using the "save spots" option to be displayed superimposed on the image. It is essentially obsolete now. Note that ONLY spots from a single image (ie up to the first terminator (-99 or -999) in the spots file) will be displayed.

Pick This will display the actual pixel values in a box around the cursor position (the size of the box can be set in the "Display Parameters" window).

Measure Cell Allows measurement of cell parameters, but the distance and wavelength must be set correctly.

Circles Puts up resolution circles

Fit circles Will determine the centre of a circle defined by clicking with the mouse at a series of points lying on the circle. After selecting "Fit circles", select a series of points with the mouse and then select "Fit points". The rms fit, radius and centre of the circle will be given. The user has the option to update the direct beam coordinates to the circle centre. Used to determine the direct beam position from wax rings.

EXIT Closes down the display. If the image was being displayed with the IMAGE keyword, the mosflm prompt will return. If all keywords for processing are given, then MOSFLM will proceed to measure the image(s).
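
The calculation behind the "Adjust" menu item described above can be sketched as follows. Two pairs of (calculated, observed) positions are enough to determine a shift, rotation and scale factor exactly. This Python sketch treats each position as a complex number; the function names are hypothetical, and the rotation is taken about the coordinate origin, which may not match the convention MOSFLM uses internally.

```python
import cmath
import math

def similarity_from_two_spots(calc1, calc2, obs1, obs2):
    """Return (scale, rotation_deg, shift_x, shift_y) mapping calc -> obs.

    Points are (x, y) tuples. Treating each point as a complex number z,
    a shift + rotation + scale is z' = a*z + b, and two point pairs
    determine a and b exactly.
    """
    c1, c2 = complex(*calc1), complex(*calc2)
    o1, o2 = complex(*obs1), complex(*obs2)
    if c1 == c2:
        raise ValueError("the two calculated positions must differ")
    a = (o2 - o1) / (c2 - c1)   # combined rotation and scale
    b = o1 - a * c1             # shift
    return abs(a), math.degrees(cmath.phase(a)), b.real, b.imag

def apply_transform(scale, rotation_deg, shift_x, shift_y, point):
    """Apply the transformation to one (x, y) position."""
    a = cmath.rect(scale, math.radians(rotation_deg))
    z = a * complex(*point) + complex(shift_x, shift_y)
    return z.real, z.imag
```

Choosing two spots that are far apart on the image makes the rotation and scale less sensitive to small errors in the mouse positions.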

Examining the image after integration

There is the option to update the display after integration (at the bottom of the "Processing Parameters" window, this is one of a number of On/Off or Yes/No toggles). If this option is chosen, once the profiles have been determined and the image integrated, the image plus the predicted pattern will be displayed again. Note that there is an overhead associated with this, because the image will have to be read into memory again (unless only a single image is being processed).

If there are any "bad spots" (ie poorly measured reflections) on the image, a window will be displayed which gives the user the option to examine and/or edit the bad spots. If this option is selected, the image will be displayed with a new menu option "Bad spots". Bad spots can then be reclassified as acceptable and other (accepted) spots can be reclassified as rejected.

In addition to the predicted reflections, vectors will be drawn indicating the difference between the predicted and observed spot positions for FULLY RECORDED reflections (these vectors are in red; it may be necessary to use the "Clear prediction" menu option to see them clearly). The vectors can be scaled using the "Vector scale" menu item in the "Processing Parameters" window. A minimum intensity threshold for the display of these vectors can also be set using the "Threshold" menu item.

The vectors will of course be longer for weak spots than strong ones, but for all reflections the direction of these vectors should be random. If this is not the case, it suggests errors in crystal cell parameters or orientation, or misclassification of fully recorded/partially recorded reflections, or the existence of spatial distortion which is not being correctly modelled.
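
One simple way to put a number on "the direction of these vectors should be random" is the mean resultant length used in circular statistics. The Python sketch below is an illustration only: MOSFLM does not, as far as this guide states, report such a value, and the function name and input format (a list of (dx, dy) differences) are hypothetical.

```python
import math

def mean_resultant_length(vectors):
    """Mean resultant length of the directions of (dx, dy) vectors.

    Close to 0 when directions are spread evenly, close to 1 when the
    vectors mostly point the same way. Zero-length vectors are skipped.
    """
    sx = sy = 0.0
    n = 0
    for dx, dy in vectors:
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue
        sx += dx / length
        sy += dy / length
        n += 1
    if n == 0:
        raise ValueError("no non-zero vectors supplied")
    return math.hypot(sx, sy) / n
```

A value near zero does not rule out a radial or tangential pattern, since such vectors cancel when summed over the whole image; those patterns are best judged by eye on the display.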

"Badspots" will be indicated as red crosses, rejected reflections as blue crosses. Rejected reflections will normally be those containing a zero pixel value (because the measurement box extends outside the scanned area of the image plate); these are NOT classified as "badspots". They may also arise if a very large number of background pixels have been rejected.

Overloaded reflections will be indicated as green crosses. Note that you MUST include keywords PROFILE OVERLOAD in order to estimate the intensities of overloaded reflections by profile fitting.

By clicking on a reflection, the LP corrected profile fitted intensity and standard deviation will be given in the output window.


6.3.2 Processing the first block of data (Non-interactively)

It is usually advisable to process a block of (say) 10 degrees of data prior to processing the complete dataset (as this is quite time consuming) just to check that the processing is satisfactory.

The commands for MOSFLM might now be:

TITLE Processing test data
! Crystal parameters
MATRIX oval_2_3.mat
SYMMETRY 96
MOSAIC 0.1
! image parameters
IDENT oval
PROCESS 1 TO 10 [ ANGLE 1.0 START 0.0 ]
DIRECTORY /scr0/andrew/
EXTENSION image
! beam parameters
DIVERGENCE 0.35 0.2
POLARISATION MIRRORS
! detector parameters
BEAM 89.33 90.10
BACKSTOP CENTRE 88.5 91 RADIUS 14
GAIN 1.2
DISTORTION ROFF 0.3 TOFF 0.1
! The following are read from the image header if not supplied on
! keywords for Mar, ADSC, RaxisIV and Mac Science. The phi values (on
! PROCESS keyword) will also be read from the header if not supplied.
DISTANCE 124.1
WAVELENGTH 1.542
!The following are optional, if not given, the program will set
!suitable defaults.
!HKLOUT oval.mtz
!SEPARATION 1.3 1.3
!RASTER 17 17 9 3 3
!GENFILE oval_1to1.gen
!RESOLUTION 3.0
PLOT
RUN

See above (6.3.1 Running MOSFLM interactively) for a description of each of the keywords. The only difference from the commands described above is that the PROCESS keyword has been set up to process the first 10 images. The ADD subkeyword on the PROCESS line (not shown in the example above) specifies the offset added to the image number to give the batch number on the output MTZ file; with an offset of 1000 the batch numbers are (1000+image number) (ie 1001-1010 in this example).
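
The batch numbering described above is simple arithmetic, written out here as a short Python sketch. The function name is hypothetical; the offset of 1000 is the value quoted in the text.

```python
def batch_numbers(first_image, last_image, add=0):
    """Batch number for each image: image number plus the ADD offset."""
    return [image + add for image in range(first_image, last_image + 1)]

if __name__ == "__main__":
    # Images 1 to 10 with an offset of 1000, as in the text.
    print(batch_numbers(1, 10, add=1000))
```

Using a different offset for each crystal or pass keeps the batch numbers unique when several MTZ files are later combined.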

DISTORTION specifies the ROFF and TOFF values for this scanner. Note that it is now possible to specify the radially dependent radial and tangential offsets using sub-keywords RDROFF, RDTOFF respectively (See Appendix IV). It is not normally necessary to specify these values unless they are large (greater than 0.3).

The program will then form the standard profiles by summing reflections over the first block of images (1 to 5) and print the resulting profiles. The number of images in a block is set by the program, but may be set explicitly with the PROFILE BLOCK keywords.

6.3.2.1 Formation of the Standard Profiles

The standard profiles are determined in a number of areas across the detector. By default the detector is divided into 9 regions for data to a resolution lower than 2.5A and 25 regions of which only 22 lie within the active area of a circular detector for resolution higher than 2.5A. Alternatively the user can define the set of lines parallel to the detector X and Y axes which define the standard areas. This is done with keyword PROFILE XLINES... YLINES...

EXAMPLE:

PROFILE XLINES 0 45 90 135 180 YLINES 0 45 90 135 180

will divide a 180x180mm detector into 16 areas. See the MOSFLM help file for more details.

A separate standard profile is evaluated for each of these areas.
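
To make the count of 16 areas in the example above explicit, this Python sketch lists the rectangles bounded by consecutive XLINES and YLINES values. It is only an illustration of the geometry; the function name and the tuple layout are hypothetical.

```python
def profile_areas(xlines, ylines):
    """Return the rectangular areas bounded by consecutive lines.

    Each area is (x_min, x_max, y_min, y_max) in mm.
    """
    xs, ys = sorted(xlines), sorted(ylines)
    return [
        (x0, x1, y0, y1)
        for x0, x1 in zip(xs, xs[1:])
        for y0, y1 in zip(ys, ys[1:])
    ]

if __name__ == "__main__":
    lines = [0, 45, 90, 135, 180]
    areas = profile_areas(lines, lines)
    print(len(areas), "areas; first:", areas[0], "last:", areas[-1])
```

Five lines in each direction bound four intervals, so the example gives 4 x 4 = 16 areas.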

The program prints out some statistics on the standard profiles, followed by statistics on any profiles that it has averaged, and then a representation of each of the standard profiles using a single character (0-9, then A-Z) to represent the value at each pixel (a [ denotes a negative value). In these representations, a minus sign denotes the background region, and a * denotes rejected pixels. Background pixels which are overlapped by the peak regions of neighbouring spots are automatically rejected by the program. It will also warn you if the peak regions of neighbouring spots overlap.

6.3.2.2 Inadequate profiles

Each standard profile has to satisfy two criteria before it is considered acceptable. There must be at least ten contributing reflections, and the rms variation in the background plane (after rejecting outliers) must be less than 10 (after scaling the profile to a maximum value of 255). If a profile fails to pass either test, then it is averaged using the profiles from neighbouring areas on the detector. Profile averaging should be avoided if at all possible. The averaging inevitably produces a profile that is less broad than the original profile because it is dominated by the stronger, lower resolution data. Look at the printed profiles before and after averaging to confirm this.

Accumulating the profiles over a BLOCK of say 10 images (see below) should help provide a sufficient number of reflections, but it is unwise to accumulate over too wide a phi range because this will average out any genuine variation in the profiles with phi (eg due to a change in effective diffracting volume). Both rejection criteria can be changed using subkeywords (NREF for number of reflections, RMSBG for rms variation in background, on the PROFILE keyword line) and it is usually better to avoid averaging by changing these criteria if necessary:

PROFILE RMSBG 20 NREF 5
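
The two acceptance tests, and the effect of relaxing them as in the PROFILE line above, can be summarised in a few lines of Python. This is a sketch of the rule as stated in this section, not of the program's actual code; the function and argument names are hypothetical, with defaults set to the thresholds quoted in the text.

```python
def profile_is_acceptable(n_reflections, rms_background,
                          nref=10, rmsbg=10.0):
    """Apply the two acceptance tests described in the text.

    nref and rmsbg mirror the NREF and RMSBG subkeywords; the defaults
    are the thresholds quoted in this section.
    """
    return n_reflections >= nref and rms_background < rmsbg

if __name__ == "__main__":
    # A profile with 7 reflections and an rms background of 15:
    print(profile_is_acceptable(7, 15.0))                      # default limits
    print(profile_is_acceptable(7, 15.0, nref=5, rmsbg=20.0))  # relaxed limits
```

With the default limits such a profile would be averaged with its neighbours; with RMSBG 20 and NREF 5 it would be used as it stands.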

6.3.2.3 Spots running into each other

The program will give a warning if it detects that the peak areas of adjacent spots overlap. There are two possible ways around this:

1) Increase the SEPARATION parameters. Making the separation significantly smaller than the actual spot size in the centre of the image can lead to serious problems and is NOT recommended.

2) The actual spot size that the program works with when testing for peak overlaps (after rejecting those that are too close as determined by the SEPARATION parameters) is determined by the "profile optimisation", that is, when the program works out the best values of the measurement box parameters NC, NRX, NRY, which is done independently for each of the standard profiles. If there is significant diffuse scatter on the image, the "optimised" raster parameters may well produce a peak area that is actually slightly broader than the true Bragg peak and includes part of the "diffuse" peak. This can be checked by examination of the standard profiles: if the peak area contains many pixels with values of 0 or 1 then it suggests the peak is too broad. This effect can be overcome by increasing the TOLERANCE parameter (See the help library for more details on how the optimum parameters are derived and the effect of the TOLERANCE parameter). The default value for this parameter is 1% ie 0.01. Increasing it (try steps of 0.005, but it should not be necessary to go above 0.04) will result in a reduction of the optimised peak size. It is up to you to decide what the optimum value is on the basis of the appearance of individual spots in the image.
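
The range of TOLERANCE values suggested above (from the default of 0.01 upwards in steps of 0.005, not exceeding 0.04) can be listed with the Python sketch below. The function name is hypothetical, and the sketch only enumerates trial values; each one still has to be tried in MOSFLM and judged from the printed profiles and the image.

```python
def tolerance_steps(start=0.010, stop=0.040, step=0.005):
    """Candidate TOLERANCE values from start to stop inclusive.

    Works in integer thousandths to avoid floating-point drift.
    """
    first, last, inc = (round(v * 1000) for v in (start, stop, step))
    return [k / 1000 for k in range(first, last + 1, inc)]

if __name__ == "__main__":
    for value in tolerance_steps():
        print(f"{value:.3f}")
```

Working upwards one step at a time and stopping at the first value that removes the overlap warning avoids shrinking the peak area more than necessary.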

6.3.2.4 Output Statistics

The other statistics produced are for general information, and are described in the MOSFLM help library under "Output". Probably the most useful is the breakdown of I/sig(I) as a function of resolution. This will give an immediate idea of the quality of the data...particularly at the high resolution end. For guidance, a mean I/sig(I) of 3.0 will give an R-merge of between 20% and 30% in SCALA. If there are symmetry related fully recorded (or summed partial) reflections on a single image, statistics are also provided on the agreement between their intensities.

Check the following in this job:

1) Check the standard profiles look OK (ie the peak is within the peak region).

2) Check the weighted residual is about 1.0.

3) CHECK FOR WARNING MESSAGES. These are given at the end of the logfile and in the summary file. They will point out possible problems and suggest a way around them.


6.3.3 Finally, Processing the dataset

The whole philosophy of MOSFLM is to allow the entire dataset (or all images obtained from a single crystal) to be processed in a single job. To make this possible, the crystal orientation can be refined continuously for every image, to take account of possible crystal slippage, and the cell parameters can also be refined if the initial estimates are not accurate. An accuracy of 1 part in 1000 or better is required for optimal processing of high resolution data.

There are a large number of adjustable parameters within MOSFLM, but considerable effort has gone into making the program select an appropriate value for these parameters. The program defaults should therefore ALWAYS be used unless there is a very specific reason for changing parameters (eg if it is suggested in the warning messages in the summary or logfile).

See section 9.3 for a complete example command file.

