Copyright notices

The wARP idea has been developed by Anastassis Perrakis, Titia K. Sixma, Keith S. Wilson and Victor Lamzin, and is described in "wARP: Improvement and extension of initial crystallographic phases by weighted averaging of multiple refined dummy models", submitted for publication.

- ARP and rename_wat code is written by Victor Lamzin.
- wARP scripts are written by Anastassis Perrakis.

When to use wARP

wARP is usable when an initial map is available from MIR, MAD, SIRAS or whatever. This map should be of good enough quality to recognize at least parts of the structure. wARP will result in various degrees of improvement, depending on the resolution of the data and the quality of the initial map. In most cases it will give a map of excellent quality, which will speed up model building and will also enable some automatic interpretation programs to work much better.

Remember: your NATIVE data should extend to a resolution of at least *** Å. Or, to put it in a better way: you should have at least *** observations for every expected atom in the protein, including waters.

Installation

You need the CCP4 programs and ARP running. ARP must be obtained from victor@embl-hamburg.de. You also need the "dmoleman" program from the Rave suite by Gerard Kleywegt, available from the O WWW site. If you are reading this, you have most likely already unzipped and untarred the wARP.tar.gz file. For the moment you need Irix *** to run wARP. The only thing to do now is to add these two lines to your setup file (.cshrc):

    set path = ( $path your_warp_directory )
    setenv warpbin your_warp_directory

Running wARP

First, create a directory where you plan to run wARP and follow the steps below.

Getting the initial map

First, you need to have a map on a fine grid (*** Å), covering a whole asymmetric unit as defined in the ARP documentation. To calculate this map you should use your best phase set. If your experimental phases extend to worse than *** Å resolution, you should apply some phase extension
(with, e.g., DM) to that resolution (*** Å). Do NOT extend the phases too much, i.e. from *** Å to *** Å (if you have native data to *** Å, that is). However, if you have good experimental phases to, say, *** Å, you might consider extending by ******* Å.

The map should be in CCP4 format. I recommend always running "extend" on the map to bring it to the correct asymmetric unit after fft: sometimes the last section is missing, and this will cause ARP to complain and stop execution. Name that file arp_in.map.

Getting the "seed" file

Take a helix or so from any protein (around *** atoms) and place it approximately in the core (centre of gravity) of your MIR map, in a protein region; do not bother to fit it exactly in density. Save that file in PDB format, without any REMARKS, SCALE cards or whatsoever. Here is an example:

    ATOM *OW*WAT A ******* ****** ****** **** **** *
    ATOM *OW*WAT A ******* ***** ****** **** **** *
    ATOM *OW*WAT A ******* ****** ****** **** **** *
    ATOM *OW*WAT A ******* ****** ****** **** ***** *
    ATOM *OW*WAT A ****** ****** ****** **** **** *
    ATOM *OW*WAT A ******* ***** ****** **** **** *
    ATOM *OW*WAT A ******* ****** ****** **** ***** *
    ATOM *OW*WAT A ****** ****** ****** **** ***** *
    ATOM *OW*WAT A ****** ****** ****** **** ***** *
    ATOM ** OW*WAT A ** ****** ****** ****** **** ***** *
    ATOM ** OW*WAT A ** ***** ****** ****** **** ***** *
    ATOM ** OW*WAT A ** ****** ***** ****** **** ***** *
    ATOM ** OW*WAT A ** ****** ****** ****** **** ***** *

Name that file seed.brk.

Setting wARP parameters

Copy the file $warpbin/warp.par to the current directory. It is quite obvious what to edit there. Take a quick look at the example:

    #
    # System parameters
    #
    # CCP4 setup command, like:
    #
    set ccp4init = "source /nfs/home/perrakis/psp/auto/ccp4.setup"
    #
    #set machine* = den
    set machine* = olijf
    set machine* = berk
    set machine* = spar
    set machine* = den
    set machine* = linde
    #
    ##########################################################
    #
    # Additional required input:
    #
    # seed.brk   : Initial atoms, NO header
    # arp_in.map : Input map on fine grid
    #              (use extend to extend it to the ARP asymmetric unit)
    #
    # Unit cell
    #
    set cell = "******** ****** ****** ***** ****** ******"
    #
    # Symmetry
    set sym = ***
    #
    # Space group for sfall if LS is used.
    # Data should be extended to that spacegroup
    set sfsg = "P*"
    #
    # Mtz file with native data
    set data = "/nfs/home/perrakis/psp/psp*nat**.mtz"
    #
    # Resolution limits
    set resol = "**** ****"
    #
    # Grids for SF and FFT
    set grid = "**** *** ****"
    #
    # Asym. unit limits compatible with arp
    set xyzlim = "** *** * *** * ****"
    #
    # B factor from Wilson plot
    set wilsonb = ****
    #
    # Protein size in atoms (NO waters)
    set proteinsize = ****
    #
    # Refinement method
    # Uncomment for Max Likelihood
    set method = "ML"
    # Uncomment for Least Squares
    #set method = "LS"
    #
    # Cycles for initial 3 models and then for all 6
    set firstref = **
    set secondref = **
    set thirdref = **

Some considerations are:

- Usually CCP4 tries to define the environment variable MANPATH as complementary to the existing MANPATH. During execution of remote shells MANPATH does not exist, and this crashes remote scripts. Copy the ccp4.setup file to a local directory, simply remove the line "setenv MANPATH", and then set ccp4init to that file.

- Another annoying habit in CCP4 is checking whether a file exists and, if it does, not overwriting it. Please change the line "setenv CCP4_OPEN NEW" to "setenv CCP4_OPEN UNKNOWN". I normally do some checking, but after a crash it is quite likely that you will start getting errors, so play it safe.

- Well, how many cycles to run? If you have time and * machines, run ***** * **. If not, before decreasing cycles, edit warp_all.sh, comment out the command running the script refine* and set thirdref in warp.par to **. Or find some other logical way to compensate.

- wARP can be run in a parallel manner, by submitting different jobs on different machines. You have to set up the machine names of your cluster. All of them must share common disk space, have you as a user and be specified in the .rhosts file, i.e. you must be able to use them without logging in
with a password (i.e., in principle, if you say rsh othermachine "echo OK" you should get an OK on your screen). Ask the system manager for help. Machine names can be duplicated, triplicated or whatever. Obviously you should use the most powerful (i.e. multiprocessor) machine more than once if * machines are not available. It is also better to keep ***** as separate as possible and, if necessary to create redundancy, to couple *** *** and ***.

- Which refinement method to use: normal least squares (LS) or maximum likelihood (ML)? My hint is: if you have good data to around *** Å (more than ** observations per atom), use ML. If not, use LS. If in doubt, try both and check the R factors.

- Use your low resolution reflections (*** Å or so) if you plan to use ML. The default is to USE bulk solvent scaling; if low resolution data are missing, it will crash. If you want ML but have no good low resolution data, edit the file $warpbin/mlarp.com and change "BULK" to "SIMPLE".

- If you use LS, extend your data to a suitable space group with cad. During averaging wARP will extend them to P1 anyhow (a bit stupid, I admit, but the only way to make it easily space group independent).

Running and Checking log files

This should be easy. All you have to do is run warp_all.sh. Briefly, what it does:

1. Puts lots and lots of atoms in the initial map density, according to certain density and geometric criteria.
2. From these creates three models.
3. Refines these models for firstref cycles.
4. Shakes these three models and resets B factors in order to create three new models.
5. Refines all six models for secondref cycles.
6. Rejects in one round lots of atoms in low density, then slightly shakes the models, resets B factors and finally refines all six models for thirdref cycles.
7. Does the averaging and calculates a map.
8. Cleans up all intermediate files, saves the essential ones and then summarizes the log files. Everything is stored in the directory log_files.

Please note:

- Most important, maybe, is that warp_all.sh must be run in the foreground. csh does not
allow rsh commands to be issued in the background, which you absolutely need (unless you are willing to spend * times more time or you have a single machine). A bit inconvenient, but you had better live with it.

- In case of a crash at some stage, keep in mind that warp_all.sh executes * other scripts: build, refine*, refine*, cleanup, average. If you wish to start from "the middle" of the procedure you can call these scripts directly. If, for example, one machine goes down at cycle ** or so and everything crashes before the end, you can just run refine* and then average; you do not have to build and refine* again. A good trick to save more time: if the crash was at cycle ** (i.e. in ALL * dirs you * have junk ***), edit warp.par and set firstref to ** (NO, this is NOT a typo, please do not ask why) and secondref to as many cycles as you wish.

- Examine the summaries of the arp log files. If you suspect that the R factors can go lower, you should run a few more cycles. You can do that the same way as before OR shake the models a bit first. I strongly recommend the latter. All you have to do is set firstref to an arbitrary number, say ****, and endref to the number of cycles you wish to run. Then run warp_more.sh. This will copy your latest refined model files from the directory log_files to new refinement directories, run the refinement, and do the averaging and cleanup. Note that in log_files your *final.brk files are now the ones after the additional cycles, and the arp summary log files also refer to the last cycles.

- R factors should converge to ***** for successful completion of the job. If refinement does not converge, try switching to LS (or ML) and rerun. If still not successful, grow better crystals or get new derivatives.

More extended log files and intermediate pdb files will be stored in the subdirectory log_files. All other files created during execution will be deleted. The warp scripts should do everything for you, and the result is a single CCP4 format map file: warp_FoFOM.map. Display it in the graphics and
have fun!

Sample timings

I use our cluster here, with * machines, for testing. One machine has to run two jobs. Four are Indys with R**** *** MHz processors and ** MBytes of memory, except one which has *** MBytes and which I use for running both the "main" job and the "double" job. One is an Indigo with a slower *** MHz processor. The protein is the same as in the example warp.par file: around **** atoms refined against more or less **** reflections, refined for ** * ** cycles. Here are some characteristic files, with their creation time next to them, so you can see how long things take:

NOT YET
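
As a companion to the rule of thumb under "When to use wARP" (a minimum number of observations for every expected atom, waters included), here is a minimal Python sketch of that check. It is not part of wARP: the function names and the example numbers are made up for illustration, and the minimum ratio is left as an argument you must supply from the guideline above.

```python
def observations_per_atom(n_reflections, n_protein_atoms, n_waters=0):
    """Unique reflections divided by expected atoms (protein plus waters)."""
    n_atoms = n_protein_atoms + n_waters
    if n_atoms <= 0:
        raise ValueError("expected atom count must be positive")
    return n_reflections / n_atoms


def enough_data(n_reflections, n_protein_atoms, n_waters, min_ratio):
    """True if the data reach the caller-supplied observations-per-atom minimum.

    min_ratio is deliberately not hard-coded: take it from the guideline
    in the text.
    """
    ratio = observations_per_atom(n_reflections, n_protein_atoms, n_waters)
    return ratio >= min_ratio


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    ratio = observations_per_atom(30000, 2000, 300)
    print(f"{ratio:.1f} observations per atom")
```

The same ratio can be reused when choosing between ML and LS, since that hint is also phrased in observations per atom.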