Richards Center at Yale University |
Last Modified: Thursday, 15-Jun-2017 18:52:15 EDT
CSB Policy on Batch Queues
The CSB policy on batch queues and compute-intensive jobs requires that batch queues be used in many instances. See the Batch Queues and Compute-Intensive Jobs page of the Policies and Practices section.
What are Batch Queues
Batch queues are implemented in the CSB Core through the Sun Grid Engine (SGE). Batch queues allow one or more computer programs to be combined into a job, which is executed on one of several computers. Batch queues provide the following features:
Four types of queue are available based on maximum clock time, and one queue based on type of job (GPU-enabled cryo-EM applications). The resources short, medium, long and sponge specify that your job will take no more than 4 hours, 24 hours, 4 days and 28 days respectively on all servers. On a given host, short jobs will be allocated more time than medium jobs, which will get more time than long jobs. Sponge jobs, however, will generally get no time if any other category is also running. The queuing system will kill jobs which exceed the stated time limit.
There is also a separate 7-day "cryo" queue for running on a multi-GPU server. This is primarily for cryo-EM jobs (hence the name), but molecular dynamics will soon be available as well.
To submit a job named test.com to a medium queue on the most-available computer, you type:
qsub -q medium test.com
For other examples, see the section, Common SGE Commands.
If you incorrectly specify the resource(s) that you need, SGE will complain. For example, if you type, qsub -q med ... instead of qsub -q medium ..., your job submission will fail with an error message.
Usually, you will want to run your job in the queue with the shortest time limit that still covers its run time, since shorter queues get higher priority. For example, if you know your job will take about 1 day, you will want to run it in a long queue, because a job in a medium queue is killed once it exceeds 24 hours. The simplest thing is to use the command,
qsub -q long (jobname)
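The choice of queue can be sketched as a small shell helper. This is only an illustration, not part of SGE: the function name pick_queue is hypothetical, and the limits are the ones stated above (4 hours, 24 hours, 4 days = 96 hours, 28 days = 672 hours), which may not match the live configuration.

```shell
# Hypothetical helper: print the shortest queue whose time limit
# covers an estimated run time, given in whole hours.
# Limits are taken from the text: short 4 h, medium 24 h,
# long 4 days (96 h), sponge 28 days (672 h).
pick_queue() {
    hours=$1
    if [ "$hours" -le 4 ]; then
        echo short
    elif [ "$hours" -le 24 ]; then
        echo medium
    elif [ "$hours" -le 96 ]; then
        echo long
    elif [ "$hours" -le 672 ]; then
        echo sponge
    else
        echo "no queue allows $hours hours" >&2
        return 1
    fi
}

# Example: a job estimated at 30 hours does not fit in medium.
pick_queue 30
```

Because jobs that exceed the limit are killed, it is safer to round the estimate up before calling the helper, for instance qsub -q "$(pick_queue 30)" test.com for a job expected to take about a day.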
At this writing (2/2011), the SGE system will select which computer will run your job based on current job load, with the more powerful systems given greater weight. If this is a problem, see Miscellaneous Information below.
These queues are available, as of 15 June 2017. However, the
configuration changes faster than the documentation. For a list of
active queues, use the command, qstat -f.
Host | Type | #CPUs | Mem (GB) | short (4 hr) | medium (24 hrs) | long (4 days) | sponge (28 days) | cryo (7 days) |
crunch6 | linux | 12 | 64 | X | X | X | X | |
cryocrunch | linux | 16 | 64 | | | | | X |
1. Create an SGE script file for your job. This is simply a script for your favorite shell, containing the commands that you want to execute in batch. Unless you specifically include a line to "cd" to a particular directory at the beginning of your script, your job will be run from your home directory, so define your file paths accordingly.
2. Submit your job to the appropriate queue, using the qsub command.
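The two steps above might look like the following sketch. It assumes a Bourne-shell job script; the file name test.com follows the examples on this page, while the directory $HOME/myproject and the echo command are hypothetical placeholders. The lines beginning with "#$" carry qsub options (-S, -q, -o, -e) inside the script instead of on the command line.

```shell
# Step 1: write a job script. The quoted 'EOF' keeps the shell from
# expanding $HOME and $(hostname) while the file is being written.
cat > test.com <<'EOF'
#!/bin/sh
# Lines starting with "#$" are read by SGE as qsub options.
#$ -S /bin/sh
#$ -q medium
#$ -o test.out
#$ -e test.err

# Jobs start in the home directory, so change to the working directory first.
# $HOME/myproject is a hypothetical path.
cd "$HOME/myproject" || exit 1

# Placeholder for the real work.
echo "running on $(hostname)"
EOF

# Step 2: submit the script, if qsub is available on this machine.
if command -v qsub >/dev/null 2>&1; then
    qsub test.com
else
    echo "qsub not found; wrote test.com without submitting"
fi
```

With the options embedded this way, the queue and output files do not need to be repeated on the qsub command line.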
qsub -q medium test.com
qstat -f -q medium
qdel 67
qsub -q medium test.com
qsub -e test.err -o test.out -q short-crunch3 test.com
See the SGE documentation for a list of all possible options. Generally, options can be included on the command line, or in your command file, as shown in the example in the section on Setting up a SGE Job.
qstat
qstat -f -q long
Each SGE job has a unique job id. The job id is reported when you submit the job, and can be discovered at any time using the command qstat. To delete a job, use the command
qdel jobid
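If you want to capture the job id in a script, one possible approach is to parse the confirmation message that qsub prints. The sketch below assumes the message has the form 'Your job 67 ("test.com") has been submitted'; check the actual output on your system before relying on it. The function name job_id_from_message is hypothetical.

```shell
# Hypothetical helper: pull the numeric job id out of a qsub
# confirmation message. Prints nothing if the message does not match
# the assumed "Your job <id> ..." format.
job_id_from_message() {
    printf '%s\n' "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Sample message in the assumed format, using the job number from the
# qdel example elsewhere on this page.
msg='Your job 67 ("test.com") has been submitted'
job_id_from_message "$msg"
```

The result could then be passed to qdel, for example qdel "$(job_id_from_message "$msg")".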
With the current SGE setup, it is no longer possible to select a particular computer in the batch queuing system for running your job. Since all of the batch computers are supposed to be configured the same, this should not normally be a problem. However, if you find that your job doesn't run when submitted to a particular machine, contact the Core Staff for assistance.
The documentation that came with the SGE system is available in PDF format. SGE is a large and complex system, of which we use only a small part, so the full document set can be overwhelming.
A User Guide describes the concepts and workings of SGE.
The Installation and Administration manuals might be of some help to the staff.
Documentation for the old DQS software is also available: User Guide, Reference, and Installation/Maintenance.
Richards Center (www.rc.yale.edu) at
Yale University (www.yale.edu)
Contact: michael^strickler_at_yale^edu |