The following instructions are written for the Sherrill group and collaborators. The job classes mentioned here are only accessible to users in group sherrill.
The SP2 uses a batch queue system called LoadLeveler to schedule which jobs run on which nodes, and when. You must tell LoadLeveler what kind of calculation you are running, in addition to the script or program to execute. Each node has only 4 processors and limited memory and disk, so it is critical that the jobs scheduled on a node never need more processors, memory, or disk than the node actually has. LoadLeveler cannot know how much disk, memory, etc. your job will need, so it is up to you to provide that information.
Each job needs an input file (as usual) and a special LoadLeveler file (I call them .cmd files). Examples of .cmd files are on cgate in ~sherrill/llscripts.
Submission directions:
To check jobs graphically:
Job classes: When submitting jobs, use the appropriate class. This helps the machine keep running at maximum efficiency.
| Job Class | Time limit | Memory limit | Comments |
|---|---|---|---|
| sinter | 30 minutes | 1GB RAM | Very small calculations |
| squick | 1 day | 1GB RAM | Small calculations |
| scpu | 14 days | 1GB RAM | Typical for Q-Chem |
| sio | 14 days | 1GB RAM | Heavy I/O (e.g., ACES II) |
| sram | 14 days | 4GB RAM | Heavy memory (should use not_shared if near 4GB) |
| slong | 90 days | 4GB RAM | Very long calculations (use not_shared if near 4GB) |
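As a rough illustration of how the limits in the table translate into a choice of class, the following shell function picks the smallest class that fits a requested run time and memory. The name `pick_class` and its arguments are hypothetical and not part of LoadLeveler. The sketch assumes that 1GB means 1000 MB and 4GB means 4000 MB, and it never returns sio, since that class is chosen for heavy I/O rather than by time or memory.

```shell
#!/bin/sh
# pick_class MINUTES MEMORY_MB  (hypothetical helper, not a LoadLeveler command)
# Prints the smallest job class from the table that fits the request.
# Assumes 1GB = 1000 MB and 4GB = 4000 MB. The sio class is never chosen
# here because it is selected for heavy I/O, not by time or memory.
pick_class() {
    pc_minutes=$1
    pc_mem_mb=$2
    if [ "$pc_mem_mb" -le 1000 ]; then
        if   [ "$pc_minutes" -le 30 ];     then echo sinter   # 30 minutes
        elif [ "$pc_minutes" -le 1440 ];   then echo squick   # 1 day
        elif [ "$pc_minutes" -le 20160 ];  then echo scpu     # 14 days
        elif [ "$pc_minutes" -le 129600 ]; then echo slong    # 90 days
        else
            echo "no class allows $pc_minutes minutes" >&2
            return 1
        fi
    elif [ "$pc_mem_mb" -le 4000 ]; then
        if   [ "$pc_minutes" -le 20160 ];  then echo sram     # 14 days
        elif [ "$pc_minutes" -le 129600 ]; then echo slong    # 90 days
        else
            echo "no class allows $pc_minutes minutes" >&2
            return 1
        fi
    else
        echo "no class allows $pc_mem_mb MB of memory" >&2
        return 1
    fi
}
```

For instance, `pick_class 600 800` (a 10-hour job needing 800 MB) prints `squick`.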
Note: You are strongly encouraged to use the ConsumableMemory option to help the SP2 avoid scheduling more jobs than will fit on a given node. The common nodes are currently set up to accept "spillover" jobs from classes sio and scpu, but they have only 1GB (1000 MB) of memory. You can request 500MB of memory by adding ConsumableMemory(500) to the # @ resources = ... line of your LoadLeveler file (*.cmd file).
Note: Always specify Scratch(#), ConsumableCpus(#), and ConsumableMemory(#) explicitly instead of relying on their default values. LoadLeveler only recognizes default values for particular classes and combinations of options.
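A resources line that sets all three values might look like the following sketch. The numbers are illustrative only, and writing the three entries on one line separated by spaces is an assumption about the accepted form.

```shell
# Fragment of a .cmd file: state all three resources explicitly.
# One scratch pair, one processor, and 500 MB of memory (illustrative values).
# @ resources = Scratch(1) ConsumableCpus(1) ConsumableMemory(500)
```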
| NODE | MEMORY (GB) | SCRATCH PAIRS | SCRATCH (GB) | FEATURES |
|---|---|---|---|---|
| csp01 | 4 | 1 | 19 | sa sherrill |
| csp02 | 4 | 1 | 19 | sa sherrill |
| csp03 | 4 | 1 | 19 | sa sherrill |
| csp05 | 4 | 2 | 22, 144 | sb sherrill scratch2 |
| csp07 | 4 | 2 | 19, 144 | sb sherrill scratch2 |
| csp09 | 4 | 2 | 19, 144 | sb sherrill scratch2 |
| csp17 | 1 | 1 | 19 | co |
| csp18 | 1 | 1 | 19 | co |
| csp19 | 1 | 1 | 19 | co |
| csp20 | 1 | 1 | 19 | co |
| csp21 | 1 | 1 | 19 | co |
| csp22 | 1 | 1 | 19 | co |
| csp23 | 2 | 1 | 19 | rx |
| csp24 | 2 | 1 | 19 | rx |
| csp25 | 4 | 1 | 19 | rb |
| csp26 | 1 | 1 | 19 | ra |
| csp27 | 4 | 1 | 19 | rb |
| csp28 | 4 | 1 | 19 | rb |
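The FEATURES column can be used in the requirements line of a .cmd file to restrict a job to particular nodes. The fragment below is a sketch that asks for a Sherrill node with a second scratch pair (csp05, csp07, or csp09 in the table). Joining the clauses with `&&` is an assumption about the expression syntax, and reading Scratch(2) as "both scratch pairs" is an interpretation of the SCRATCH PAIRS column.

```shell
# Fragment of a .cmd file: run only on Sherrill nodes that have a second scratch pair.
# @ requirements = (Feature == "sherrill") && (Feature == "scratch2")
# Ask for both scratch pairs (assumed meaning of Scratch(2)), one CPU, and 500 MB.
# @ resources = Scratch(2) ConsumableCpus(1) ConsumableMemory(500)
```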
#!/bin/[shell]
# @ job_type = serial
               parallel
# @ input = /dev/null
# @ output = *.stdout
# @ error = *.err
# @ initialdir = path
# @ notify_user = email address
# @ startdate = mm/dd/yyyy hh:mm:ss
# @ class = scpu
            sram
            sio
            squick
            sinter
            slong
# @ notification = complete
                   always
                   error
                   never
                   start
# @ checkpoint = no
                 user
                 system
# @ restart = no
              yes
# @ requirements = (Arch == "power3")
                   (OpSys == "AIX43")
                   (Feature == "sa")
                   (Feature == "sb")
                   (Feature == "sherrill")
                   (Feature == "scratch2")
# @ node_usage = not_shared
                 shared
# @ resources = Scratch(#)           // # in pairs, default 0
                ConsumableCpus(#)    // default 1
                ConsumableMemory(#)  // # in MB, default 256
# @ queue =
# @ node = #
# @ tasks_per_node = #
[executable] [arguments]
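Filled in for a small serial job, the template might look like the sketch below. The file names, directory, e-mail address, and the commented-out program line are hypothetical placeholders. The requirements, startdate, node, and tasks_per_node keywords are left out to keep the sketch short, and queue is written last and without a value, which is an assumption about the accepted syntax.

```shell
#!/bin/sh
# Sketch of a serial job in class scpu. Paths, address, and program are hypothetical.
# @ job_type = serial
# @ input = /dev/null
# @ output = myjob.stdout
# @ error = myjob.err
# @ initialdir = /home/username/myjob
# @ notify_user = username@example.edu
# @ class = scpu
# @ notification = error
# @ checkpoint = no
# @ restart = no
# @ node_usage = shared
# @ resources = Scratch(1) ConsumableCpus(1) ConsumableMemory(500)
# @ queue

# Everything below runs on the node that LoadLeveler assigns to the job.
echo "Job started on $(uname -n) at $(date)"
# ./my_program input.dat    # hypothetical executable and argument; replace with your own
```

The lines beginning with `# @` are read by LoadLeveler as directives, while the shell treats them as ordinary comments.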