Job priority/scheduling
The Maui cluster scheduler has a number of advantages over TORQUE's default FIFO scheduler, including the ability to backfill shorter jobs into holes in the schedule, an advance reservation capability, and a far more sophisticated priority scheme. Together, these features improve utilization, provide a fairer allocation of resources, and allow resources to be dedicated to critical real-time applications, such as storm surge predictions or surgical image processing.
As currently configured for William & Mary HPC systems, Maui determines job priority based on the following factors:
- Job duration: short jobs don't tie up resources for long periods of time, so they have higher priority than long jobs.
- Queue age: the longer a job sits in a queue waiting to run, the higher its priority.
- Job size: wide jobs (i.e., those that need lots of nodes) are much harder to fit into the schedule than narrow jobs, so they get higher priority. Smaller jobs are then backfilled around the large ones.
- Resource usage: Maui keeps track of per-user resource utilization over a decaying 60-day window. Users who have consumed few resources recently will have priority over those who have consumed a lot.
- User priority: sysops gets higher priority on the premise that timely execution of system housekeeping functions is crucial to smooth operation of the system.
- Queue priority: TORQUE automatically sorts jobs into queues, and queues are prioritized based on factors such as runtime and project priority.
- Project priority: we have service-level agreements with a few research groups who have contributed substantial funding to SciClone, and Maui helps us meet these commitments through a combination of queue priorities and usage targets.
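To make the interplay of these factors more concrete, the following is a minimal Python sketch of a weighted-sum priority. It is not the actual Maui calculation or the configuration used on SciClone: the weights, the `Job` fields, the `credential_boost` term (standing in for user, queue, and project priority), and the exponential form of the usage decay are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration; NOT the site's values.
WEIGHTS = {
    "duration": 1.0,
    "queue_age": 2.0,
    "size": 3.0,
    "usage": 4.0,
    "credential": 5.0,
}


@dataclass
class Job:
    walltime_hours: float          # requested run time
    queued_hours: float            # time spent waiting in the queue so far
    nodes: int                     # number of nodes requested
    recent_usage: float            # user's decayed recent usage (assumed scale 0.0 to 1.0)
    credential_boost: float = 0.0  # assumed combined user/queue/project bonus


def decayed_usage(daily_usage, decay=0.95):
    """Sum per-day usage, weighting older days less (index 0 = today).

    The decay factor is an assumption; for a 60-day window, pass 60 values.
    """
    return sum(u * decay ** age for age, u in enumerate(daily_usage))


def priority(job: Job) -> float:
    """Combine the factors listed above into a single score (higher runs first)."""
    w = WEIGHTS
    return (
        w["duration"] * (1.0 / max(job.walltime_hours, 1.0))  # shorter jobs score higher
        + w["queue_age"] * job.queued_hours                   # longer wait scores higher
        + w["size"] * job.nodes                               # wider jobs score higher
        - w["usage"] * job.recent_usage                       # heavy recent users score lower
        + w["credential"] * job.credential_boost              # user/queue/project bonus
    )
```

In this sketch, two otherwise identical jobs are ordered by whichever factor differs: for example, the one that has waited longer, or the one whose owner has used less of the system recently, receives the higher score.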
Maui also maintains a small number of compute nodes in a restricted walltime pool during weekdays. This ensures that small/short jobs will not be held up for days on end waiting for long jobs to complete.
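The effect of the restricted pool can be sketched as a simple eligibility check. This is an illustration only: the one-hour cap (`RESTRICTED_MAX_WALLTIME`), the function name, and the assumption that the pool is unrestricted on weekends are hypothetical and are not taken from the actual Maui configuration.

```python
from datetime import datetime, timedelta

# Hypothetical cap; the real walltime limit for the pool is not stated here.
RESTRICTED_MAX_WALLTIME = timedelta(hours=1)


def can_use_restricted_node(walltime: timedelta, now: datetime) -> bool:
    """Return True if a job may run on a node in the restricted walltime pool."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    if not is_weekday:
        # Assumption: outside weekdays the nodes accept jobs of any length.
        return True
    return walltime <= RESTRICTED_MAX_WALLTIME
```

Under these assumptions, a 30-minute job submitted on a Wednesday could start on a restricted node, while a 48-hour job would have to wait for a node outside the pool.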