Queues

When we say that a compute node “belongs” to the Department of Biostatistics, what does that mean? The basic idea is that the node is available to us whenever we want it, although when we are not using it, other people are allowed to. This works both ways: if other compute nodes are sitting idle, we can use them in addition to our own node(s).

All this is managed through a batch queuing system – you place your computing request with the scheduler, and it decides whose jobs run on which node. As part of the request, you specify a queue (essentially, you declare which line you want to wait in). There are three important queues to know about:

There are also other queues worth knowing about if you have specific resource requirements, such as high memory or GPU cards. See the HPC’s Queues and Policies page for additional details on campus queues and their limits, including guidelines for selecting a queue.
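
As a rough sketch of what "specifying a queue" looks like in practice: the Sun Grid Engine's qsub command accepts a -q option naming the queue to wait in. The queue name BIOSTAT comes from this page; the script name myjob.sh and the names TARGET_QUEUE, SCRIPT_FILE, and submit_cmd are hypothetical, chosen only for illustration. The snippet prints the submission command rather than running it, so it can be tried safely off the cluster.

```shell
#!/bin/sh
# Sketch only: build the qsub command line for a chosen queue.
# TARGET_QUEUE and SCRIPT_FILE are illustrative names (assumptions);
# myjob.sh is a hypothetical job script.
TARGET_QUEUE="${TARGET_QUEUE:-BIOSTAT}"
SCRIPT_FILE="${SCRIPT_FILE:-myjob.sh}"

# Print the command so it can be inspected before it is actually run.
submit_cmd() {
    printf 'qsub -q %s %s\n' "$1" "$2"
}

submit_cmd "$TARGET_QUEUE" "$SCRIPT_FILE"

# On the cluster, the request itself would be placed with:
#   qsub -q "$TARGET_QUEUE" "$SCRIPT_FILE"
```

Setting TARGET_QUEUE to a different queue name before running the script changes which line the job would wait in; the Queues and Policies page remains the authority on which names are valid.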

More on the BIOSTAT queue

The BIOSTAT queue consists of three machines (“nodes”), two of which are older and one of which is newer:

Typically, it doesn’t make a great deal of difference which machine your program runs on, but note that one node has more memory. The next page discusses how to force your program to run on one type of machine or the other.

The Sun Grid Engine

There are a variety of batch schedulers; the one used by the Argon cluster is the Sun Grid Engine (SGE). The next several pages discuss the SGE commands for submitting, controlling, and monitoring jobs on the compute nodes.