Software on HPC
The HPC system has to serve many different users: in addition to comprising different kinds and sizes of nodes, it also provides access to a wide range of software. There are two major ways to use software on the HPC system:
- Using the modules system
- Using containers
The first option is the more traditional route. It allows the user to configure their environment to make certain software
and libraries available, including R and some selected R packages. Subsequently, things like R packages can often
(but not always) be installed in the usual way with install.packages.
The second option adds an additional layer of abstraction/complexity, but also provides enhanced flexibility. Containerization
is a good technology to be familiar with in general, since it has become widespread across many industries.
Before we jump into containers, let’s focus on the more traditional approach.
Working with Modules
The Basics
There are many programs and libraries already pre-installed on the HPC systems. The HPC system uses modules to manage different versions of these applications, which might have different dependencies – potentially, this can become complicated, although for most people this will be pretty straightforward.
These modules are organized into dependency groups called “stacks”. At the time of this writing (2025-10-03), the most recent stack is stack/2022.2. You can load this stack via:
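$ module load stack/2022.2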
You will likely also want to load R. At the time of this writing, the most recent version installed on Argon is 4.2.2, which can be loaded via:
$ module load r/4.2.2_gcc-9.5.0
If you don’t want to load these modules every single time you log in to Argon, you can save this configuration as your default via:
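$ module save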
There are a few other module commands worth being aware of:
module list
: See which modules you currently have loaded. Note that R has many dependencies, so even though we’ve only loaded one module directly, more than 60 modules are now loaded.
module purge
: Remove all currently loaded modules and return to a “blank slate”
module restore
: Restore your default configuration (e.g., if you loaded a module, and later changed your mind and want it removed)
You can also have multiple configurations stored, if you wish. For example, suppose you occasionally do something that requires extra software to be loaded. You could load those modules, then
module save my-special-configuration
: Save the current module configuration with the name my-special-configuration
(pick any name you want here)
module restore my-special-configuration
: Load that saved configuration.
Note that if you leave off the name, module save and module restore operate on the default configuration you have set up (overwriting and loading it, respectively).
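For instance, a session using a named configuration might look like the following sketch (the cmake module here is just a hypothetical example; substitute whatever extra software you actually need):
$ module load cmake                         # load the extra software (hypothetical example)
$ module save my-special-configuration      # save the current set of modules under a name
$ module restore my-special-configuration   # later, bring that saved set back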
To see what modules are available in your current software stack, use the command module avail. For example:
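$ module avail
$ module avail r     # optionally pass a search string to narrow the listing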
To see what modules are available across all software stacks, use module spider, which has the same syntax.
Like many command line tools, the output of module spider can be searched by using the /
key, followed by the string you want to search for. You can jump between matches with the n
key.
Alternatively, you can consult this argon software list to see all of the programs installed on the HPC.
Your R profile
Since you’ll likely be using R on Argon and will have to install packages, it’s worth mentioning how to set up a local directory in which to install them (otherwise, R will ask you about it every time you run install.packages()).
You can put these R packages anywhere you want, although the “standard” place to put them would be in .local/lib
. So, let’s make a subdirectory there for our R packages:
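$ mkdir -p ~/.local/lib/R     # -p creates any missing parent directories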
And then tell R
to use that as your local library by adding this
to your .Rprofile
file (you may have to create this file first):
.Rprofile
.libPaths('~/.local/lib/R')
options(repos=c(CRAN="https://mirror.las.iastate.edu/CRAN"))
The second line is optional, but will stop R from asking which mirror you want to use every time you install a package.
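As a quick sanity check, you can start R and inspect the library search path; your local directory should appear first:
$ R
> .libPaths()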
Your profile
This is optional, but to customize your HPC setup, you will often want to run certain commands at the beginning of every Argon session. To do that, you can place those commands in a profile file called either .bash_profile or .bashrc. These files should already exist when you log in for the first time, ready for you to add things to them.
Note: The default setup on Argon is to have your .bash_profile
file call your .bashrc
file, so it doesn’t really matter where you put these commands as they’ll run either way. You can change this behavior if you wish, but only do so if you know what you’re doing and are comfortable with a nonstandard setup. In this tutorial, we’ll put our setup commands in .bash_profile
.
For example, suppose you want to install an application that is not available as a module. You download and compile it and place the compiled binary in .local/bin
. You would want to update the environment variable PATH
so that the Linux command line knows to look in this location for applications. So, the first few lines of your .bash_profile
file might look like:
.bash_profile
#!/bin/bash
export PATH=~/.local/bin:$PATH
alias R="R --no-save"
<...more...>
This sets up the Linux search path so that the OS knows where your
user-installed applications are (~/.local/bin
), and also sets up an alias so that whenever you run R, it will not save your workspace at the end of the session (something you should never do). Note that the .bash_profile
file (and .bashrc
file) must be placed in your home directory.
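Changes to your .bash_profile take effect at your next login. To apply them to the current session instead, you can source the file and check the result, for example:
$ source ~/.bash_profile
$ echo $PATH     # ~/.local/bin should now appear at the front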
Containers
Learning to use containers is a bit more complicated, because they add an additional “layer of abstraction”.
You can roughly imagine containers as providing a virtual environment you can work in with a
separate set of installed programs and only limited ability to interact with the host system.
Containers can be defined declaratively via configuration files, which is a helpful feature for
reproducible computing. They are a key building block for a lot of cloud computing in industry.
For use on HPC, we’re not usually interested in more complicated containerized
development patterns (e.g., container orchestration); instead we usually just want
a convenient way to get the software we need running in a scalable environment.
As such, this tutorial will keep things pretty basic, but a fuller treatment
can be found elsewhere. Here are a few links to get you started:
- Docker: a popular containerization tool which supports multiple platforms.
- Docker (wikipedia): a Wikipedia page with a good history and summary of containers
- Podman: a more recent alternative to Docker that is gaining widespread adoption.
- Apptainer: the containerization tool which is used on the Argon HPC system (the more widely adopted Docker and Podman container images can be converted to work with Apptainer)
- The Rocker Project: A project collecting pre-made images for the R environment.
Getting access to prebuilt container images
While containers allow us a ton of flexibility, it’s actually easier for many
typical workflows to start with a pre-built image rather than to define our own.
As most Biostatistics students work in R, we’ll focus on images provided by
The Rocker Project.
Important Notes.
- These steps can be computationally intensive, so they should not generally
be run on the login node. Please read this section and this section before
trying this yourself.
- The files we will be creating can be large - please don’t unnecessarily duplicate them.
Step 1 - Pull down the image you want
Here, we’re going to pull down the tidyverse image, because the tidyverse has a ton of dependencies and often causes installation issues.
The following command uses the apptainer
program to create a new file, called tidyverse.sif
in your current directory. This file is in the “singularity image format” (sif), the native image format for apptainer. The new image file will be created from the Docker image hosted online.
$ apptainer pull tidyverse.sif docker://rocker/tidyverse:devel
Step 2 - Open a shell in your container
To open a shell, you just need to do:
$ apptainer run ./tidyverse.sif /bin/bash
Then, you can open R in the usual way:
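$ R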
Similarly, you can exit back out to your regular environment in the usual way:
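> q()      # quit R
$ exit     # leave the container shell and return to your normal environment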
More commonly, you’ll want to make sure the container can access and make changes to files you have permission to modify. One way to accomplish this is by using an overlay: a directory that stores your changes (which would not otherwise persist).
$ mkdir tidyverse.overlay
$ apptainer run --overlay tidyverse.overlay ./tidyverse.sif /bin/bash
If you do this, you can install packages from within R in the usual way with install.packages and still have access to them later (without the overlay, such changes would not persist).
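For example, from within R inside the overlaid container (glmnet here is just an illustrative package name):
> install.packages("glmnet")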
Building your own containers
The most general way to build a custom container is to use a different computer with access to Docker or Podman and build the container there. Then, you can copy the resulting sif file to Argon and interact with it in the usual way.
We have put together some example scripts and a brief Readme on how to do so in this repository.
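As a rough sketch of what that workflow might look like on a machine with both Docker and Apptainer available (the image name, paths, and hostname below are placeholders, and the linked scripts may differ in the details):
$ docker build -t my-image .                                     # build the image from your Dockerfile
$ docker save -o my-image.tar my-image                           # export the image as a tar archive
$ apptainer build my-image.sif docker-archive://my-image.tar     # convert the archive to a sif file
$ scp my-image.sif username@argon:/path/on/argon/                # copy the sif file to Argon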