Software on HPC

The HPC system has to serve many different users - in addition to being composed of different kinds and sizes of nodes, it also provides access to many different pieces of software. There are two major ways to use software on the HPC system:

  1. Using the modules system
  2. Using containers

The first option is the more traditional route. It allows you to configure your environment to make certain software and libraries available, including R and a selection of R packages. From there, additional R packages can often (but not always) be installed in the usual way with install.packages().

The second option adds an additional layer of abstraction/complexity, but also provides enhanced flexibility. Containerization is a good technology to be familiar with in general, since it has become widespread across many industries.

Before we jump into containers, let’s focus on the more traditional approach.

Working with Modules

The Basics

There are many programs and libraries already pre-installed on the HPC systems. The HPC system uses modules to manage different versions of these applications, which might have different dependencies – potentially, this can become complicated, although for most people this will be pretty straightforward.

These modules are organized into dependency groups called “stacks”. At the time of this writing (2025-10-03), the most recent stack is stack/2022.2. You can load this stack via

$
module load stack/2022.2

Likely, you will also want to load R. At the time of this writing, the most recent version installed on Argon is 4.2.2, which can be loaded via:

$
module load r/4.2.2_gcc-9.5.0

If you don’t want to load these modules every single time you log in to Argon, you can save this configuration as your default via

$
module save

There are a few other module commands worth being aware of.

You can also have multiple configurations stored, if you wish. For example, suppose you occasionally do something that requires extra software to be loaded. You could load those modules, then save them as a named collection that you can restore later.
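For example, using a hypothetical collection name of simulation:

$
module save simulation

Then, in a later session:

$
module restore simulation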

Note that if you leave off the name, module save and module restore overwrite and load the default configuration that you have set up.

To see what modules are available in your current software stack, use the command module avail. For example, to list all modules whose names begin with r-:

$
module avail ^r-

To see what modules are available across all software stacks, use module spider, which has the same syntax. Like many command line tools, the output of module spider can be searched by using the / key, followed by the string you want to search for. You can jump between matches with the n key.
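For example, to search every stack for the same set of R modules as above:

$
module spider ^r-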

Alternatively, you can consult the Argon software list to see all of the programs installed on the HPC.

Your R profile

Since you’ll likely be using R on Argon, and will likely have to install packages, it’s worth mentioning how to set up a local directory in which to install them (otherwise, R will ask you questions about it every time you run install.packages()).

You can put these R packages anywhere you want, although the “standard” place to put them would be in ~/.local/lib. So, let’s make a subdirectory there for our R packages:

$
mkdir -p ~/.local/lib/R

And then tell R to use that as your local library by adding this to your .Rprofile file (you may have to create this file first):

.Rprofile
.libPaths('~/.local/lib/R')
options(repos=c(CRAN="https://mirror.las.iastate.edu/CRAN"))

The second line is optional, but will stop R from asking which mirror you want to use every time you install a package.
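To check that this worked, you can open R and confirm that your new library appears first in the search path; the package below is just a hypothetical example:

.libPaths()                  # your new library should be listed first
install.packages("glmnet")   # installs into ~/.local/lib/R without any prompts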

Your profile

This is optional, but to customize your HPC setup, you will often want certain commands to run at the beginning of every Argon session. To do that, you can place these commands in a profile file called either .bash_profile or .bashrc. These files should already exist when you log in for the first time, ready for you to add things to them.

Note: The default setup on Argon is to have your .bash_profile file call your .bashrc file, so it doesn’t really matter where you put these commands as they’ll run either way. You can change this behavior if you wish, but only do so if you know what you’re doing and are comfortable with a nonstandard setup. In this tutorial, we’ll put our setup commands in .bash_profile.

For example, suppose you want to install an application that is not available as a module. You download and compile it, then place the compiled binary in ~/.local/bin. You would then want to update the environment variable PATH so that the Linux command line knows to look in this location for applications. So, the first few lines of your .bash_profile file might look like:

.bash_profile
#!/bin/bash
export PATH=~/.local/bin:$PATH

alias R="R --no-save"
<...more...>

This sets up the Linux search path so that the OS knows where your user-installed applications are (~/.local/bin), and also sets up an alias so that whenever you run R, it will not save your workspace at the end of the session (saving your workspace is something you should never do). Note that the .bash_profile file (and the .bashrc file) must be placed in your home directory.
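Note that changes to .bash_profile only take effect at your next login; to apply them to your current session, you can source the file:

$
source ~/.bash_profile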

Containers

Learning to use containers is a bit more complicated, because they add an additional “layer of abstraction”. You can roughly imagine containers as providing a virtual environment you can work in with a separate set of installed programs and only limited ability to interact with the host system. Containers can be defined declaratively via configuration files, which is a helpful feature for reproducible computing. They are a key building block for a lot of cloud computing in industry.
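For a taste of the declarative approach, here is a minimal sketch of an Apptainer definition file; the base image and the package being installed are illustrative assumptions, and you won’t need a file like this for the workflow below:

tidyverse.def
Bootstrap: docker
From: rocker/tidyverse:devel

%post
    # commands here run once, when the image is built
    R -e 'install.packages("glmnet")'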

For use on HPC, we’re not usually interested in more complicated containerized development patterns (e.g., container orchestration); instead, we usually just want a convenient way to get the software we need running in a scalable environment. As such, this tutorial will keep things pretty basic, but a fuller treatment (and links to get you started) can be found elsewhere.

Getting access to prebuilt container images

While containers allow a ton of flexibility, for many typical workflows it is actually easier to start with a pre-built image than to define our own. As most Biostatistics students work in R, we’ll focus on images provided by The Rocker Project.

Step 1 - Pull down the image you want

Here, we’re going to pull down the tidyverse image, because the tidyverse has a ton of dependencies and often causes installation issues.

The following command uses the apptainer program to create a new file, called tidyverse.sif, in your current directory. This file is in the Singularity Image Format (SIF), the native image format for Apptainer. The new image file is built from the Docker image hosted online.

$
apptainer pull tidyverse.sif docker://rocker/tidyverse:devel

Step 2 - Open a shell in your container

To open a shell, you just need to do:

$
apptainer run ./tidyverse.sif /bin/bash

Then, you can open R in the usual way:

$
R

Similarly, you can exit out and back into your regular environment in the usual way:

$
exit

More commonly, you’ll want the container to be able to access and modify files that persist outside of it. One way to accomplish this is with an overlay: a directory used to store changes made inside the container (which otherwise don’t persist).

$
mkdir tidyverse.overlay
apptainer run --overlay tidyverse.overlay ./tidyverse.sif /bin/bash

If you do this, you can install packages from within R in the usual way with install.packages() and still have access to them later (without the overlay, these changes would not persist).
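As a quick sanity check (the package name is a hypothetical example), you can exit, re-enter with the same overlay, and confirm that a previously installed package is still available:

$
apptainer run --overlay tidyverse.overlay ./tidyverse.sif /bin/bash
R -e 'library(ncvreg)'   # succeeds only if ncvreg was installed in an earlier overlay session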

Building your own containers

The most general way to build a custom container is to use a different computer with access to Docker or Podman and build the container there. Then, you can copy the resulting .sif file to Argon and interact with it in the usual way.
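For instance, a minimal sketch of that workflow might look like the following; the image name, tag, and extra R package are hypothetical, and the exact docker-archive invocation can vary slightly across Apptainer versions:

Dockerfile
FROM rocker/tidyverse:devel
# install an extra R package at build time (hypothetical example)
RUN R -e 'install.packages("ncvreg")'

$
docker build -t mytidyverse .
docker save mytidyverse -o mytidyverse.tar
apptainer build mytidyverse.sif docker-archive://mytidyverse.tar

This assumes the build machine also has Apptainer installed; otherwise, copy the .tar file to Argon and run the apptainer build step there. You can then transfer mytidyverse.sif to Argon (e.g., with scp or rsync) and use it just like the image we pulled earlier.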

We have put together some example scripts and a brief README on how to do so in this repository.