RCAC Blogs

Introducing the New RCAC Documentation Website

We are excited to announce the launch of the redesigned RCAC Documentation Website — a unified, searchable hub for everything you need to use Purdue's high-performance computing resources. The new site is built around three goals: find information fast, stay current automatically, and learn as a community. Whether you are a first-time user or an experienced researcher, here is what is waiting for you.

Checkpointing in a Preemptible Environment

What is Preemption?

In computing, preemption means stopping or pausing one process so that another can run: a task X preempts a task Y when X pauses Y in order to run itself. In an HPC environment, we can use preemption to maximize cluster utilization by allowing higher-priority jobs to preempt lower-priority ones, which in turn lets us be more lenient with the resource limits placed on the low-priority jobs.
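
To make the idea concrete, here is a minimal toy sketch in Python of priority-based preemption on a single compute slot. It is not how Slurm or any RCAC cluster implements preemption. The `Job` class, the `simulate` function, and the convention that a larger number means higher priority are all assumptions made for illustration. The sketch also assumes a paused job resumes exactly where it left off, which on a real cluster is only true if the job checkpoints its state.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int  # assumption: larger number = higher priority
    arrival: int   # time step at which the job is submitted
    work: int      # time steps of work remaining


def simulate(jobs):
    """Run jobs on one slot; return the name of the job that ran at each time step."""
    pending = sorted(jobs, key=lambda j: j.arrival)
    waiting = []    # submitted jobs that are not running (including paused ones)
    running = None
    timeline = []
    t = 0
    while pending or waiting or running:
        # Admit jobs submitted at or before this time step.
        while pending and pending[0].arrival <= t:
            waiting.append(pending.pop(0))
        # Preemption: a waiting job with strictly higher priority pauses the running job.
        best = max(waiting, key=lambda j: j.priority, default=None)
        if best and (running is None or best.priority > running.priority):
            waiting.remove(best)
            if running is not None:
                waiting.append(running)  # paused; keeps its remaining work
            running = best
        if running is None:
            timeline.append(None)  # slot is idle
        else:
            timeline.append(running.name)
            running.work -= 1
            if running.work == 0:
                running = None
        t += 1
    return timeline


timeline = simulate([
    Job("low", priority=1, arrival=0, work=4),
    Job("high", priority=5, arrival=2, work=2),
])
print(timeline)  # ['low', 'low', 'high', 'high', 'low', 'low']
```

The low-priority job uses the slot while it is otherwise idle, is paused when the high-priority job arrives at step 2, and finishes its remaining work afterwards.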

Conda vs Anaconda

On RCAC community clusters, we deploy modules for both conda and anaconda. These are two separate distributions of the "conda" package manager. The conda module points to the Miniforge distribution of conda (https://github.com/conda-forge/miniforge), while the anaconda module loads the official Anaconda, Inc. distribution of conda (https://www.anaconda.com/download).

Although these two distributions behave similarly, the components within them vary slightly:

About Slurm Fairshare on RCAC clusters

This article takes a deep dive into how Slurm assigns priority to the jobs it schedules. The design space for such a scheduler is very large, so Slurm provides many options to accommodate a variety of clusters and policies.