Anvil Policies

Software Installation Request Policy

The Anvil team will make every reasonable effort to provide a broadly useful set of popular software packages for research cluster users. However, many domain-specific packages that are of use only to single users or small groups of users are beyond the capacity of staff to fully maintain and support. Please consider the following if you require software that is not available via the module command:

If your lab is the only user of a software package, Anvil staff may recommend that you install your software privately, either in your home directory or in your allocation project space. If you need help installing software, the Anvil support team may be able to provide limited help.
As more users request a particular piece of software, Anvil may decide to provide the software centrally. Matlab, Python (Anaconda), NAMD, GROMACS, and R are all examples of frequently requested and used centrally-installed software.
Python modules that are available through the Anaconda distribution will be installed through it. Anvil staff may recommend you install other Python modules privately.
If you're not sure how your software request should be handled, or you need help installing software, please contact the Help Desk.

Helpful Tips

We will strive to ensure that Anvil serves as a valuable resource to the national research community. We hope that you, the user, will assist us by making note of the following:
You share Anvil with thousands of other users, and what you do on the system affects others. Exercise good citizenship to ensure that your activity does not adversely impact the system and the research community with whom you share it. For instance: do not run jobs on the login nodes and do not stress the filesystem.
Help us serve you better by filing informative help desk tickets. Before submitting a help desk ticket, check what the user guide and other documentation say. Search the internet for key phrases in your error logs; that's probably what the consultants answering your ticket are going to do. Ask yourself what you have changed since the last time your job succeeded.
Describe your issue as precisely and completely as you can: what you did, what happened, verbatim error messages, other meaningful output. When appropriate, include the information a consultant would need to find your artifacts and understand your workflow: e.g. the directory containing your build and/or job script; the modules you were using; relevant job numbers; and recent changes in your workflow that could affect or explain the behavior you're observing.
Have realistic expectations. Consultants can address system issues and answer questions about Anvil. But they can't teach parallel programming in a ticket and may know nothing about the package you downloaded. They may offer general advice that will help you build, debug, optimize, or modify your code, but you shouldn't expect them to do these things for you.
Be patient. It may take a business day for a consultant to get back to you, especially if your issue is complex. It might take an exchange or two before you and the consultant are on the same page. If the admins disable your account, it's not punitive. When the file system is in danger of crashing, or a login node hangs, they don't have time to notify you before taking action.

Acceptable Purdue IT Research Resource Use

RCAC requires all users of its research computing resources to submit their jobs through the provided queuing systems. Do not attempt to bypass or hinder these systems. Documentation for the queuing systems on each major computing resource is available on this web site. Please be as accurate as possible about the resources which your jobs will require, as this will help the system run more efficiently and may help your jobs run more quickly. You must specify the actual number of processor cores that each job will use when submitting your jobs. Failure to do so could result in poor performance and may adversely impact other users' work.
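As a minimal sketch of keeping a job's actual core usage in line with what it requested, the helper below reads the core count the scheduler granted and returns it for sizing a worker pool. It assumes a Slurm-style scheduler that exports the `SLURM_CPUS_PER_TASK` environment variable (set when a job requests `--cpus-per-task`); the source text does not name the scheduler, and `allocated_cores` is a hypothetical helper name.

```python
import os


def allocated_cores(env=None, default=1):
    """Return the number of cores granted to this job, or `default` if unknown.

    Assumption: the scheduler exports SLURM_CPUS_PER_TASK (Slurm does this
    when --cpus-per-task is requested). Falls back to `default` otherwise,
    rather than guessing from the node's total core count.
    """
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default


if __name__ == "__main__":
    # Use this value, not os.cpu_count(), when sizing worker pools, so the
    # job does not run more workers than the cores it asked for.
    print(f"workers to start: {allocated_cores()}")
```

The returned value can be passed to, for example, `multiprocessing.Pool(processes=allocated_cores())`. Falling back to a single worker is a deliberately conservative choice for a shared node.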

All users of research resources must also comply with Purdue IT's Resource Acceptable Use Policy, Purdue University Policy V.4.1, as well as Purdue's Remote Access to IT Resources Policy, Purdue University Policy V.1.6.

Data about activity on research computing resources is routinely logged, archived, and analyzed to provide feedback about system performance and resource and software utilization, and to better optimize the resources. This includes, but is not limited to, environment modules loaded, applications run, hardware performance counters, disk storage consumed, and the contents of batch job scripts.

Scratch File Purging

Following good data practices makes scratch more performant and more useful for everyone. By keeping master copies of important data in backed-up storage, staging only the inputs and outputs you need for active jobs into scratch, and promptly moving valuable results back to a persistent location when your work completes, you help keep scratch fast, responsive, and available for high-throughput workloads across the system.

All users of research computing systems are provided with a scratch directory for short-term, high-performance storage of data used by running jobs and workflows. Scratch is temporary space and is not intended for long-term or irreplaceable data. There is no backup service for scratch directories, and files not accessed or modified within the configured age threshold will be removed automatically. In the event of a disk crash or file removal, files in scratch directories are not recoverable.

Important data should always be copied to a backed-up or archival storage system for long-term retention, such as your project space.

Purge Policy

Scratch directories are purged on the basis of the last access time and content modification time of each individual file. Any file that has not been accessed or had its content modified in 30 days is subject to purge. Changing file metadata, such as file name or permissions, does not protect a file from purging.

Warning

Bulk operations that update metadata or otherwise “touch” large numbers of files without genuine use are discouraged and may result in removal of access.

Purge processes run regularly and may remove any file that is older than the relevant purge age at the time of the run. Files may occasionally remain slightly longer than the nominal age, but they can be deleted at any time once they are eligible. The only safe assumption is that any file older than the purge age may be removed without notice.
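The eligibility rule described in this section can be sketched as a small read-only check. This is an illustration under stated assumptions, not the purge process itself: `is_purge_eligible` is a hypothetical helper, it treats the file's most recent access or content modification as its last use, and the actual purge tooling may apply its own rules.

```python
import os
import time

PURGE_AGE_DAYS = 30        # purge age stated in the policy text
SECONDS_PER_DAY = 86400


def is_purge_eligible(path, now=None, max_age_days=PURGE_AGE_DAYS):
    """Return True if the file was neither accessed nor modified recently.

    Only atime (last access) and mtime (last content modification) are
    considered. ctime (metadata change) is ignored on purpose, because
    renaming a file or changing its permissions does not protect it.
    """
    now = time.time() if now is None else now
    st = os.stat(path)  # reads metadata only; does not update atime
    last_used = max(st.st_atime, st.st_mtime)
    return (now - last_used) > max_age_days * SECONDS_PER_DAY
```

Because the check only calls `os.stat`, it does not itself refresh any timestamps. It is meant for deciding which files to move to project space or delete, not for keeping files alive.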

Scratch Space Considerations

Cluster scratch space is for limited-duration, high-performance storage of data for running jobs or workflows and is not intended for long-term storage of data, applications, or other files. Old data in scratch filesystems is periodically purged to keep the filesystem performant and to ensure space remains available for active work. Scratch filesystems are engineered for capacity and performance and are not protected by backup technology; some types of failures can result in permanent data loss.

If losing a file in scratch would significantly impact your research, that file should have a current copy in a more durable storage location such as your project space.

Recommendations

To use scratch safely and effectively:

Keep a primary copy in long-term storage.
Store important data, research results, and software in backed-up or archival storage (e.g., home directory or project space) and only copy working sets into scratch while they are actively in use.

Stage data into scratch for jobs.
At the start of a job or workflow, copy required inputs from project space into scratch to take advantage of local performance, rather than running directly from archival locations.

Automatically copy results back out.
Add steps to your job scripts or workflows that copy important outputs from scratch back to project space before the job completes.

Clean up regularly.
Remove temporary and intermediate files as part of your job scripts or periodic housekeeping so that only active, necessary data remains on scratch.

Monitor your usage and file ages.
Periodically check your scratch usage and proactively move or delete old files.

Design workflows assuming purge.
Assume that scratch can be purged or lost, and that files older than 30 days may disappear at any time. Workflows should be able to recreate or re-stage data from your project space and should not depend on scratch as the only copy of important files.
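The stage-in, compute, copy-back, and clean-up pattern recommended above might look like the following sketch. All names here are hypothetical: `run_staged` is an illustrative helper, the three directory arguments stand in for your own project and scratch paths, and `work` is any callable that reads from an input directory and writes to an output directory.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(project_inputs, project_results, scratch_root, work):
    """Stage inputs into scratch, run `work`, copy outputs back, clean up.

    project_inputs:  directory in persistent storage holding the inputs
    project_results: directory in persistent storage to receive outputs
    scratch_root:    existing scratch directory used for temporary work
    work:            callable taking (input_dir, output_dir) as Path objects
    """
    project_inputs = Path(project_inputs)
    project_results = Path(project_results)
    # A unique per-run directory keeps concurrent runs from colliding.
    run_dir = Path(tempfile.mkdtemp(prefix="run_", dir=scratch_root))
    try:
        staged_in = run_dir / "inputs"
        staged_out = run_dir / "outputs"
        shutil.copytree(project_inputs, staged_in)  # stage in
        staged_out.mkdir()
        work(staged_in, staged_out)                 # compute in scratch
        # Copy results back to persistent storage before the job ends.
        shutil.copytree(staged_out, project_results, dirs_exist_ok=True)
    finally:
        # Remove temporary and intermediate files from scratch.
        shutil.rmtree(run_dir, ignore_errors=True)
    return project_results
```

In this sketch the scratch copy is removed even when `work` raises, so a failed run leaves nothing behind; a workflow that needs partial outputs for debugging would copy them out before cleanup. `dirs_exist_ok` requires Python 3.8 or later.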

Please contact us if you have questions or need assistance in copying your files to a more permanent location such as your project space.

Acceptable Use

The scratch filesystems are for limited-duration, high-performance storage of data for running jobs or workflows and are explicitly not intended to be used as long-term storage. Using them that way, or taking measures to circumvent purging, adversely affects all users of the system and is considered a violation of Acceptable Research Resource Use.