Using $TMPDIR
Files stored in $TMPDIR cannot be accessed from SSH sessions
The $TMPDIR variable points to a temporary directory that exists only on
the compute node while your job is running. Its contents cannot be accessed
directly from a login node, or by connecting to the compute node over SSH.
For interactive work, use salloc to access the compute node and inspect
$TMPDIR. For batch jobs (sbatch), include any commands that use
$TMPDIR directly in your submission script, and copy any files you want
to keep to a persistent location (for example, $HOME) before the job
completes.
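For example, an interactive session lets you look inside $TMPDIR directly (a sketch; the partition name and time limit here are illustrative, so adjust them for your site):

```shell
# Request an interactive session on a compute node
salloc -n 1 -p compute -t 0:30:0
# The shell now runs on the compute node, where $TMPDIR is visible
echo "$TMPDIR"
ls -l "$TMPDIR"
```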
Temporary space is available on each node for use while a job runs on it.
Because this storage is physically located on the node, it is not shared between nodes, but it provides better performance for read/write (I/O) intensive tasks on a single node than networked storage. To use the temporary scratch space, however, you first need to copy files to it from networked storage, and if a job fails, any intermediate files created there may be lost.
If your job does a lot of I/O operations to large files, it may therefore improve performance to:
- copy files from your home directory into the temporary folder
- run your job in the temporary folder
- copy files back from the temporary folder to your home directory if needed
- delete them from the temporary folder as soon as they're no longer needed
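The steps above can be sketched as plain shell. In this runnable sketch a throwaway directory created with mktemp stands in for $TMPDIR, a second one stands in for your home directory, and the file names and processing command are illustrative:

```shell
#!/bin/bash
set -e
SCRATCH=$(mktemp -d)                      # stands in for $TMPDIR
WORKDIR=$(mktemp -d)                      # stands in for $HOME/project
echo "input" > "$WORKDIR/data.file"

cp "$WORKDIR/data.file" "$SCRATCH/"       # 1. copy input to the temporary folder
cd "$SCRATCH"                             # 2. run the job in the temporary folder
tr a-z A-Z < data.file > results.data     #    (stand-in for real processing)
cp results.data "$WORKDIR/"               # 3. copy results back
cd / && rm -rf "$SCRATCH"                 # 4. delete the temporary copy
```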
Basic example
The following job runs a shell-script ./runcode.sh in a data folder beneath a
user's home directory. The data is held on networked storage at this point.
#!/bin/bash
#SBATCH -n 1 # Request 1 core
#SBATCH -p compute # Request the compute partition
#SBATCH -t 1:0:0 # Request 1 hour runtime
#SBATCH --mem-per-cpu=2G # Request 2GB RAM per core
cd $HOME/project
./runcode.sh
On any node the temporary scratch directory is accessed using the variable
$TMPDIR. If specific, known files are needed in your processing, you can copy
your data to that space before working on it.
The following job:
- copies `data.file` from the `project` directory to the temporary area
- sets the current working directory to the temporary area
- runs the appropriate code
- copies the output file `results.data` back to the `project` directory
This is the equivalent of the previous example, but using the temporary storage.
#!/bin/bash
#SBATCH -n 1 # Request 1 core
#SBATCH -p compute # Request the compute partition
#SBATCH -t 1:0:0 # Request 1 hour runtime
#SBATCH --mem-per-cpu=2G # Request 2GB RAM per core
# Copy data.file from the project directory to the temporary scratch space
cp $HOME/project/data.file $TMPDIR
# Move into the temporary scratch space where your data now is
cd $TMPDIR
# Do processing - as this is a small shell script, it is run from the network storage
$HOME/project/runcode.sh
# Copy results.data back to the project directory from the temporary scratch space
cp $TMPDIR/results.data $HOME/project/
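Because intermediate files in $TMPDIR can be lost if a job fails, it is worth making failures loud. Adding set -e near the top of a submission script aborts it as soon as any command (such as the initial cp) fails, rather than running the job on missing data. A runnable sketch of that behaviour, using a deliberately wrong input path:

```shell
#!/bin/bash
OUT=$(mktemp)
(
    set -e                                     # abort this subshell on any error
    cp /no/such/data.file /tmp/ 2>/dev/null    # fails: the input path is wrong...
    echo "ran anyway" > "$OUT"                 # ...so this line is never reached
) || echo "copy failed; with set -e the job script stops here"
```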
If you do not know, or cannot list, all the possible output files that you
would like to move back to your home directory, you can use rsync to copy only
changed and new files back at the end of the job. This saves time and avoids
unnecessary copying.
The following job:
- copies files to the temporary scratch area
- runs the shell-script `./runcode.sh` on the local copy
- copies the results back to networked storage
#!/bin/bash
#SBATCH -n 1 # Request 1 core
#SBATCH -p compute # Request the compute partition
#SBATCH -t 1:0:0 # Request 1 hour runtime
#SBATCH --mem-per-cpu=2G # Request 2GB RAM per core
# Source folder for data
DATADIR=$HOME/project
# Copy data (inc. subfolders) to temporary storage
rsync -rltv $DATADIR/ $TMPDIR/
# Run job from temporary folder
cd $TMPDIR
./runcode.sh
# Copy changed files back
rsync -rltv $TMPDIR/ $DATADIR/
Advanced example
This advanced example demonstrates how to trigger an action (for example
saving a checkpoint file) whilst the job is running, to avoid losing the
contents of $TMPDIR if the job reaches the runtime requested.
Using rsync, the following script copies the contents of $TMPDIR back to
$HOME after 55 minutes in a job that requests a 1 hour runtime:
#!/bin/bash
#SBATCH -n 1 # Request 1 core
#SBATCH -p compute # Request the compute partition
#SBATCH -t 1:0:0 # Request 1 hour runtime
#SBATCH --mem-per-cpu=2G # Request 2GB RAM per core
# Rsync data back to $HOME after 55 minutes
(
sleep 55m
rsync -rltv "$TMPDIR/" "$HOME/"
) &
# Job code
./runcode.sh
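One caveat with this pattern: if runcode.sh finishes before the timer fires, the background subshell lingers until the scheduler cleans it up. Capturing its PID and cancelling it when the job ends early keeps things tidy. A runnable sketch with short stand-in durations and a marker file in place of the real commands:

```shell
#!/bin/bash
MARK=$(mktemp)
(
    sleep 5                         # stands in for "sleep 55m"
    echo fired > "$MARK"            # stands in for the rsync copy-back
) &
SAVER_PID=$!
sleep 1                             # stands in for ./runcode.sh
# The job finished early: cancel the pending checkpoint copy
kill "$SAVER_PID" 2>/dev/null
wait "$SAVER_PID" 2>/dev/null || true
```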