HPC: A Quick Start Guide

Posted on: August 3, 2018

Steps To A Quick Analysis

Included on this page is a brief step-by-step guide for logging in, transferring files, compiling code, and submitting a batch job via the HPC subsystem.


What is Virtual ROGER?

Virtual ROGER (Resourcing Open Geospatial Education and Research) was established using experience gained from an NSF MRI project for enabling geospatial research and education at the CyberGIS Center. It provides hybrid computing modalities including HPC batch, data-intensive computing, and cloud computing, backed by a common data store. It is administered by the School of Earth, Society, and Environment (SESE) and is integrated with the Keeling compute cluster operated by the school.


How do I log in to Virtual ROGER?

The first step is to open your command console, type the following CONNECTION STRING (replacing net_id with your NetID), and press Enter.

$ ssh net_id@keeling.earth.illinois.edu

You will then be prompted for your password. Type the password associated with your NetID and press Enter.

Upon successful login, you should see a welcome message along with some basic information about Virtual ROGER. You are now using a bash shell on the LOGIN NODE of the cluster.

Note: Virtual ROGER allows SSH connections only from campus networks. If you wish to log in to Keeling from outside the University, start the Cisco AnyConnect VPN before logging in.
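
For frequent logins, an SSH host alias can shorten the connection string. The following is a minimal sketch rather than part of the Virtual ROGER setup: the alias name keeling and the helper add_keeling_alias are illustrative assumptions, and the config file defaults to the usual OpenSSH location.

```shell
# Append a host alias for the login node to an SSH client config file.
# Usage: add_keeling_alias <net_id> [config_file]
# The alias name "keeling" is an arbitrary choice made for this example.
add_keeling_alias() {
    net_id="$1"
    config_file="${2:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF

Host keeling
    HostName keeling.earth.illinois.edu
    User $net_id
EOF
    # OpenSSH rejects a user config file that other users can write to.
    chmod 600 "$config_file"
}
```

After running add_keeling_alias with your own NetID once, typing ssh keeling should behave like the full command shown above.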

 Windows Users!

Login is unavailable via the Windows Command Prompt. We suggest installing MobaXterm for logging in securely. It is available online as a free download.

 CIGI Lab member!

To request access to Virtual ROGER, please send an email to help@earth.illinois.edu and CC apadmana@illinois.edu. Please specify your name, NetID, and UIN in the email.


Understanding the file-system

Now that you have connected to Virtual ROGER and are on the LOGIN NODE, you have access to three different directories, or ‘spaces’.

The spaces are important because each space is restricted to a specific size.

  1. Home Directory – you can think of this as the top drawer in your office desk.
    • Only you and the project coordinator have access to this directory.
    • Limit: 50 GB
  2. Shared Project and Data Directory – shared space for common CyberGIS data and for individual projects.
    • Path: /data/cigi/ or /data/cigi/common/ or /data/cigi/<project_name>
  3. Scratch Directory – this is the temporary working directory. Data in this directory may be lost or deleted without warning.
    • Path: /data/cigi/scratch

Careful!

DO NOT use the Home Directory for computing! This space is not large enough to handle any sort of computation.
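
To keep an eye on the 50 GB limit, a small helper built on du can estimate how full a directory is. This is only a sketch: the function name check_usage is hypothetical, and it assumes the limit is measured the same way du measures disk usage, which may not match how the file system enforces it.

```shell
# Estimate how much of a size limit a directory currently uses.
# Usage: check_usage <directory> <limit_in_GB>
# Exit status is non-zero when the directory is over the limit.
check_usage() {
    dir="$1"
    limit_gb="$2"
    # du -sk prints the total size in units of 1024 bytes.
    used_kb=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')
    used_kb=${used_kb:-0}
    limit_kb=$((limit_gb * 1024 * 1024))
    percent=$((used_kb * 100 / limit_kb))
    echo "$dir: ${used_kb} KB used, ${percent}% of ${limit_gb} GB"
    [ "$used_kb" -le "$limit_kb" ]
}
```

For example, check_usage "$HOME" 50 reports usage of the Home Directory against the limit listed above.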


Customizing your environment in the shell

The Bash shell is the default shell environment that users are placed in when they first log in. The shell can be configured to support your particular computational needs. To do so, use the module command.

To see what modules are currently available to load into your shell:

$ module avail

To see what modules are currently loaded into your environment:

$ module list

We have built special modules for the CyberGIS Center, available under /data/cigi/common/cigi-modules.

To make these modules available:

$ module use /data/cigi/common/cigi-modules

To load a module from that directory:

$ module load <module_name>

Example: 

To load gdal2.3 and all its dependencies:

$ module load gnu/gnu-6.1.0 proj4/5.1.0 geos/3.6.2 gdal/2.3.0
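
The commands in this section can be collected into one function, for instance to reuse in a job script. This is a sketch under assumptions: the function name load_gdal_stack is hypothetical, and it assumes that the gdalinfo utility (which ships with GDAL) is on the PATH once the gdal module is loaded.

```shell
# Load the CyberGIS GDAL stack and confirm that it is usable.
# Returns non-zero on a machine where the module command is unavailable.
load_gdal_stack() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this on Virtual ROGER" >&2
        return 1
    fi
    module use /data/cigi/common/cigi-modules
    module load gnu/gnu-6.1.0 proj4/5.1.0 geos/3.6.2 gdal/2.3.0
    # Show what ended up in the environment.
    module list
    # Prints the GDAL release that is now first on the PATH.
    gdalinfo --version
}
```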

More information about the Keeling software environment can be found here.


Housekeeping Tips


Copying files onto Virtual ROGER

To work with your own data, the files must be located on one of Virtual ROGER’s nodes. To transfer files between your personal desktop and Virtual ROGER, use a file-transfer tool that copies them from one machine to the other.

Linux & Mac Users: Use the scp tool as shown below.

To copy a single file from your desktop to Virtual ROGER:

$ scp myfile netId@keeling.earth.illinois.edu:

To copy a single file from Virtual ROGER to your desktop:

$ scp netId@keeling.earth.illinois.edu:myfile .

Windows Users:

MobaXterm (as well as other SSH clients; list here) supports file transfer for Windows users with the same scp and sftp commands.
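
Whole directories can be copied with scp -r. The sketch below sends a local directory to the Scratch Directory described earlier; the helper name copy_to_scratch is hypothetical, and it assumes you may write directly under /data/cigi/scratch. By default it only prints the command so it can be checked before anything is transferred.

```shell
# Build (and optionally run) an scp command that copies a local directory
# into the scratch space on Virtual ROGER.
# Usage: copy_to_scratch <net_id> <local_dir> [run]
copy_to_scratch() {
    net_id="$1"
    local_dir="$2"
    target="${net_id}@keeling.earth.illinois.edu:/data/cigi/scratch/"
    if [ "${3:-}" = "run" ]; then
        # -r copies directories recursively.
        scp -r "$local_dir" "$target"
    else
        # Default: print the command instead of running it.
        echo "scp -r $local_dir $target"
    fi
}
```

Calling copy_to_scratch netId mydata prints the command; adding run as a third argument performs the copy.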


Basics of nodes & the queue system

Since computing is not done on the login node, users must send their jobs to the compute nodes, which take care of the heavy lifting. To do so, we use a QUEUE SYSTEM along with the SLURM batch system, a resource manager that handles job submissions. Jobs are submitted with the sbatch command, which reads the #SBATCH directives included in what we call a job script.

Virtual ROGER has two queues in place: batch and interactive. The batch queue is the most commonly used, as the interactive queue is only available during weekday business hours.

Writing the job script

Before sending your compiled code to the SLURM batch system, you must create a script: your JOB SCRIPT. Begin by creating a new file in your text editor (here we name it jobscript.sh, but you can give it whatever name you like) and typing the commands below.

On Virtual ROGER, you can use a text editor like nano (recommended for new users) or vi (for experienced users who know vi commands). You can create and start editing the job script with:

$ nano jobscript.sh

 

jobscript.sh, followed by an explanation of each line:
#!/bin/tcsh
 
#SBATCH --job-name=mpitest
#SBATCH -n 4
#SBATCH --time=48:00:00
#SBATCH --mem-per-cpu=2048
#SBATCH --mail-type=FAIL
#SBATCH --mail-type=END
#SBATCH --mail-user=seseuser@illinois.edu
 
set InputDir=/data/sesegroup/a/seseuser/inputs
 
mpirun -np 4 ./mpiModel test_setup.nml

 

  • #!/bin/tcsh

     Specifies that the script is run with the tcsh shell

  • #SBATCH --job-name=mpitest

     The name of your job

  • #SBATCH -n 4

     The batch directive for the number of tasks to run

  • #SBATCH --time=48:00:00

     The batch directive for wall clock time, the maximum amount of time you expect your code to run

  • #SBATCH --mem-per-cpu=2048

     The batch directive for the amount of memory per CPU in megabytes

  • #SBATCH --mail-type=FAIL

     Send e-mail if the job fails

  • #SBATCH --mail-type=END

     Send e-mail when the job ends

  • #SBATCH --mail-user=yourNetID@illinois.edu

    Send e-mail from the batch system to the account yourNetID@illinois.edu

  • Any line starting with any non-whitespace character other than # is an executable statement, and we emphasize that no SLURM batch directives may appear after the first executable statement in a batch script
  • set InputDir=/data/sesegroup/a/seseuser/inputs

     The location to read the inputs

  • mpirun -np 4 ./mpiModel test_setup.nml

     For MPI users, this last line usually takes the form of an mpirun or mpiexec command
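
The ordering rule above (no batch directives after the first executable statement) is easy to break when editing a script. As a hedged sketch, the hypothetical helper below scans a job script and reports any #SBATCH line that comes too late. It simplifies by treating every line whose first non-blank character is # as a comment and every other non-blank line as executable.

```shell
# Report #SBATCH directives that appear after the first executable statement.
# Usage: check_sbatch_order <job_script>
# Exit status is 0 when every directive precedes the first command.
check_sbatch_order() {
    awk '
        /^[ \t]*$/ { next }            # skip blank lines
        /^#SBATCH/ {
            if (seen_command) {
                printf "line %d: directive after first command: %s\n", NR, $0
                misplaced = 1
            }
            next
        }
        /^[ \t]*#/ { next }            # shebang and ordinary comments
        { seen_command = 1 }           # anything else counts as executable
        END { exit misplaced + 0 }
    ' "$1"
}
```

Running check_sbatch_order jobscript.sh on the example script prints nothing and exits with status 0, because all of its directives come before the set and mpirun lines.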

 


How do I get GPU and high-memory machines on Virtual ROGER?

Virtual ROGER provides compute nodes that contain Nvidia Tesla K40m GPUs, as well as high-memory compute nodes.

GPU nodes

To use GPU compute nodes, add the following parameters to your batch script:

#SBATCH -p gpu
#SBATCH -n 20
#SBATCH -N 1
#SBATCH --gres=gpu:K40m:1

Or you can use the parameters with the qlogin command line as follows:

qlogin -p gpu -n 20 -N 1 --gres=gpu:K40m:1

 

We recommend that the software environment for these GPU nodes be set as follows:

module purge
module add GNU
module add GPU

and that any models that use GPU capabilities be built under that environment before running.
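
Once a session or job is running on a GPU node, it can be useful to confirm that a GPU is actually visible. The sketch below assumes that NVIDIA's nvidia-smi utility is installed on the GPU nodes, which this guide does not state; the function name report_gpus is hypothetical.

```shell
# Print the GPUs visible to the current session, or say that none were found.
report_gpus() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per GPU, containing its product name.
        nvidia-smi --query-gpu=name --format=csv,noheader
    else
        echo "nvidia-smi not found: this does not look like a GPU node"
    fi
}
```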

High-memory nodes

To use high-memory nodes, add the parameter --mem=<memory_size> to your qlogin command.

Example:

qlogin --mem=250g
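
When deciding whether a job needs a high-memory node, note that in SLURM --mem is a per-node request, while --mem-per-cpu in the earlier job script scales with the number of CPUs. The short sketch below totals the memory implied by that script, assuming one CPU per task (the SLURM default when --cpus-per-task is not given); the helper name total_mem_mb is hypothetical.

```shell
# Total memory implied by --mem-per-cpu, assuming one CPU per task.
# Usage: total_mem_mb <tasks> <mem_per_cpu_mb>
total_mem_mb() {
    echo $(( $1 * $2 ))
}

# Values from the example job script: -n 4 and --mem-per-cpu=2048.
total_mem_mb 4 2048    # prints 8192, i.e. 8 GB in total
```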

 


Submit batch script

To submit your JOB SCRIPT, use the sbatch command followed by the file name of your script.
 

Example:

$ sbatch myBatchScript

This is a very simple example that uses only one node. More complex jobs will likely need multiple nodes; in that case, you must specify how many nodes you need along with the number of cores. For more details and advanced examples, please see Queue System & Running Jobs.
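
After submitting, you will usually want to check on the job. The sketch below wraps the standard SLURM commands sbatch, squeue, and scancel; the helper names parse_job_id and submit_and_watch are hypothetical, and the code assumes sbatch prints its usual confirmation line of the form "Submitted batch job <id>".

```shell
# Pull the numeric job ID out of the confirmation line printed by sbatch.
parse_job_id() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p'
}

# Submit a script and show its place in the queue.
# Usage: submit_and_watch <job_script>
submit_and_watch() {
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found; run this on Virtual ROGER" >&2
        return 1
    fi
    job_id=$(sbatch "$1" | parse_job_id)
    [ -n "$job_id" ] || return 1
    echo "Submitted job $job_id"
    squeue -j "$job_id"     # state of this job (PD = pending, R = running)
    # scancel "$job_id"     # uncomment to cancel the job
}
```

Unless the script sets an output file, SLURM writes the job's output to slurm-<jobid>.out in the directory from which the job was submitted.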