Duke Compute Cluster (DCC)

2024 DSS Bootcamp

Dr. Alexander Fisher

August 16, 2024

Connect and run your first job

To connect to DCC

  1. Open a terminal (or command prompt) and type the following, replacing netid with your own NetID:
ssh netid@dcc-login.oit.duke.edu
  2. Enter your passcode and complete two-factor authentication

After successful login, you will see

userid@dcc-login  ~ $

indicating that you are on a login node.

Warning

Do not run code from the login node! The login node is only for job submission, job monitoring, and other lightweight tasks such as small file transfers and script editing.

To reserve a compute node

srun --pty bash -i

This is a standard command in Slurm (Simple Linux Utility for Resource Management) to queue and run a job. Slurm is a workload manager (a collection of tools) designed to help you interact with a compute cluster.

  • srun runs a parallel job

  • --pty lets you interact with the node

  • bash is the terminal environment to interact with

Read more about Slurm in the official documentation:
https://slurm.schedmd.com/overview.html
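The interactive request above can also ask for specific resources. The following is a minimal sketch, assuming the standard srun options --partition, --mem, --cpus-per-task and --time; the values are illustrative assumptions, not DCC recommendations. When srun is not available (for example, on your laptop), the snippet only prints the command it would run.

```shell
# Illustrative values only; adjust to what your work actually needs.
partition="common"     # partition name (assumed; check which partitions you can use)
mem="2G"               # memory for the session (assumed value)
cpus=1                 # CPU cores for the session (assumed value)
time_limit="01:00:00"  # wall-clock limit, HH:MM:SS (assumed value)

# Build the command as positional parameters so each flag stays one word.
set -- srun --partition="$partition" --mem="$mem" \
  --cpus-per-task="$cpus" --time="$time_limit" --pty bash -i
srun_cmd="$*"

if command -v srun >/dev/null 2>&1; then
  "$@"                           # on the cluster: start the interactive session
else
  echo "Would run: $srun_cmd"    # elsewhere: just show the command
fi
```

Requesting only what you need generally helps the scheduler place your session sooner.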

To run code

First you must make sure the correct module is loaded. For example, if you wish to run R, you must first load a version of R. To see available versions, type

module avail R

Next, load a module,

module load R/4.4.0

Finally, call R by typing

R

Now you can run actual R code!

rnorm(5)
[1]  0.8569864 -0.2224654 -1.2851999  0.1685456 -0.1084965

To close the R session,

quit()

To close a node

exit 0

Files and job deployment

File structure and unix commands

A “folder” aka a “directory” is a holding place for files.

A “file” is an element of a directory. For example, /hpc/home/username/test.r is a file contained in the username directory, which is itself inside the home directory.

command                               action
$ ls                                  list files in the current directory
$ pwd                                 print working directory
$ cd                                  change directory; example: $ cd .. goes to the parent directory
$ mkdir dname                         make directory “dname”
$ mkdir s{1..5}                       make five directories, s1 through s5, at once (brace expansion)
$ rm /path/to/file                    remove a file
$ rm -rf dname                        recursively remove a directory and its contents
$ rm core*                            remove all files in the current working directory whose names begin with “core”
$ wc -l file                          show the number of lines in a file
$ y > x.txt                           write the printed output of command “y” to file “x.txt”
$ head -n N file1.txt > file2.txt     example of the above: creates “file2.txt” containing the first N lines of “file1.txt”
$ echo 'text here' >> filename        add text to the end of a file
$ man x                               pull up documentation for command x; example: $ man ls
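The commands in the table can be tried safely in a throwaway directory. The sketch below is an illustration, not part of the original exercise; the names demo, file1.txt and file2.txt are made up, and mktemp -d is used to create the scratch location.

```shell
# Work in a throwaway directory so nothing important is touched.
scratch="$(mktemp -d)"
cd "$scratch"

mkdir demo                               # make a directory
cd demo
pwd                                      # print where we are

printf 'a\nb\nc\nd\n' > file1.txt        # write four lines to a new file
head -n 2 file1.txt > file2.txt          # copy the first two lines
echo 'text here' >> file2.txt            # append one more line

n_lines="$(wc -l < file2.txt | tr -d ' ')"   # count lines in file2.txt
echo "file2.txt has ${n_lines} lines"
ls                                       # lists file1.txt and file2.txt

cd ..
rm -rf demo                              # remove the directory and its contents
```

Note that > overwrites the target file while >> appends to it, which is why file2.txt ends up with three lines.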

Git on the cluster

Setting up GitHub

https://github.com/DukeStatSci/github_auth_guide

Pulling files

git clone https://github.com/DukeStatSci/computing_bootcamp_2024.git

You can now run, e.g. 

Rscript path-to-file/test.R

Exercise

What is the correct path-to-file?

Submitting a job

What if you have multiple files to run? Or what if you have a non-trivial file to run that takes some time? You can’t keep an interactive session open forever.

The solution is to submit a job that will keep running even after you log out of DCC. For this, we will create a shell script.

#!/bin/bash
#SBATCH --error=slurm_%j.err   #error log; %j expands to the job ID
#SBATCH --partition=common
#SBATCH --output=file-screen-log.output   #screen log
#SBATCH --mem-per-cpu=1G   #adjust as needed
Rscript path-to-file/test.R

Practice

Exercise

  1. Save the script on the previous slide as test-job.sh
  2. Change file-screen-log.output to something more meaningful
  3. Update path-to-file. Note that a relative path is resolved from the directory where you run sbatch, so run step 4 from the directory containing test-job.sh
  4. Run sbatch test-job.sh

Upon successful submission, you should see “Submitted batch job X” where X is some unique id associated with the job.
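The job id in that message is what you use to monitor or cancel the job. Here is a minimal sketch of pulling the id out of the message, assuming the message has exactly the form shown above; the id 12345 is made up. The squeue and scancel commands in the comments are standard Slurm tools and only work on the cluster.

```shell
# Message in the form sbatch prints; 12345 is a made-up job id.
msg="Submitted batch job 12345"
job_id="${msg##* }"          # keep everything after the last space
echo "Job id: ${job_id}"

# On the cluster, with a real id, you could then run:
#   squeue -j "$job_id"      # show the job's state in the queue
#   squeue -u "$USER"        # show all of your jobs
#   scancel "$job_id"        # cancel the job
```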

Check out the resulting files with ls. For further reading, see the DCC user guide: https://dcc.duke.edu/dcc/slurm/.
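If you need to run the same script several times, one option is a Slurm job array. The following is a sketch under stated assumptions, not the approach prescribed in these slides: the job name, the array range 1-5, and the idea that test.R accepts the task index as an argument are all hypothetical. In output file names, %A expands to the array's master job id and %a to the task index.

```shell
#!/bin/bash
#SBATCH --job-name=test-array        # hypothetical job name
#SBATCH --partition=common
#SBATCH --array=1-5                  # hypothetical range: five tasks, indices 1 to 5
#SBATCH --output=slurm_%A_%a.out     # one screen log per array task
#SBATCH --error=slurm_%A_%a.err      # one error log per array task
#SBATCH --mem-per-cpu=1G             # adjust as needed

# Slurm sets SLURM_ARRAY_TASK_ID for each array task; default to 1 so the
# script can also be dry-run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
echo "Starting array task ${task_id}"

# Hypothetical: test.R reads the task index as a command-line argument.
if command -v Rscript >/dev/null 2>&1 && [ -f path-to-file/test.R ]; then
  Rscript path-to-file/test.R "${task_id}"
fi
```

Submitting this file with sbatch would queue one task per index, each writing to its own log files.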

Getting files off the server

scp <netid>@dcc-login.oit.duke.edu:<dccpath/filename> <localpath>

More specifically, if test-output.txt is located in your ~ directory on the cluster, you can run

scp netid@dcc-login.oit.duke.edu:~/test-output.txt ./

in your local terminal or command prompt.
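The same pattern works for whole directories and for the reverse direction. The sketch below only assembles and prints the commands, so it is safe to run anywhere; netid is a placeholder, and the results directory and test.R file are hypothetical names. The -r option tells scp to copy a directory recursively.

```shell
netid="netid"                      # placeholder: replace with your own NetID
host="dcc-login.oit.duke.edu"
remote_dir="~/results"             # hypothetical directory on the cluster

# Download a whole directory from the cluster (-r copies recursively).
download="scp -r ${netid}@${host}:${remote_dir} ./"
# Upload a local script to your home directory on the cluster.
upload="scp ./test.R ${netid}@${host}:~/"

echo "$download"
echo "$upload"
```

Run the printed commands from your local terminal, not from a DCC session.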

For more options, see documentation about working with files on the cluster: https://oit-rc.pages.oit.duke.edu/rcsupportdocs/dcc/files/#how-should-i-use-dcc-storage

For info about a popular GUI for file transfers, check out https://cyberduck.io/download/