ssh netid@dcc-login.oit.duke.edu
2024 DSS Bootcamp
Dr. Alexander Fisher
August 16, 2024
After successful login, you will see
userid@dcc-login ~ $
indicating that you are on a login node.
Warning
Do not run code from the login node! The login node is only for job submission, job monitoring, and other lightweight tasks such as small file transfers and script editing.
This is a standard command in Slurm (Simple Linux Utility for Resource Management) to queue and run a job. Slurm is a workload manager (a collection of tools) designed to help you interact with a compute cluster.
srun
runs a parallel job
--pty
lets you interact with the node
bash
is the terminal environment to interact with
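Putting the three pieces above together gives a command of roughly the following shape. This is a sketch, not necessarily the exact command used in the bootcamp: the check for srun is an addition so that the snippet does nothing harmful on a machine without Slurm, and some sites append -i to bash.

```shell
# Assumption: this is typed on the DCC login node, where Slurm is installed.
if command -v srun >/dev/null 2>&1; then
    # Queue an interactive job; once it starts, the prompt belongs to a compute node.
    srun --pty bash
else
    echo "srun not found: log in to the cluster first" >&2
fi
```

Typing exit inside the interactive shell ends the job and returns you to the login node.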
Read more about Slurm in the official documentation:
https://slurm.schedmd.com/overview.html
First you must make sure the correct module is loaded. For example, if you wish to run R, you must first load a version of R. To see available versions, type
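Assuming the cluster exposes software through the usual module command (Lmod or Environment Modules), a session might look like the sketch below. The version is deliberately left as a placeholder; use whatever the listing reports on your system.

```shell
# Assumption: the "module" command is only defined on the cluster.
if command -v module >/dev/null 2>&1; then
    module avail R     # list the R versions that can be loaded
    module load R      # load the default; or: module load R/<version> from the list above
    module list        # confirm which modules are now loaded
else
    echo "module command not found: run this on the cluster" >&2
fi
```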
A “folder”, also known as a “directory”, is a holding place for files.
A “file” is an element of a directory, e.g. /hpc/home/username/test.r is a file contained in the username directory. The username directory is inside of the home directory.
command | action |
---|---|
$ ls | list files in current directory |
$ pwd | print working directory |
$ cd | change directory; example: $ cd .. to go to the parent directory |
$ mkdir dname | make directory “dname” |
$ mkdir s{1..5} | make five directories, s1 through s5, in one command (bash brace expansion) |
$ rm /path/to/file | remove a file |
$ rm -rf dname | recursively remove a directory and its contents |
$ rm core* | remove all files in the current working directory whose names begin with “core” |
$ wc -l filename | show the number of lines in a file |
$ y > x.txt | write the printed output of command “y” on the left to file “x.txt” on the right; example: $ head -n N file1.txt > file2.txt creates a new file called “file2.txt” containing the first N lines of file1.txt |
$ echo 'text here' >> filename | add text to the end of a file |
$ man x | pull up documentation for command x; example: $ man ls |
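To see several of these commands working together, the following sketch can be pasted into a bash shell. The names demo, notes.txt and top.txt are made up for illustration.

```shell
mkdir demo                        # make a directory called "demo"
cd demo                           # move into it
pwd                               # print the working directory
mkdir s{1..5}                     # brace expansion (bash): creates s1 s2 s3 s4 s5
ls                                # list what is in demo so far

echo 'first line' > notes.txt     # > creates (or overwrites) notes.txt
echo 'second line' >> notes.txt   # >> appends to the end of the file
echo 'third line' >> notes.txt
wc -l notes.txt                   # count the lines in notes.txt (3 here)
head -n 2 notes.txt > top.txt     # write the first 2 lines of notes.txt to top.txt

rm notes.txt                      # remove a single file
cd ..                             # go back to the parent directory
```

When you are finished, rm -rf demo removes the practice directory and everything in it.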
https://github.com/DukeStatSci/github_auth_guide
What if you have multiple files to run? Or what if you have a non-trivial file to run that takes some time? You can’t keep an interactive session open forever.
The solution is to submit a job that will keep running even after you log out of DCC. For this, we will create a shell script.
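Exercise
Create a shell script called test-job.sh.
Change file-screen-log.output to something more meaningful.
Change path-to-file to the file you want to run. Note this must be relative to where the .sh script is located.
Submit the job with sbatch test-job.sh.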
Upon successful submission, you should see “Submitted batch job X” where X is some unique id associated with the job.
Check out the resulting files with ls. For further reading, see the DCC user guide: https://dcc.duke.edu/dcc/slurm/.
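The exercise refers to a batch script, test-job.sh, whose contents are not reproduced here. A minimal sketch of what such a script could contain is below. The single #SBATCH option, the module load line and the use of Rscript are assumptions for illustration; file-screen-log.output and path-to-file are the placeholders the exercise asks you to change.

```shell
# Write a minimal batch script to test-job.sh (placeholder values throughout).
cat > test-job.sh <<'EOF'
#!/bin/bash
# File that captures everything the job would have printed to the screen:
#SBATCH --output=file-screen-log.output

# Assumption: R is provided through the module system.
module load R
# Run the R script; replace path-to-file with your own script.
Rscript path-to-file
EOF

bash -n test-job.sh     # syntax check only; nothing is executed
# sbatch test-job.sh    # on the login node, this line submits the job
```

Lines beginning with #SBATCH are read by Slurm as job options, while bash itself treats them as ordinary comments.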
More specifically, if test-output.txt is located in your ~ directory on the cluster, you can run the copy command in your local terminal or command prompt.
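One common tool for this kind of copy is scp, which is assumed here since the original command is not shown. The sketch below is a dry run: it only prints the command, so it is safe to paste anywhere. The variable names are made up for illustration, and netid stands in for your own NetID.

```shell
netid="netid"                                   # replace with your own NetID
remote="${netid}@dcc-login.oit.duke.edu:~/test-output.txt"

# Dry run: print the command that would copy the file into the current local directory.
echo scp "$remote" .
# Remove "echo" to perform the copy; you will be asked to authenticate.
```

The trailing dot is the destination, meaning the directory your local terminal is currently in.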
For more options, see documentation about working with files on the cluster: https://oit-rc.pages.oit.duke.edu/rcsupportdocs/dcc/files/#how-should-i-use-dcc-storage
For info about a popular GUI for file transfers, check out https://cyberduck.io/download/