2025 DSS Bootcamp
Welcome to the Department of Statistical Science (DSS).
Find all slides and source code at
github.com/DukeStatSci/computing_bootcamp_2025
By the end of today’s workshop, you will
understand the computing resources available to you within the DSS and Duke University as well as how to get help when needed
work with containerized versions of R and Python and understand the importance of literate programming and reproducible research
collaborate and manage your work via version control with git/GitHub
set up and run jobs on the Duke Compute Cluster (DCC)
Your Duke NetID is the electronic key for accessing most Duke resources.
Most important resources are locked down behind the Duke Shibboleth login service.
All of the following can be accomplished at
Change your password
Change your challenge questions
Set up multi-factor authentication (Duo recommended)
Change your email alias(es)
Add SSH keys and pick a default shell
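Before an SSH key can be added to your account, a key pair has to exist on your own machine. The sketch below only assembles the arguments for OpenSSH's ssh-keygen tool; it assumes OpenSSH is installed, and the key file name and comment are hypothetical placeholders rather than anything Duke requires.

```python
from pathlib import Path


def ssh_keygen_command(key_path: Path, comment: str) -> list[str]:
    """Build the argument list for creating an Ed25519 key pair with ssh-keygen."""
    return [
        "ssh-keygen",
        "-t", "ed25519",      # key type
        "-f", str(key_path),  # private key file; the public key goes to <key_path>.pub
        "-C", comment,        # free-text label stored in the public key
    ]


# Hypothetical file name and label; substitute your own.
cmd = ssh_keygen_command(Path.home() / ".ssh" / "id_ed25519_duke", "netid@duke.edu")
print(" ".join(cmd))

# To actually create the key (ssh-keygen will prompt for a passphrase):
# import subprocess; subprocess.run(cmd, check=True)
```

Only the contents of the .pub file are meant to be shared; the private key stays on your machine.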
With your NetID and password, you can access your email on the web at mail.duke.edu
Your Duke email is not permanent; your account expires once you leave Duke.
Mail forwarding is possible but not recommended
There are limitations on the third-party email and calendar clients you can use with your Duke email account
Duke network connections:
DukeBlue (recommended)
DukeOpen
eduroam (education roaming)
Visitor
To access some resources on campus (e.g. mail, VPN), you will periodically need to complete multi-factor authentication (MFA) when you log in.
This is relatively easy to set up using Duo or Duke Unlock - see instructions at https://idms-mfa.oit.duke.edu/mfa/home
Duke’s virtual private network (VPN) allows you to create a secure connection from your computer to Duke over a public network while working remotely. You will need it to access certain Duke and DSS resources from off campus.
The steps below show how to get started with the VPN. For more information on Duke’s VPN visit https://oit.duke.edu/what-we-do/services/vpn.
Download and install the free Cisco AnyConnect VPN software
Launch Cisco AnyConnect on your machine
Enter vpn.duke.edu in the box and click Connect.
Another dialog box will appear. Choose -Default- from the Group dropdown menu.
Enter your Duke NetID and password and click OK.
Follow the steps to complete MFA.
Duke offers software for download to students, faculty, and staff through software.duke.edu
Duke negotiates with vendors to make software available to the Duke community for discounted rates or, in many cases, for free. If you have any questions, comments or suggestions, please e-mail the site-license office at site@duke.edu.
Some free software relevant to you as students:
Duke OIT offers virtual software containers and semester-long virtual machines.
Virtual Software Containers – Students and instructors can reserve personal computer environments running applications such as RStudio, Eclipse, Jupyter Notebooks, MATLAB, and others for a semester. These are run through your web browser; no software download is required.
Two containers likely to be most useful are:
RStudio - statistics application with R Markdown and knitr support
Jupyter - interactive data science and scientific computing notebooks
Virtual Machine (VM) – Your Duke VM is like having a second computer that lives at Duke. You can log into and use your VM from your own machine. It allows you to access specialized software without installing it on your own computer, host your own server for development projects and coursework, or customize your own environment to use for the semester.
https://rtoolkits.web.duke.edu/ is a tool provided by Duke Research Computing to access resources for the DCC.
After you log in, each of you should see that you are a member of the “statdept” group.
For PhD students - You can claim the “Doctoral Resources” allocation to get access to the doctoral student allocation on the DCC.
The Duke Compute Cluster (DCC) consists of machines that the University has provided for community use and that researchers have purchased to conduct their research. You will need to be given access before use.
Runs Linux and uses the SLURM job management system
Offers over 1360 nodes with more than 45,000 vCPUs, 980 GPUs, and 270 TB of RAM.
Most nodes are purchased by labs and departments.
OnDemand for interactive use
Requires sponsorship by a faculty member to use (you all have access currently)
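Because the DCC uses SLURM, a job is usually described by a batch script whose `#SBATCH` lines request resources. The following is a minimal sketch that renders such a script as text. The job name, resource values, and the `Rscript analysis.R` command are hypothetical, and partition or account flags are left out because they depend on your DCC group.

```python
from dataclasses import dataclass


@dataclass
class SlurmJob:
    """Resource request for one batch job (all values here are illustrative)."""
    name: str
    command: str
    cpus: int = 1
    mem_gb: int = 4
    time_limit: str = "01:00:00"  # HH:MM:SS wall-clock limit

    def to_script(self) -> str:
        """Render the job as the text of an sbatch submission script."""
        lines = [
            "#!/bin/bash",
            f"#SBATCH --job-name={self.name}",
            f"#SBATCH --cpus-per-task={self.cpus}",
            f"#SBATCH --mem={self.mem_gb}G",
            f"#SBATCH --time={self.time_limit}",
            "#SBATCH --output=%x-%j.out",  # %x = job name, %j = job ID
            "",
            self.command,
        ]
        return "\n".join(lines) + "\n"


# Hypothetical job: run an R script with 2 CPUs and 8 GB of memory.
job = SlurmJob(name="bootcamp-demo", command="Rscript analysis.R", cpus=2, mem_gb=8)
script = job.to_script()
print(script)
```

Saving the printed text to a file such as demo.sh and running `sbatch demo.sh` from a login node would queue the job; `squeue -u $USER` then lists your pending and running jobs.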
Duke Office of Information Technology (OIT) offers a host of resources, including workshops to help new users.
Duke Office of Information Technology (OIT) manages Duke’s technology infrastructure and application support.
Live chat - 24 hours a day, Monday - Thursday; chat is available on a limited basis Fridays and Sundays
Walk-up hours are available at the Link in Perkins Library.
Live service status dashboard available at status.oit.duke.edu or @DukeOIT
Duke recently entered into a partnership with OpenAI to provide access to GPT-4 and other tools. As graduate students, you can access these services via:
https://chat.ai.duke.edu - DukeGPT
https://dashboard.ai.duke.edu - MyGPTBuilder & AI Gateway
There is also access to Microsoft Copilot tools via your Duke Office365 account.
The DSS has a Posit / RStudio Workbench license that will allow you to connect to an instance of RStudio (or Jupyter Lab or VS Code or Positron) in your browser while making use of the computing power of a remote multiprocessor server.
To access Posit Workbench:
If off campus, use the VPN to create a secure connection from your computer to Duke. If you are on campus, be sure you are connected to the DukeBlue network.
Navigate to: https://rstudio.stat.duke.edu
Log in with your NetID and password.
Note - this is a small cluster of 3 machines intended for teaching and light workloads. If you need more resources, you should use the DCC directly.
Departmental IT resources - https://stat.duke.edu/it-support
The best way to get help with DSS computing resources is to email stat-help@duke.edu.
Science Drive - Academic Technology Support
If you are having trouble getting support, contact the department’s computing committee at stat-cc@duke.edu or Prof. Colin Rundel.