Introduction & Computing Resources

2025 DSS Bootcamp

Colin Rundel

Welcome

Welcome to the Department of Statistical Science (DSS).

Find all slides and source code at
github.com/DukeStatSci/computing_bootcamp_2025

By the end of today’s workshop, you will

  • understand the computing resources available to you within the DSS and Duke University as well as how to get help when needed

  • work with containerized versions of R and Python and understand the importance of literate programming and reproducible research

  • collaborate and manage your work via version control with git/GitHub

  • set up and run jobs on the Duke Compute Cluster (DCC)

Duke Computing Resources

Duke NetID

Your Duke NetID is the electronic key for accessing most Duke resources.

Most important resources are locked behind the Duke Shibboleth login service.

OIT Self-Service

All of the following can be accomplished through OIT Self-Service:

  • Change your password

  • Change your challenge questions

  • Set up multi-factor authentication (Duo recommended)

  • Change email alias(es)

  • Add SSH keys and pick a default shell

Duke email

With your NetID and password, you can access your email on the web at mail.duke.edu

  • Your Duke email is not permanent; your account expires once you leave Duke.

  • Mail forwarding is possible but not recommended

  • There are limitations on the third-party email and calendar clients you can use with your Duke email account

Duke WiFi

Duke network connections:

  • Dukeblue (recommended)

    • Secure wireless network for Duke students, faculty, and staff; requires device registration and authentication with your Duke NetID and password
  • DukeOpen

    • “Non-secured” network for miscellaneous devices, such as printers, gaming consoles, or streaming TV devices, that cannot otherwise be connected to Dukeblue
  • eduroam (education roaming)

  • Visitor

Multi-factor authentication

To access some resources on campus (e.g., mail, VPN) you will need to periodically complete multi-factor authentication (MFA) when you log in.

This is relatively easy to set up using Duo or Duke Unlock - see instructions at https://idms-mfa.oit.duke.edu/mfa/home

Duke VPN

  • Duke’s virtual private network (VPN) allows you to create a secure connection from your computer to Duke over a public network while working remotely. You will need it to access certain Duke and DSS resources from off campus.

  • Instructions to get started with the VPN are available on the next slide. For more information on Duke’s VPN visit https://oit.duke.edu/what-we-do/services/vpn.

Duke VPN set-up

  1. Download and install the free Cisco AnyConnect VPN software

  2. Launch Cisco AnyConnect on your machine

  3. Enter vpn.duke.edu in the box and click Connect.

  4. Another dialog box will appear. Choose -Default- from the Group dropdown menu.

  5. Enter your Duke NetID and password and click OK.

  6. Follow the steps to complete MFA.

Software Site Licenses

  • Duke offers software for download to students, faculty, and staff through software.duke.edu

  • Duke negotiates with vendors to make software available to the Duke community for discounted rates or, in many cases, for free. If you have any questions, comments or suggestions, please e-mail the site-license office at site@duke.edu.

  • Some free software relevant to you as students:

    • Microsoft Office
    • MATLAB & Simulink
    • SAS
    • Mathematica
    • Tableau
    • Adobe Creative Cloud

Virtual computing

Duke OIT offers virtual software containers and semester-long virtual machines.

  • Virtual Software Containers – Students and instructors can reserve personal computer environments running applications such as RStudio, Eclipse, Jupyter Notebooks, MATLAB, and others for a semester. These are run through your web browser; no software download is required.

    Two containers likely to be most useful are:

    • RStudio - statistics application with R Markdown and knitr support
    • Jupyter - interactive data science and scientific computing notebooks

  • Virtual Machine (VM) – Your Duke VM is like having a second computer that lives at Duke. You can log into and use your VM from your own machine. It allows you to access specialized software without installing it on your own computer, host your own server for development projects and coursework, or customize your own environment to use for the semester.

    • Run Windows or Linux VMs
    • Computing resources are light: 2 processors and 2GB of memory
    • By default each VM will power down at 6:00 am every morning
      • Can be avoided if you have a reasonable use case

Research Toolkits

https://rtoolkits.web.duke.edu/ is a tool provided by Duke Research Computing to access resources for the DCC.

Once you log in, each of you should see that you are a member of the “statdept” group.

For PhD students - You can claim the “Doctoral Resources” allocation to get access to the doctoral student allocation on the DCC.

Duke Compute Cluster

The Duke Compute Cluster (DCC) consists of machines that the University has provided for community use and that researchers have purchased to conduct their research. You will need to be given access before use.

  • Runs Linux and uses the SLURM job management system

  • Offers over 1360 nodes with more than 45,000 vCPUs, 980 GPUs and 270TB of RAM.

  • Most nodes are purchased by labs and departments.

  • OnDemand for interactive use

  • Requires sponsorship by a faculty member to use (you all have access currently)
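Since the DCC schedules work through SLURM, it may help to see roughly what a batch job looks like. The following is a minimal sketch, not DCC-specific guidance: the helper name `make_sbatch_script`, the job name, the `Rscript hello.R` command, and all resource values are hypothetical, and it deliberately leaves out partition and account settings, which depend on your allocation.

```python
def make_sbatch_script(job_name, command, cpus=1, mem_gb=2, time_limit="01:00:00"):
    """Return the text of a minimal SLURM batch script.

    All argument values here are illustrative assumptions; check the DCC
    documentation for the partitions and limits that apply to you.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=%x-%j.out",  # %x = job name, %j = job ID
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: run an R script with 2 CPUs and 4 GB of memory.
script = make_sbatch_script("hello", "Rscript hello.R", cpus=2, mem_gb=4)
print(script)
```

A script like this would typically be saved to a file (say, `hello.sh`) on the cluster and submitted with `sbatch hello.sh`.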

Duke Office of Information Technology (OIT) offers a host of resources, including workshops to help new users.

Getting help with Duke resources

Duke Office of Information Technology (OIT) manages Duke’s technology infrastructure and application support.

AI @ Duke

Recently Duke has entered into a partnership with OpenAI to provide access to GPT-4 and other tools. As graduate students you can access these services via:

There is also access to Microsoft Copilot tools via your Duke Office365 account.

DSS Computing Resources

RStudio Workbench

The DSS has a Posit / RStudio Workbench license that will allow you to connect to an instance of RStudio (or Jupyter Lab or VS Code or Positron) in your browser while making use of the computing power of a remote multiprocessor server.

To access Posit Workbench:

  1. If you are off campus, use the VPN to create a secure connection from your computer to Duke. If you are on campus, be sure you are connected to the Dukeblue network.

  2. Navigate to: https://rstudio.stat.duke.edu

  3. Log in with your NetID and password.

Note - this is a small cluster of 3 machines intended for teaching and light workloads. If you need more resources, you should use the DCC directly.

Getting help with DSS resources

Departmental IT resources - https://stat.duke.edu/it-support

The best way to get help with DSS computing resources is to email stat-help@duke.edu.

Science Drive - Academic Technology Support

If you are having trouble getting support, contact the department’s computing committee at stat-cc@duke.edu or Prof. Colin Rundel.