Introduction & Computing Resources

2022 DSS Bootcamp

Colin Rundel

08-25-22

Welcome

Welcome to the Department of Statistical Science (DSS). The following slides, slide decks, and examples will serve to

  • give you an understanding of the computing resources available to you within the DSS and Duke University;

  • inform you on the best way to get help with your computing needs within the DSS and Duke University;

  • introduce you to R, Python, and version control with Git/GitHub;

  • highlight the importance of reproducible research and how the aforementioned software can help.

Duke Computing Resources

Duke NetID

Your Duke NetID is the electronic key to making many Duke resources work.
Make sure your Duke account has been set up.

All of the following can be accomplished at idms-web.oit.duke.edu/portal/:

  • Changing your password

  • Changing your challenge questions

  • Setting up multi-factor authentication (Duo recommended)

  • Changing your email alias(es)

  • Managing your SSH keys and default shell

Duke email

With your NetID and password, you can access your email on the web at mail.duke.edu.

  • Your Duke email is not permanent; your account expires once you leave Duke.

  • Mail forwarding is possible but not recommended.

Duke WiFi

Duke network connections:

  • Dukeblue:

    • 24-hour access to secure (encrypted) wireless throughout Duke’s residence halls, academic and administrative buildings
  • DukeOpen:

    • unencrypted wireless access for gaming systems and other devices
  • Eduroam (education roaming):

    • secure (encrypted) wireless access using your Duke NetID and password

    • To use eduroam at a participating institution, configure your machine ahead of time while at Duke - https://dukeblue.duke.edu/eduroam/.

Duke VPN

  • Duke’s virtual private network (VPN) allows you to create a secure connection from your computer to Duke over a public network while working remotely. You will need the VPN to access certain Duke and DSS resources from off campus.

  • Instructions to get started with the VPN are available on the next slide. For more information on Duke’s VPN, visit https://oit.duke.edu/what-we-do/services/vpn.

Duke VPN set-up

  1. Download and install the free Cisco AnyConnect VPN software

  2. Launch Cisco AnyConnect on your machine

  3. Enter vpn.duke.edu in the box and click Connect.

  4. Another dialog box will appear. Choose -Default- from the Group dropdown menu.

  5. Enter your Duke NetID and password and click OK.

  6. Follow the steps to complete MFA.

Software Site Licenses

  • Duke offers software for download to students, faculty, and staff through software.duke.edu

  • Duke negotiates with vendors to make software available to the Duke community at discounted rates or, in many cases, for free. If you have any questions, comments, or suggestions, please email the site-license office at site@duke.edu.

  • Some free software relevant to you as students:

    • Microsoft Office
    • MATLAB & Simulink
    • SAS
    • Mathematica
    • Tableau
    • Adobe Creative Cloud

Virtual computing

Duke OIT offers virtual software containers and semester-long virtual machines.

  • Virtual Software Containers – Students and instructors can reserve personal computing environments running applications such as RStudio, Eclipse, Jupyter Notebooks, MATLAB, and others for a semester. These run in your web browser; no software download is required.

    The two containers likely to be most useful are:

    • RStudio - statistics application with R Markdown and knitr support
    • Jupyter - interactive data science and scientific computing notebooks

  • Virtual Machine (VM) – Your Duke VM is like having a second computer that lives at Duke. You can log into and use your VM from your own machine. It allows you to access specialized software without installing it on your own computer, host your own server for development projects and coursework, or customize your own environment to use for the semester.

    • Run Windows or Linux VMs
    • Computing resources are light: 2 processors and 2 GB of memory
    • By default each VM will power down at 6:00 am every morning

Duke Compute Cluster

The Duke Compute Cluster (DCC) consists of machines that the University has provided for community use and that researchers have purchased to conduct their research. You will need to be given access before use.

  • Runs Linux (CentOS 8) and uses the SLURM job management system

  • Offers over 1300 nodes with more than 30,000 cores, 750 GPUs, 200 TB of RAM, and a 7 PB file system.

  • Most nodes are purchased by labs and departments. The DSS has three nodes (more soon).

  • OnDemand for interactive use

  • Requires sponsorship by a faculty member to use

The DCC User Guide will help you get up and running. Workshops are also offered to help new users.
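Since the DCC schedules work through SLURM, it may help to see roughly what a batch job looks like. The sketch below is only an illustration: it assembles the text of a small batch script using standard `sbatch` directives. The helper name `make_sbatch_script`, the job name, the resource values, and the `Rscript analysis.R` command are all hypothetical; partition names and sensible limits are specific to the DCC, so consult the DCC User Guide before submitting anything.

```python
from typing import Optional


def make_sbatch_script(
    job_name: str,
    command: str,
    cpus: int = 1,
    mem_gb: int = 2,
    time_limit: str = "01:00:00",
    partition: Optional[str] = None,
) -> str:
    """Return the text of a minimal SLURM batch script (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # %j is replaced by the job ID when SLURM writes the output file.
        f"#SBATCH --output={job_name}-%j.out",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    # Only request a partition when one is given; valid names are site-specific.
    if partition is not None:
        lines.append(f"#SBATCH --partition={partition}")
    lines += ["", command, ""]
    return "\n".join(lines)


# Assumed example values; adjust to your own job.
script = make_sbatch_script("bootcamp-demo", "Rscript analysis.R", cpus=2, mem_gb=4)
print(script)
```

Saving the printed text to a file (for example `job.sh`) and running `sbatch job.sh` on a DCC login node would queue the job, assuming your account has been granted access.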

Getting help with Duke resources

Duke Office of Information Technology (OIT) manages Duke’s technology infrastructure and application support.

DSS Computing Resources

RStudio Workbench

The DSS has an RStudio Workbench (formerly Pro) license that will allow you to connect to an instance of RStudio (or Jupyter Lab or VS Code) in your browser while making use of the computing power of a remote multiprocessor server.

To access RStudio Workbench:

  1. If you are off campus, use the VPN to create a secure connection from your computer to Duke. If you are on campus, be sure you are connected to the Dukeblue network.

  2. Navigate to: http://rstudio.stat.duke.edu:8787

  3. Log in with your NetID and password.
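If the page does not load, a quick connectivity check can help tell a network problem apart from a login problem. The following is a minimal sketch using only the Python standard library; the function names `host_and_port` and `can_connect` are hypothetical, and a failed connection is only a hint (not proof) that the VPN or Dukeblue connection is missing.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def host_and_port(url: str) -> Tuple[str, int]:
    """Extract the host and port from a URL, falling back to the scheme default."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_connect("http://rstudio.stat.duke.edu:8787")` from your own machine returns `False` when the server cannot be reached, in which case checking your VPN or wireless connection is a reasonable first step.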

Other resources

Getting help with DSS resources

The best way to get help with DSS computing resources is to email stat-help@duke.edu. One of our great IT staff members will get back to you as soon as possible.

Zoyia Melton - Senior IT Analyst

  • Phone: (919) 684-5419
  • Location: 027 Old Chemistry

Science Drive - Academic Technology Support

If you are having trouble getting support, contact the department’s computing committee at stat-cc@duke.edu.