Using R

R is a free software environment for statistical computing and graphics. On SeaWulf, it is available as a command-line application through modules or as a GUI through RStudio.

This KB Article References: High Performance Computing
This Information is Intended for: Instructors, Researchers, Staff
Created: 02/13/2017 Last Updated: 05/15/2024

Accessing R

 

Command line

The command line interface to R is available through:

module load R/4.2.1
R

R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
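For non-interactive work, R installations normally also ship the Rscript front end. The following is a minimal sketch that checks whether Rscript is on the PATH and, if so, prints the R version without opening the interactive prompt. It assumes the R/4.2.1 module has already been loaded in the current shell; the variable name r_status is only illustrative.

```shell
# Sketch: run a one-line R expression without starting the interactive prompt.
# Assumes "module load R/4.2.1" was run earlier in this shell.
if command -v Rscript >/dev/null 2>&1; then
    r_status="found"
    # Print the version string of whichever R is first on the PATH
    Rscript -e 'cat(R.version.string, "\n")'
else
    r_status="missing"
    echo "Rscript not on PATH; load the R module first" >&2
fi
```

If the printed version does not match the module you intended to load, check which modules are active with "module list".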

 

Graphical User Interface

The recommended and most convenient way to access a GUI version of R is via a modified version of the Rocker implementation of RStudio Server. This lets you run RStudio using the computational resources of a compute node while accessing the GUI through your local computer's browser.

The following Slurm script launches an instance of RStudio Server on a compute node using Singularity (you may want to change the SBATCH flags to match the needs of your specific job):

 

#!/bin/sh
#SBATCH --time=08:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=28
#SBATCH -p long-28core
#SBATCH --output=rstudio-server.log.%j

# set the path to the directory containing the rstudio server singularity image
RSTUDIO_DIR=/gpfs/software/rstudio/server/singularity/4.3

# set the port to be used on local machine.  Customize as necessary
LOCALPORT=8003

# Create temporary directory to be populated with directories to bind-mount in the container
# where writable file systems are necessary. Adjust path as appropriate for your computing environment.
workdir=$(python -c 'import tempfile; print(tempfile.mkdtemp())')

mkdir -p -m 700 ${workdir}/run ${workdir}/tmp ${workdir}/var/lib/rstudio-server
cat > ${workdir}/database.conf <<END
provider=sqlite
directory=/var/lib/rstudio-server
END

# Set OMP_NUM_THREADS to prevent OpenBLAS (and any other OpenMP-enhanced
# libraries used by R) from spawning more threads than the number of processors
# allocated to the job.
#
# Set R_LIBS_USER to a path specific to rocker/rstudio to avoid conflicts with
# personal libraries from any R installation in the host environment

cat > ${workdir}/rsession.sh <<END
#!/bin/sh
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export R_LIBS_USER=${HOME}/R/rocker-rstudio/4.3
exec /usr/lib/rstudio-server/bin/rsession "\${@}"
END

chmod +x ${workdir}/rsession.sh

export SINGULARITY_BIND="${workdir}/run:/run,${workdir}/tmp:/tmp,${workdir}/database.conf:/etc/rstudio/database.conf,${workdir}/rsession.sh:/etc/rstudio/rsession.sh,${workdir}/var/lib/rstudio-server:/var/lib/rstudio-server"

# Do not suspend idle sessions.
# Alternative to setting session-timeout-minutes=0 in /etc/rstudio/rsession.conf
# https://github.com/rstudio/rstudio/blob/v1.4.1106/src/cpp/server/ServerSessionManager.cpp#L126
export SINGULARITYENV_RSTUDIO_SESSION_TIMEOUT=0

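# Pass the current user name into the container
export SINGULARITYENV_USER=$(id -un)
# Ask the operating system for an unused port on the compute node for RStudio Server to listen on
readonly PORT=$(python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')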
cat 1>&2 <<END
1. SSH tunnel from your workstation using the following command:

   ssh -N -L ${LOCALPORT}:${HOSTNAME}:${PORT} ${SINGULARITYENV_USER}@${SLURM_SUBMIT_HOST}.seawulf.stonybrook.edu

   and point your web browser to http://localhost:${LOCALPORT}

When done using RStudio Server, terminate the job by:

1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
2. Issue the following command on the login node:

      scancel -f ${SLURM_JOB_ID}
END

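# Start RStudio Server inside the container on the port chosen above,
# using the rsession.sh wrapper written earlier in this script
singularity exec --cleanenv ${RSTUDIO_DIR}/rstudio_rocker_devel.sif \
/usr/lib/rstudio-server/bin/rserver --www-port ${PORT} \
--server-user ${SINGULARITYENV_USER} \
--rsession-path=/etc/rstudio/rsession.sh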

After submitting this job with "sbatch", check the log file (rstudio-server.log.<job ID>, as set by the --output flag) for instructions on how to set up an SSH tunnel from your machine to the SeaWulf compute node that your job is running on:

 

1. SSH tunnel from your workstation using the following command:

   ssh -N -L 8003:sn139:47255 NetID@login2.seawulf.stonybrook.edu

   and point your web browser to http://localhost:8003

When done using RStudio Server, terminate the job by:

1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
2. Issue the following command on the login node:

      scancel -f 990704
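The tunnel command can also be pulled out of the log file rather than copied by hand. The sketch below assumes the log contains the text shown above; tunnel_cmd is a hypothetical helper name, not a tool provided on SeaWulf, and the temporary file only stands in for a real job log.

```shell
# Sketch: print the first SSH tunnel command found in a job log.
# tunnel_cmd is a hypothetical helper, not a SeaWulf-provided command.
tunnel_cmd() {
    # -m1 stops at the first matching line; sed strips the leading indentation
    grep -m1 'ssh -N -L' "$1" | sed 's/^[[:space:]]*//'
}

# Demo: a stand-in log file holding the sample output shown above
demo_log=$(mktemp)
cat > "$demo_log" <<'END'
1. SSH tunnel from your workstation using the following command:

   ssh -N -L 8003:sn139:47255 NetID@login2.seawulf.stonybrook.edu

   and point your web browser to http://localhost:8003
END

tunnel=$(tunnel_cmd "$demo_log")
echo "$tunnel"
rm -f "$demo_log"
```

In practice you would pass the real log file, for example "tunnel_cmd rstudio-server.log.990704", and then run the printed command on your own workstation rather than on SeaWulf.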

 

Follow the above instructions, and the RStudio interface should load in your browser.
 

You may now write or execute R code using the RStudio GUI. Several standard R packages are installed in this Singularity image, but you may install more if you wish (see below).

 

Note that, by default, only your home directory is available inside the RStudio Singularity container. To access additional directories, bind them into the container either with the "--bind" flag or by adding the desired paths to the SINGULARITY_BIND environment variable set in the example script above. For example, to bind your scratch space, extend the bind paths as follows:
 

export SINGULARITY_BIND="${workdir}/run:/run,${workdir}/tmp:/tmp,${workdir}/database.conf:/etc/rstudio/database.conf,${workdir}/rsession.sh:/etc/rstudio/rsession.sh,${workdir}/var/lib/rstudio-server:/var/lib/rstudio-server,/gpfs/scratch/netid:/scratch"

(replace "netid" with your NetID)

The above binds your scratch directory to "/scratch" within the RStudio container.
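Instead of retyping the whole bind list, a path can be appended to whatever SINGULARITY_BIND already holds. This is a minimal sketch; add_bind is a hypothetical helper name and the starting value is a shortened stand-in for the list built in the example script.

```shell
# Sketch: append one host:container pair to SINGULARITY_BIND.
# add_bind is a hypothetical helper; it is not part of Singularity.
add_bind() {
    # $1 = path on the host, $2 = path inside the container.
    # The ${VAR:+...} form adds a separating comma only when the
    # variable is already set and non-empty.
    SINGULARITY_BIND="${SINGULARITY_BIND:+${SINGULARITY_BIND},}$1:$2"
    export SINGULARITY_BIND
}

# Demo values (shortened stand-in for the list in the example script)
SINGULARITY_BIND="/tmp/work/run:/run,/tmp/work/tmp:/tmp"
add_bind "/gpfs/scratch/netid" "/scratch"   # replace "netid" with your NetID
echo "$SINGULARITY_BIND"
```

Placed after the existing export line in the batch script, a call like this keeps the original bind list intact, so later changes to that list do not need to be repeated.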

 


Installing R Packages

If you need packages beyond those normally distributed with R, you can use the install.packages function in both command-line R and RStudio. When doing so, install your packages to a location in your home directory:

module load R/4.2.1

R

> install.packages("Package", lib = file.path("/gpfs/home", Sys.getenv("USER"), "R_packages"))

The above command tells R to install a package called "Package" in a subdirectory called "R_packages" within the user's home directory. R does not expand shell variables such as $USER inside a quoted string, so the path is built with Sys.getenv() instead. Please note that the "R_packages" directory must exist and be writable before installing the package.
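The library directory can be prepared from the shell before starting R. The sketch below assumes the "R_packages" location used above sits directly under your home directory; the variable name pkg_dir is illustrative. Setting R_LIBS_USER is optional and applies to command-line R only, since the RStudio script above sets its own value inside the container.

```shell
# Sketch: create the package directory and confirm it is writable.
# Assumes packages are kept in R_packages under the home directory.
pkg_dir="${HOME}/R_packages"
mkdir -p "$pkg_dir"

if [ -d "$pkg_dir" ] && [ -w "$pkg_dir" ]; then
    echo "library directory ready: $pkg_dir"
else
    echo "cannot write to $pkg_dir" >&2
fi

# Optional: let later command-line R sessions search this directory
export R_LIBS_USER="$pkg_dir"
```

With R_LIBS_USER exported before R starts, packages installed in that directory can be loaded with library() without passing a library path each time.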

Note that if you are using the GUI version of R accessed via the instructions above, new R packages are automatically installed to:

/gpfs/home/$USER/R/rocker-rstudio/4.3

You do not need to specify a library path when installing packages via RStudio. We recommend keeping this directory separate from any directories used to store libraries for other versions of R.

For More Information Contact


IACS Support System

Still Need Help? The best way to report your issue or make a request is by submitting a ticket.
