A Brief Overview of Department High Performance Computing Resources

Introduction

  • Many computations you need to do can be done on your desktop or laptop computer.
  • For some computations you need more:
    • better tools for managing long running computations
    • more CPUs
    • more memory
  • A number of options are available to department students and faculty.
  • Some exciting new options have recently come online.
  • This presentation will summarize the available options and give some pointers on what can be done with them.
  • I will give a brief overview, and Kate and Matt will go into more detail on a few areas.

Resources Managed by CSG

Machines Available Under the Standard CSG Login

The Division of Mathematical Sciences (DIVMS) Computer Support Group (CSG) manages a number of Linux systems that

  • are available under a common login
  • share user home directories via NFS

Local disk storage, which is not backed up, is available as /var/scratch.
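As a minimal sketch of how /var/scratch might be used from R: large intermediate results go to a per-user directory on the local disk, with a fallback to R's session temporary directory on machines without /var/scratch. The helper name scratch_dir and the per-user subdirectory layout are illustrative assumptions, not a CSG convention.

```r
# Hypothetical helper: choose a working directory for large temporary files.
scratch_dir <- function(base = "/var/scratch") {
    # file.access(..., mode = 2) returns 0 when the path is writable
    if (dir.exists(base) && file.access(base, mode = 2) == 0) {
        d <- file.path(base, Sys.info()[["user"]])
        dir.create(d, showWarnings = FALSE, recursive = TRUE)
        if (dir.exists(d) && file.access(d, mode = 2) == 0) return(d)
    }
    tempdir()   # fallback: per-session temporary directory
}

out <- file.path(scratch_dir(), "intermediate.rds")
saveRDS(matrix(rnorm(100), 10), out)   # write an intermediate result
m <- readRDS(out)                      # read it back later in the job
```

Because the scratch area is not backed up, anything worth keeping should be copied back to the home directory when the job finishes.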

The available machines include

  • Computer Lab Machines l-lnx200.stat.uiowa.edu through l-lnx218.stat.uiowa.edu located in the 346SH lab.
    • Of these, the four highest-numbered machines are designated Big Job machines and can be accessed as statbigjobs.stat.uiowa.edu
  • The research machine r-lnx400.stat.uiowa.edu

Computer Lab Machines

  • Purchased primarily with student computer fees
  • Refreshed every 3-4 years
  • Current hardware:
    • One quad-core Intel CPU
    • 16 GB RAM

r-lnx400

  • Purchased from department research funds
  • 64 cores, AMD processors
  • 295 GB RAM
  • May be restricted to department students and faculty.
  • Will be retired soon.

Common Software

  • Linux operating system
  • Compilers, editors, standard libraries, R
  • R package collection maintained by the department.
  • Some additional software maintained by the department.
  • Updated and accelerated R builds.

ITS HPC Resources

  • The ITS HPC group manages two Linux clusters for the University community
    • NEON
    • ARGON
  • Departments and research groups can purchase nodes in these clusters (Investors).
  • Investors have priority or exclusive access to the nodes they purchased.
  • Small experiments and software installation can be done on the login nodes.
  • Computations on the compute nodes are managed by Sun Grid Engine (SGE).
  • Documentation is available on the HPC Wiki.
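As a rough sketch of how an R script run under SGE might size its parallel work: SGE sets the NSLOTS environment variable to the number of slots granted to a parallel job, so the script can read it instead of hard-coding a core count. The helper sge_slots and the toy workload sim_one are illustrative names, and the fallback to a single core outside a batch job is an assumption about what is convenient, not a cluster requirement.

```r
library(parallel)

# Hypothetical helper: number of slots granted by SGE, or 1 outside a job.
sge_slots <- function() {
    n <- suppressWarnings(as.integer(Sys.getenv("NSLOTS", unset = "1")))
    if (is.na(n) || n < 1L) 1L else n
}

# Toy workload standing in for one replication of a simulation.
sim_one <- function(i) mean(rnorm(1000))

# mclapply() forks worker processes, which works on the Linux machines
# described here; mc.cores follows whatever the scheduler granted.
res <- mclapply(1:8, sim_one, mc.cores = sge_slots())
```

The same script then runs unchanged on a login node for a small test and on a compute node inside a batch job.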

NEON Cluster

  • NEON came on line at the end of 2013.
  • NEON consists of
    • 131 64GB Nodes, 2.6GHz 16 Core (Standard Nodes)
    • 17 256GB Nodes, 2.6GHz 16 Core (Mid-Memory Nodes)
    • 9 512GB Nodes, 2.9GHz 24 Core (High-Memory Nodes)
    • 29 Xeon Phi 5110P Accelerator Cards
    • 3 Nvidia Kepler K20 Accelerator Cards
  • Initially NEON was only available to investors.
  • Now all students and faculty can request an account.
  • Our department purchased one mid-memory node with a Xeon Phi and an Nvidia card and one high-memory node.
  • Our nodes are neon-kp-mm-compute-4-3.local and neon-hm-compute-5-29.local; our job queue is LT.
  • Apply for an account at the ITS account web page.

ARGON Cluster

  • ARGON came on line in summer of 2017.
  • Our department purchased two large-memory nodes with graphics cards.
  • The department nodes are available in the LT queue.
  • All students and faculty can request an account.
  • Apply for an account at the ITS account web page.

More NEON/ARGON Resources

Updated and Accelerated R Builds

  • I maintain builds of the current and development versions of R on the CSG-managed machines.
  • The builds are updated at the beginning of the month.
  • These builds are available as
/group/statsoft/R-patched/build/bin/R
/group/statsoft/R-devel/build/bin/R
  • These builds use the default reference BLAS implementation.
  • There are also builds using the sequential or threaded Intel MKL BLAS.
  • For the current release these are available as
/group/statsoft/R-patched/build-MKL-seq/bin/R
/group/statsoft/R-patched/build-MKL-thr/bin/R
  • The development MKL builds are available as
/group/statsoft/R-devel/build-MKL-seq/bin/R
/group/statsoft/R-devel/build-MKL-thr/bin/R
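
The six paths above follow one pattern, which can be captured in a small helper. This is only a convenience sketch: the function name r_build and its argument names are hypothetical, and it simply reassembles the paths listed above.

```r
# Hypothetical helper: build the path to one of the R executables listed above.
r_build <- function(version = c("patched", "devel"),
                    blas = c("reference", "MKL-seq", "MKL-thr")) {
    version <- match.arg(version)
    blas <- match.arg(blas)
    # the reference BLAS build lives in "build", the MKL ones in "build-MKL-*"
    build <- if (blas == "reference") "build" else paste0("build-", blas)
    file.path("/group/statsoft", paste0("R-", version), build, "bin", "R")
}

r_build()                       # current release, reference BLAS
r_build("devel", "MKL-thr")     # development version, threaded MKL

# On a CSG machine one could then check that a build is present, e.g.
# file.exists(r_build("patched", "MKL-seq"))
```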

Simple Examples

N <- 1000
X <- matrix(rnorm(N^2), N)
XX <- crossprod(X)
system.time(for (i in 1:5) crossprod(X))
system.time(for (i in 1:5) X %*% X)
system.time(svd(X))
system.time(for (i in 1:5) qr(X))
system.time(for (i in 1:5) qr(X, LAPACK=TRUE))
system.time(for (i in 1:20) chol(XX))
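
To compare builds it can help to collect the timings above into one named vector. The wrapper below is a sketch under assumptions: the name bench_blas is hypothetical, and keeping only the elapsed component of system.time() is a choice made here, not something the timings reported in this presentation are known to use.

```r
# Hypothetical wrapper around the timings above; returns elapsed seconds.
bench_blas <- function(N = 1000, seed = 1) {
    set.seed(seed)
    X <- matrix(rnorm(N^2), N)
    XX <- crossprod(X)
    # system.time() evaluates its argument lazily, so it can be wrapped
    elapsed <- function(expr) unname(system.time(expr)["elapsed"])
    c("crossprod(X)"         = elapsed(for (i in 1:5) crossprod(X)),
      "X %*% X"              = elapsed(for (i in 1:5) X %*% X),
      "svd(X)"               = elapsed(svd(X)),
      "qr(X)"                = elapsed(for (i in 1:5) qr(X)),
      "qr(X, LAPACK = TRUE)" = elapsed(for (i in 1:5) qr(X, LAPACK = TRUE)),
      "chol(XX)"             = elapsed(for (i in 1:20) chol(XX)))
}

times <- bench_blas(N = 200)   # small N for a quick trial; the text uses 1000
print(times)
```

Running the same function under each build gives one column per build.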

r-lnx400

                        Reference   MKL-seq   MKL-thr
crossprod(X)                4.460     2.369     0.177
X %*% X                     8.096     3.717     0.377
svd(X)                      6.703     3.166     1.161
qr(X)                       4.496     2.603     2.748
qr(X, LAPACK = TRUE)        5.302     2.028     1.933
chol(XX)                    6.038     1.627     0.403

l-lnx200

                        Reference   MKL-seq   MKL-thr
crossprod(X)                2.081     0.391     0.134
X %*% X                     3.353     0.736     0.239
svd(X)                      3.313     1.040     0.503
qr(X)                       2.203     1.076     1.108
qr(X, LAPACK = TRUE)        2.610     0.826     0.689
chol(XX)                    2.830     0.570     0.191
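
The timings are easier to read as speedups relative to the reference BLAS. The sketch below copies the numbers from the two tables above; the object and function names (r_lnx400, l_lnx200, speedup) are illustrative, and the row labels follow the benchmark code.

```r
# Times copied from the two tables above, as reported by system.time().
ops <- c("crossprod(X)", "X %*% X", "svd(X)", "qr(X)",
         "qr(X, LAPACK = TRUE)", "chol(XX)")
r_lnx400 <- cbind(Reference = c(4.460, 8.096, 6.703, 4.496, 5.302, 6.038),
                  MKL.seq   = c(2.369, 3.717, 3.166, 2.603, 2.028, 1.627),
                  MKL.thr   = c(0.177, 0.377, 1.161, 2.748, 1.933, 0.403))
l_lnx200 <- cbind(Reference = c(2.081, 3.353, 3.313, 2.203, 2.610, 2.830),
                  MKL.seq   = c(0.391, 0.736, 1.040, 1.076, 0.826, 0.570),
                  MKL.thr   = c(0.134, 0.239, 0.503, 1.108, 0.689, 0.191))
rownames(r_lnx400) <- rownames(l_lnx200) <- ops

# Ratio of reference time to MKL time; values above 1 mean MKL is faster.
speedup <- function(tab)
    round(tab[, "Reference"] / tab[, c("MKL.seq", "MKL.thr")], 1)

speedup(r_lnx400)
speedup(l_lnx200)
```

In these tables the threaded build helps most for crossprod and the matrix product, and hardly at all for the default qr(X).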

NEON

  • NEON seems to be using the reference BLAS.
  • Will look into whether a version linked against MKL can be made available.

Summary of Resource Features

  • Basic features and a speed comparison using a convolution example:
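convolve <- function(a, b) # from the Writing R Extensions manual
{
    a <- as.double(a)
    b <- as.double(b)
    na <- length(a)
    nb <- length(b)
    # the open convolution of na and nb terms has na + nb - 1 terms
    ab <- double(na + nb - 1)
    for(i in seq_len(na))
        for(j in seq_len(nb))
            # shift by one because R vectors are indexed from 1
            ab[i + j - 1] <- ab[i + j - 1] + a[i] * b[j]
    ab
}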

x <- as.double(1:1600)
system.time(convolve(x, x))
                  Mem (GB)  Cores  GPU?  Phi?    Conv
l-lnx200                16      4               8.273
l-lnx217                16      4   yes         8.083
r-lnx400               495     64              12.831
Beowulf master          16      8              16.413
Beowulf node             8      4              16.109
HELIUM node             24     12              11.985
NEON node 1            264     16   yes   yes   6.451
NEON node 2            512     24                  NA
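
The Conv column times a doubly nested interpreted loop, so it mostly reflects single-core speed. For comparison, here is a sketch of the same computation with the inner loop vectorized, cross-checked against the FFT-based stats::convolve(). The name convolve_vec is hypothetical, and this variant is not the code that produced the timings in the table.

```r
# Hypothetical variant: vectorize the inner loop of the convolution.
convolve_vec <- function(a, b) {
    a <- as.double(a)
    b <- as.double(b)
    na <- length(a)
    nb <- length(b)
    ab <- double(na + nb - 1)
    for (i in seq_len(na)) {
        idx <- i + seq_len(nb) - 1          # positions touched by a[i]
        ab[idx] <- ab[idx] + a[i] * b
    }
    ab
}

x <- as.double(1:1600)
system.time(res <- convolve_vec(x, x))

# Per the help page of stats::convolve(), convolve(x, rev(y), type = "open")
# is the usual convolution, so the two results should agree up to rounding.
stopifnot(isTRUE(all.equal(res, stats::convolve(x, rev(x), type = "open"))))
```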

Author: Luke Tierney

Created: 2017-07-12 Wed 10:22
