1. Class Calendar & Location

The one-day course will be held from 9am to 5pm on Friday, Oct 11th, for the School of Education’s entering graduate students. It will be held in Rm 2024 of the Education Bldg. Lunch will be provided.

2. Target Audience

This class was designed for faculty, postdocs, and graduate students whose research projects require a Linux compute cluster rather than a Mac or Windows PC, and who want a quick introduction to cluster computing with Linux. It assumes that participants are novice Linux users who have some idea of the problem they want to address.

This is not a Computer Science course. We will not be teaching you how the engine works; we will be teaching you how to drive.

3. General Information

This is a one-day class that teaches the basics of Linux on UCI’s HPC compute cluster. It will cover a general outline of the Linux operating system, bash shell commands, what a cluster is and good manners for using it, some very basic programming in bash and Perl, and an introduction to the R language.

Much of this course was taken from a closely related course called An Introduction to BioLinux, which is why you may see references to BioLinux. We will not cover much of the bioinformatics material from that course.

If you have specific questions about the topics in this course, please see the documentation below or contact Harry Mangalam <harry.mangalam@uci.edu>.

4. Class Lecture Slides and Tutorial Notes

You can determine if you’re interested in taking the class by reviewing the class slides and tutorial scripts below.

As preparation for the Linux part, I strongly suggest viewing the Software Carpentry videos that introduce the shell.

4.1. Tutorial Data

Input and example data files for the tutorials are stored here, in a browsable directory. If there is a file called MANIFEST, please read it for descriptions of the files you find there.

5. Linux and the HPC cluster

5.1. Lecture

Introduction to Linux and why you should use it. Overview of commands and getting around. What is/isn’t a cluster, logging in with ssh, setting up your environment, text editors, quotas, data management, graphics, useful bash shell commands, environment variables, pattern matching and regular expressions, programs: how to find them, find out about them, run them, simple debugging.
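As a taste of the environment-variable and pattern-matching topics listed above, here is a minimal sketch in the shell. The file names and the PROJECT variable are made up for illustration and are not taken from the course materials.

```shell
# Hypothetical file names, made up for illustration.
files='sample_01.fastq
sample_02.txt
notes.txt
sample_10.fastq'

# An environment variable: 'export' makes it visible to programs
# started from this shell.
export PROJECT="demo"
echo "project: $PROJECT"

# Regular expression (grep -E): names that start with "sample_",
# followed by exactly two digits, and end in ".fastq".
pattern='^sample_[0-9]{2}\.fastq$'
matches=$(printf '%s\n' "$files" | grep -E "$pattern")
nmatch=$(printf '%s\n' "$files" | grep -c -E "$pattern")
echo "$matches"
echo "$nmatch matching names"

# Shell pattern matching (globbing) uses a different, simpler syntax:
# '*' matches any run of characters.
ntxt=0
for name in $files; do
    case "$name" in
        *.txt)
            echo "$name is a text file"
            ntxt=$((ntxt + 1))
            ;;
    esac
done
```

Note that regular expressions (used by grep) and shell patterns (used by case and by file name expansion) look similar but follow different rules; mixing them up is a common source of confusion.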

The bash shell and simple bash programming: variables, loops, logic tests. The Perl programming language and what it can do for you. The R programming language: object model for data, data types, manipulation of data, input, output.
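To give a feel for the variables, loops, and logic tests mentioned above, here is a small sketch. The variable names and the list of words are hypothetical, and it is not the script used in class.

```shell
# Variables: no spaces around '=', and quote values when you use them.
greeting="hello"
count=0

# Loop over a list of words.
for sample in alpha beta gamma; do
    # Logic test: [ ... ] with '=' compares two strings.
    # Here we skip one item on purpose.
    if [ "$sample" = "beta" ]; then
        continue
    fi
    count=$((count + 1))    # arithmetic expansion
    echo "$greeting $sample ($count)"
done

# Numeric test: -gt means "greater than".
if [ "$count" -gt 1 ]; then
    summary="processed $count samples"
else
    summary="processed at most one sample"
fi
echo "$summary"
```

The loop visits three words but skips "beta", so the final summary reports two processed samples.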

5.2. Tutorial

Logging in with ssh, command-line editing, setting your prompt, transferring data in, editing, de/compressing, unpacking, basic bash and utility commands, cluster status commands, software modules, Grid Engine commands. Introduction to simple data manipulation and scripting/programming with bash, Perl, and R.
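The compressing and unpacking steps can be sketched as follows. This is only an illustration, assuming the standard tar, gzip, and awk utilities are available; the directory and file names are hypothetical, and everything happens in a temporary scratch directory that is removed at the end.

```shell
# Work in a temporary scratch directory so nothing else is touched.
workdir=$(mktemp -d)

# Make a small directory of example data (tab-separated, with a header line).
mkdir "$workdir/results"
printf 'gene\tcount\nabc1\t12\nxyz2\t7\n' > "$workdir/results/counts.tsv"

# Pack and compress the directory into one archive.
# c = create, z = gzip, f = archive file; -C = change to that directory first.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents without unpacking (t = list).
listing=$(tar -tzf "$workdir/results.tar.gz")
echo "$listing"

# Unpack into a separate directory (x = extract).
mkdir "$workdir/unpacked"
tar -xzf "$workdir/results.tar.gz" -C "$workdir/unpacked"

# Simple data manipulation: count the data rows, skipping the header line.
nrows=$(awk 'NR > 1 { n++ } END { print n + 0 }' "$workdir/unpacked/results/counts.tsv")
echo "data rows: $nrows"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Listing an archive before unpacking it is a good habit: it shows whether the files will land in their own subdirectory or spill into the current one.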

Bring your specific problems to discuss with the instructor.

6. Tutorial Data Sets

The data set directory is browsable here, although most of the data is for the Bioinformatics tutorial.

7. Prerequisites for the tutorial

  • A Mac, Windows, or Linux laptop with wifi, pre-registered with the UCI Mobile network.

  • If Mac:
    CyberDuck or another graphical file transfer program (GFTP), plus the Mac x2go client.
    Recent OS X releases do not include the X11 compatibility software, now called XQuartz (still free). If you have not done so already, please download and install it; the x2go software will not work without it. We have had problems with the 4.0 x2go release, so please use the 3.99.2.1 release linked above. To get it working correctly with XQuartz, start XQuartz first, THEN start x2go.

  • If Windows:
    The PuTTY terminal program.
    CyberDuck, WinSCP, or another GFTP client.
    Optionally, the Windows x2go client, although we have discovered that some applications refuse to work with it.

  • If Linux:
    The x2go client allows you to view graphical output from the HPC cluster.