Kim Public Wiki: UsagePolicy

UsagePolicy

Computer Access

In order to best accommodate all users in an equitable fashion, we've devised the following computer access policy. Please address questions or comments on our policy to StephenFisher.

Application Restrictions

VNC: Should only be run on kimclust15.
Mathematica: Only licensed to run on kimclust40.

Machines

Detailed machine specs are accessible on our Private wiki (Private:LabHardware). The cup.pl command should also give you sufficient hardware information to plan where to run your jobs.

Only run jobs on kimclust35-40 and kimclust50-54. Ask before running large jobs on kimclust50-54.

Disk Partitions

Detailed disk specs are accessible on our Private wiki (Private:LabHardware).

Contact Stephen if you need access to any specific data partition(s). You can use 'df -h' to view the amount of storage on all disk partitions.

/home (aka /home15): Resides on kimclust15 and should not contain large data sets.

/scratch: A partition available on kimclust35 through kimclust40. It should be used to store data sets while they are being processed.

/local[50-54]: High speed storage available on the big compute machines.
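
As a small illustration of the 'df -h' check mentioned above, the sketch below reports the free space on the one partition that holds a given path, rather than listing every partition. The function name free_space is made up for this example and is not an existing lab command.

```shell
# free_space is a hypothetical helper name, not an existing lab command.
# Print the human-readable free space on the partition holding a given path.
free_space() {
    # -P keeps each filesystem on a single line; column 4 is "Available".
    df -hP "$1" | awk 'NR == 2 { print $4 }'
}

# Example: check /scratch before copying a data set onto it.
# free_space /scratch
```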

64bit Computing

All workstations are set up for 64bit computing. These machines are capable of running 32bit or 64bit applications. To make use of the 64bit processors, make sure your application is compiled with a 64bit compiler, which is the default on these machines. If you require 32bit applications, the 32bit version of /usr/local is located at /usr/local32. To access these 32bit programs on the 64bit machines, you must reference them via /usr/local32/bin/<...>.
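
If it is unclear whether a given binary is a 32bit or 64bit build, one way to check is to read the class byte in its ELF header. This is only a sketch and assumes Linux ELF executables; elf_bits is a hypothetical helper name, and the program paths in the example are left as placeholders.

```shell
# elf_bits is a hypothetical helper name used only for this illustration.
# Print 32 or 64 depending on the ELF class of the given executable.
elf_bits() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' ')
    if [ "$magic" != "7f454c46" ]; then
        echo "unknown"
        return 1
    fi
    # The fifth byte (offset 4) is 1 for 32-bit and 2 for 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' ')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Example: compare the default build of a program with the 32bit one.
# elf_bits /usr/local/bin/<...>
# elf_bits /usr/local32/bin/<...>
```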

Load Balancing

Currently the best way to run jobs across the cluster is to set up ssh for passwordless logins (see ConfiguringSSH) and then to use ssh to launch jobs remotely. Since this will not balance the loads across the cluster, check the machine loads yourself with the "cup.pl" command, as described in CurrentUsage.

In the following example, the user is logged into kimclust15 and uses ssh to launch the command 'date' on kimclust38. The output from the command will be displayed in the current terminal window.

[fisher@kimclust15]$ ssh kimclust38 date
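
To launch the same command on several machines, the ssh call above can be wrapped in a loop. The following is a sketch rather than an existing lab script: run_on_hosts and the SSH_CMD override are hypothetical names introduced here, and the sketch assumes passwordless logins are already configured.

```shell
# run_on_hosts is a hypothetical helper, not an existing lab command.
# Usage: run_on_hosts "command" host1 host2 ...
run_on_hosts() {
    cmd=$1
    shift
    for host in "$@"; do
        # SSH_CMD defaults to ssh; set SSH_CMD=echo for a dry run that
        # only prints the host and command instead of connecting.
        ${SSH_CMD:-ssh} "$host" "$cmd"
    done
}

# Example: run 'date' on three machines, one after the other.
# run_on_hosts date kimclust35 kimclust36 kimclust37
```

The loop waits for each remote command to finish before moving to the next host, and it does not look at machine load, so it is still worth checking cup.pl before choosing the hosts.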

Nice

The 'nice' command can be used to lower the priority at which a command runs. The larger the nice level, the fewer CPU cycles are allocated to the job, thus giving more priority to other jobs with lower nice levels. 19 is the maximum nice level. The syntax of the nice command is as follows.

[fisher@kimclust15]$ nice -n 19 "your command here"

The following example uses the nice command to run 'date' on kimclust38 with a nice level of 19.

[fisher@kimclust15]$ ssh kimclust38 nice -n 19 date
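
Running nice with no arguments prints the nice level of the current process, which gives a quick way to confirm the level a command will actually run at. For a job that has already been started, the standard renice command adjusts the level after the fact. The process ID below is a placeholder.

```shell
# Print the nice level seen by a command started with 'nice -n 19'.
nice -n 19 nice

# Raise the nice level of a job that is already running.
# 12345 is a placeholder process ID; find the real one with 'ps' or 'top'.
# renice -n 19 -p 12345
```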

What machine should I use to run my job?

First off, never run jobs on the servers unless the job is only doing file management. For example, if you have a large directory of files on /data15 that you want to tar, then log into kimclust15 to do the tar'ing. Otherwise you will be pulling the files across the network to your current machine, tar'ing them and then sending them back across the network to kimclust15. The same applies to /data, which is mounted on kimclust10.

If you are doing anything else, pick one of the cluster workstations (see above). Run cup.pl to see who's using which machines and ideally pick a machine that is not currently being used (i.e. unloaded). If all machines are being used, look at the number of CPU cores on a machine and its load. Each CPU core can handle a load of 1 and each job running on a machine adds 1 to the load (i.e. a 4 core machine can handle 4 jobs, or a load of 4). While you can run more jobs on a machine, they will hinder both your jobs and the other jobs running on the machine, unless you give your jobs a significantly lower priority (that is, a higher 'nice' level) than the existing jobs (kimclust35 - 40). Lowering the priority will also mean your job runs more slowly.
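
The load rule above amounts to simple arithmetic: the number of cores minus the current load is roughly the number of additional jobs a machine can take. The sketch below applies it to the machine you are logged into, assuming a Linux host with nproc and /proc/loadavg. free_slots is a hypothetical helper name, and cup.pl remains the way to compare machines across the cluster.

```shell
# free_slots is a hypothetical helper name used only for this illustration.
# Usage: free_slots CORES LOAD
# Prints how many more single-core jobs fit before the load exceeds the cores.
free_slots() {
    awk -v cores="$1" -v load="$2" 'BEGIN {
        busy = int(load + 0.5)      # round the load to a whole number of jobs
        spare = cores - busy
        if (spare < 0) spare = 0    # an overloaded machine has no free slots
        print spare
    }'
}

# Example for the current machine: core count and 1-minute load average.
free_slots "$(nproc)" "$(cut -d " " -f 1 /proc/loadavg)"
```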

Check the amount of RAM on the machines (also shown by cup.pl). If your job is RAM intensive, then RAM may be a more important metric than the load and you need to pick a machine accordingly. Most machines have a swap drive that is equal in size to the amount of RAM shown by cup.pl. So if cup.pl shows a machine with 16 GB of RAM, then it's likely the machine has another 16 GB of swap space (you can view the amount of swap using 'top'). However, if a program exceeds the built-in RAM and uses swap space then it's going to run much, much slower, unless the program was specifically designed to use swap space. If the program tries to use more memory than RAM + swap space, the program will be killed by the kernel. There are 5 high RAM machines, kimclust50-54. Jobs requiring >16 GB of RAM should use one of them. Check with Jamie before using 52-54, though, to make sure you won't be getting in the way of RNA Seq pipeline analysis.
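
The memory guidance above can be read as a three-way check: a job that fits in RAM is fine, one that spills into swap will be slow, and one that exceeds RAM plus swap will be killed. The sketch below encodes that rule. memory_verdict is a hypothetical helper name, and the figures in the example are the 16 GB illustration from the text, not measurements of any particular machine.

```shell
# memory_verdict is a hypothetical helper name used only for this illustration.
# Usage: memory_verdict JOB_GB RAM_GB SWAP_GB
memory_verdict() {
    awk -v job="$1" -v ram="$2" -v swap="$3" 'BEGIN {
        if (job <= ram)             print "fits in RAM"
        else if (job <= ram + swap) print "will use swap (slow)"
        else                        print "exceeds RAM + swap (will be killed)"
    }'
}

# Example: a 24 GB job on a machine with 16 GB of RAM and 16 GB of swap.
memory_verdict 24 16 16

# RAM and swap totals of the current machine in GB (Linux only).
if [ -r /proc/meminfo ]; then
    awk '/^(MemTotal|SwapTotal):/ { printf "%s %.1f GB\n", $1, $2 / 1024 / 1024 }' /proc/meminfo
fi
```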

Lastly, if your program repeatedly reads the same files, or saves a lot of temporary data to disk that it will read back in and process, consider using the local drive on the machine (/tmp, /scratch, or /local5x). Most machines have large, local hard drives. Use these for temporary files that you can delete once your program finishes running. Doing so will allow your program to run faster and reduce the load on the rest of the network. You can view the amount of local disk space using cup.pl and access the disk space by creating a directory in /tmp or asking Stephen to create a directory for you on /scratch or /local5x. Note that you cannot access these files from any other computer. Also, all files in /tmp are automatically deleted every time the computer reboots, so plan accordingly (files in /scratch and /local5x are not).
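
One way to follow the advice on temporary files is to create a private working directory under /tmp and have it removed automatically when the job ends. This is a minimal sketch using the standard mktemp command. The prefix myjob, the variable name WORKDIR, and the file names are placeholders, and the sort step merely stands in for a real job.

```shell
# Create a private working directory on the local disk.
# "myjob" is a placeholder prefix; mktemp replaces the Xs with random characters.
WORKDIR=$(mktemp -d /tmp/myjob.XXXXXX) || exit 1

# Remove the directory when the script exits, even if it fails part-way.
trap 'rm -rf "$WORKDIR"' EXIT

# Stand-in for a real job: write temporary data locally, then read it back.
printf '%s\n' 3 1 2 > "$WORKDIR/input.txt"
sort -n "$WORKDIR/input.txt" > "$WORKDIR/sorted.txt"
cat "$WORKDIR/sorted.txt"
```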
