Managing the Impact of Interactive Use, Part 2: Interactive Workloads via Bright

By Ian Lumb | October 16, 2013

Because the impact of unmanaged interactive sessions can be significant,[1] the concept of login nodes in Bright Cluster Manager was introduced in Part 1 of this series.[2] Although login nodes address many considerations relating to interactive use, they are designed to do so in a limited way. For example, in Part 1, the following consideration was outlined (emphasis added here):

Read More >

How to manage Slurm jobs using the Bright Cluster Management Shell (CMSH)

By Robert Stober | March 20, 2013

This article shows how you can easily manage Slurm jobs using the Bright Cluster Management Shell (CMSH). In job mode, the CMSH allows you to perform the same job management operations as the CMGUI through a convenient shell interface. For an example of managing jobs using the Bright CMGUI, check out my previous article on this topic.
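
For example, a cmsh session for inspecting and controlling jobs might look roughly like the sketch below. The head node name and job ID are illustrative, and the job-mode sub-commands are assumptions based on cmsh's mode-based interface rather than a transcript from the article:

$ cmsh
[atom-head1]% jobs                # enter job mode
[atom-head1->jobs]% list          # list workload manager jobs (assumed sub-command)
[atom-head1->jobs]% show 35       # show details for one job (assumed sub-command)
[atom-head1->jobs]% hold 35       # hold the job (assumed sub-command)
[atom-head1->jobs]% remove 35     # cancel the job (assumed sub-command)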

Read More >

How to manage SGE (and other workload manager) jobs using the Bright CMGUI

By Robert Stober | March 11, 2013

The Bright Cluster Manager CMGUI makes tasks intuitively easy. This article shows how you can view and control workload manager jobs using the Bright CMGUI. I am using an OGS (SGE) job to provide examples, but Bright works the same way with all of its supported workload managers: PBS Professional, Slurm, Univa Grid Engine, LSF, openlava, TORQUE/Moab, and TORQUE/Maui.

Read More >

Slurm 101: Basic Slurm Usage for Linux Clusters

By Robert Stober | March 07, 2013

This article describes basic Slurm usage for Linux clusters. Brief "how-to" topics include, in this order (a condensed command sketch follows the list):

  • A simple Slurm job script
  • Submit the job
  • List jobs
  • Get job details
  • Suspend a job (root only)
  • Resume a job (root only)
  • Kill a job
  • Hold a job
  • Release a job
  • List partitions
  • Submit a job that's dependent on a prerequisite job being completed
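
As a condensed reference (this sketch is mine rather than an excerpt from the article; job ID 123 and the script name job.sh are placeholders), the commands behind those topics look roughly like this:

$ sbatch job.sh                           # submit the job script
$ squeue                                  # list jobs
$ scontrol show job 123                   # get job details
$ scontrol suspend 123                    # suspend a job (root only)
$ scontrol resume 123                     # resume a job (root only)
$ scancel 123                             # kill a job
$ scontrol hold 123                       # hold a pending job
$ scontrol release 123                    # release a held job
$ sinfo                                   # list partitions
$ sbatch --dependency=afterok:123 job.sh  # run only after job 123 completes successfully
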
OK. Let's get started.
Read More >

How to Submit a Simple Slurm GPU job to your Linux cluster

By Robert Stober | February 18, 2013

This article shows you how to submit a simple Slurm GPU job to your local cluster using Bright Cluster Manager. Dead easy.
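
A minimal GPU job script of the kind described might look like the sketch below; the script name, the program it runs, and the single-GPU request are placeholders rather than the article's own example, and it assumes the GPUs are exposed to Slurm as a generic resource (GRES):

$ cat gpu-job.sh
#!/bin/sh
#SBATCH -o gpu-job.out
#SBATCH --nodes=1
#SBATCH --gres=gpu:1      # request one GPU on the node
./gpu_app                 # placeholder for the GPU executable

$ sbatch gpu-job.sh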

Read More >

How to submit an interactive job using Slurm

By Robert Stober | November 20, 2012

Here is an easy way to submit an interactive job to Slurm, using srun. You can even submit a job to a cloud node this way.
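
A typical invocation (my sketch, not necessarily the article's exact command) looks like this:

$ srun --nodes=1 --ntasks=1 --pty /bin/bash   # allocate one task and attach an interactive shell to it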

Read More >

How to submit SLURM jobs to cloud nodes using cmsub with Bright

By Robert Stober | September 28, 2012

Stresscpu is a simple, CPU-intensive test application that you can use to submit test jobs to your cluster. The distribution includes everything needed to run the job, including the submission script and the executable. By default, it submits two test jobs to the SLURM partition specified on the command line. This article describes how to use it to submit jobs to run on cloud-based nodes.
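
In outline, the workflow looks something like the sketch below; the script name is a placeholder and the cmsub usage shown is an assumption, so check the article (and cmsub's own help output) for the real options:

$ cmsub ./stresscpu.sh    # submit the job so Bright can stage it to cloud nodes (assumed basic usage)
$ squeue                  # watch the job once it starts on a cloud node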

Read More >

How to easily switch from one workload manager to another with Bright

By Robert Stober | August 06, 2012

When you install Bright Cluster Manager on your system, Bright automatically installs the workload manager of your choice, from among the following:

Read More >

How to run an OpenMPI job in Bright Cluster Manager through Slurm

By Robert Stober | July 23, 2012

Here's a simple example of how to run an Open MPI program in Bright Cluster Manager through the Slurm workload manager.

Write a job script.


This script requests that the job be run on three nodes with two tasks per node. STDOUT and STDERR will be written to the file "hello.out". Note that Open MPI is tightly integrated with Slurm: it automatically obtains both the list of hosts and the number of processes to start on each host from Slurm, so we don't need to specify the --hostfile, --host, or -np options to mpirun.

$ cat hello.sh

#!/bin/sh
#SBATCH -o hello.out
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=2
mpirun hello
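
The hello program that mpirun launches isn't listed in the post. Assuming it is a standard MPI "hello world" (the source below is a reconstruction from the job output further down, not the author's code), it could be created and compiled like this:

$ cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    MPI_Get_processor_name(name, &len);     /* node this process runs on */
    printf("Hello world from process %03d out of %03d, processor name %s\n",
           rank, size, name);
    MPI_Finalize();
    return 0;
}
EOF
$ mpicc -o hello hello.c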

Submit the job

$ sbatch hello.sh
Submitted batch job 35

Check job status

$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
35 defq hello.sh rstober R 0:04 3 atom[01-03]

Get detailed job status

$ scontrol show job 35
JobId=35 Name=hello.sh
UserId=rstober(1001) GroupId=rstober(1001)
Priority=4294901758 Account=(null) QOS=normal
JobState=COMPLETED Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
RunTime=00:00:02 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2012-07-17T07:41:39 EligibleTime=2012-07-17T07:41:39
StartTime=2012-07-17T07:41:39 EndTime=2012-07-17T07:41:41
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=defq AllocNode:Sid=atom-head1:24006
ReqNodeList=(null) ExcNodeList=(null)
NodeList=atom[01-03]
BatchHost=atom01
NumNodes=3 NumCPUs=10 CPUs/Task=1 ReqS:C:T=*:*:*
MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=/home/rstober/slurmhello.sh
WorkDir=/home/rstober

The job has completed

$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)

Here's the job output

$ more hello.out
Hello world from process 001 out of 006, processor name atom01
Hello world from process 002 out of 006, processor name atom02
Hello world from process 003 out of 006, processor name atom02
Hello world from process 000 out of 006, processor name atom01
Hello world from process 004 out of 006, processor name atom03
Hello world from process 005 out of 006, processor name atom03

Read More >

How to quickly configure the number of Torque job slots per server

By Robert Stober | July 17, 2012

This article is about how to configure the number of Torque job slots per server, but before I begin, I would like to mention the alternatives you have when you use Bright.
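
For contrast with the Bright approach the article goes on to describe, plain Torque keeps the per-node slot count in the np attribute of its nodes file; the node names and counts below are placeholders:

# server_priv/nodes under the Torque server's spool directory (restart pbs_server after editing)
node001 np=8
node002 np=8
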
Read More >