Use Intel’s 20-Step Process to Choose a Management Solution for Your Cluster

By Ian Lumb | February 06, 2014

20 criteria to identify your solution for cluster management? Absolutely!

How to run the Intel MPI Benchmark (IMB) on a Bright cluster

By Robert Stober | November 29, 2012

This article shows how to run the Intel MPI Benchmark (IMB) on a Bright cluster. In this example, we run the benchmark over low-latency 10GbE interfaces.

How to run an OpenMPI job in Bright Cluster Manager through Slurm

By Robert Stober | July 23, 2012

Here's a simple example of how to run an Open MPI program in Bright Cluster Manager through the Slurm workload manager.

Write a job script

The script below requests three nodes with two tasks per node, for six MPI processes in total. STDOUT and STDERR are written to the file "hello.out". Open MPI is tightly integrated with Slurm: it automatically obtains both the list of hosts and the number of processes to start on each host from Slurm, so we don't need to pass the --hostfile, --host, or -np options to mpirun.

$ cat hello.sh

#!/bin/sh
#SBATCH -o hello.out
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=2
mpirun hello
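
Because mpirun takes its process count from Slurm, it can help to confirm what the directives add up to before submitting. The following is a minimal sketch, not part of the original post: the function name `total_tasks` is hypothetical, and it assumes the long `--option=value` form with plain integer values, as used in hello.sh above.

```shell
#!/bin/sh
# Hypothetical helper: multiply --nodes by --ntasks-per-node as found in
# the #SBATCH lines of a job script. Assumes the long "--opt=value" form
# with plain integer values, as in hello.sh.
total_tasks() {
    awk -F= '
        /^#SBATCH[ \t]+--nodes=/           { nodes = $2 }
        /^#SBATCH[ \t]+--ntasks-per-node=/ { per_node = $2 }
        END { print nodes * per_node }
    ' "$1"
}

# Recreate the job script from this article in a scratch file and check it.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
#SBATCH -o hello.out
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=2
mpirun hello
EOF

total_tasks "$script"   # prints 6
rm -f "$script"
```

The result, 6, matches the "out of 006" count reported by each process in the job output.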

Submit the job

$ sbatch hello.sh
Submitted batch job 35

Check job status

$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
35 defq hello.sh rstober R 0:04 3 atom[01-03]

Get detailed job status

$ scontrol show job 35
JobId=35 Name=hello.sh
UserId=rstober(1001) GroupId=rstober(1001)
Priority=4294901758 Account=(null) QOS=normal
JobState=COMPLETED Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
RunTime=00:00:02 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2012-07-17T07:41:39 EligibleTime=2012-07-17T07:41:39
StartTime=2012-07-17T07:41:39 EndTime=2012-07-17T07:41:41
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=defq AllocNode:Sid=atom-head1:24006
ReqNodeList=(null) ExcNodeList=(null)
NodeList=atom[01-03]
BatchHost=atom01
NumNodes=3 NumCPUs=10 CPUs/Task=1 ReqS:C:T=*:*:*
MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=/home/rstober/slurmhello.sh
WorkDir=/home/rstober

The job has completed

$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
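
An empty squeue listing, as above, is the sign that the job has left the queue. When scripting around a job, the same check can be repeated until the listing is empty. This is a rough sketch under stated assumptions: the function name `wait_for_job` and the default polling interval are made up for illustration, and it relies only on squeue's `-h` (no header) and `-j` (job ID) options.

```shell
#!/bin/sh
# Hypothetical helper: block until the given job no longer appears in squeue.
# Arguments: job ID, optional polling interval in seconds (default 5).
# squeue -h suppresses the header line, so empty output means the job is
# no longer queued or running. Errors (for example, a job ID that Slurm has
# already forgotten) are discarded and treated as "not in the queue".
wait_for_job() {
    jobid=$1
    interval=${2:-5}
    while [ -n "$(squeue -h -j "$jobid" 2>/dev/null)" ]; do
        sleep "$interval"
    done
}

# Example usage on a cluster, with the job ID from this article:
#   wait_for_job 35
#   more hello.out
```

Note that an empty listing says nothing about whether the job succeeded; `scontrol show job` reports the final JobState and ExitCode.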

Here's the job output

$ more hello.out
Hello world from process 001 out of 006, processor name atom01
Hello world from process 002 out of 006, processor name atom02
Hello world from process 003 out of 006, processor name atom02
Hello world from process 000 out of 006, processor name atom01
Hello world from process 004 out of 006, processor name atom03
Hello world from process 005 out of 006, processor name atom03
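
The lines are not in rank order because each process writes its own line independently. To confirm that Slurm placed two processes on each node, the output can be tallied by host. A small sketch, assuming the line format shown above (host name as the last field); the function name `ranks_per_host` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count how many "Hello world" lines each host produced.
# Assumes the output format shown above, where the host name is the last field.
ranks_per_host() {
    awk '/^Hello world from process/ { count[$NF]++ }
         END { for (h in count) print h, count[h] }' "$1" | sort
}

# Reproduce the job output from this article in a scratch file.
out=$(mktemp)
cat > "$out" <<'EOF'
Hello world from process 001 out of 006, processor name atom01
Hello world from process 002 out of 006, processor name atom02
Hello world from process 003 out of 006, processor name atom02
Hello world from process 000 out of 006, processor name atom01
Hello world from process 004 out of 006, processor name atom03
Hello world from process 005 out of 006, processor name atom03
EOF

ranks_per_host "$out"
# atom01 2
# atom02 2
# atom03 2
rm -f "$out"
```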
