Bright Cluster Manager - Workload Managers & Queuing Systems

Workload Management

Bright Cluster Manager® offers a wide choice of workload managers, also known as queuing (or queueing) systems. Most leading workload managers are integrated into Bright Cluster Manager, and many are included on the Bright Cluster Manager DVD, either completely free of charge or with a free, temporary evaluation license.

Benefits of Workload Manager Integration

Bright Cluster Manager is integrated with most leading workload managers. The integration exists on multiple levels, providing many powerful benefits to system administrators and users:

  1. Automatic installation — During the installation of Bright Cluster Manager, you can select from a list of all available workload managers. The selected workload manager will then automatically be installed in the right locations on the head node and regular node images.
  2. Automatic configuration — During the lifetime of the cluster, from installation to expansion to day-to-day management, Bright Cluster Manager inserts and updates relevant workload manager configurations.
  3. Manageable from CMGUI and CMSH — Through CMGUI and CMSH it is possible to view and manipulate jobs, and to configure the workload manager without having to learn its specific configuration commands or files. The following actions are available in CMGUI and CMSH for queues: add, remove, edit. The following actions are available in CMGUI and CMSH for jobs: show, remove, hold, release, suspend and resume.
  4. Viewable from the User Portal — In the User Portal, a user can see the status of the workload manager and a summary of relevant workload management statistics. Users can also see their own jobs in the available queues.
  5. Manageable from the SOAP API — All actions and data available through CMGUI and CMSH — including workload manager related actions — are also available through the Bright Cluster Manager SOAP API and its C++, Python and PHP bindings.
  6. Failover managed by Bright — Bright Cluster Manager's built-in, native failover capability also manages the seamless failover of the workload manager.
  7. Workload manager statistics — Many workload manager metrics are sampled and analyzed by Bright Cluster Manager over the lifetime of the cluster. Examples include the number of completed, failed, queued and running jobs; estimated delay; and average job duration.
  8. Health checking — One of Bright's most powerful types of health check is the pre-job health check. This type checks the health of nodes just before a job is submitted to them. This ensures that the job does not crash due to node health problems (also called Black Hole Node Syndrome). This kind of health check is only possible when the cluster manager and the workload manager work closely together.
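
The idea behind the pre-job health check in item 8 can be pictured with a minimal Python sketch. It is illustrative only and does not use the Bright Cluster Manager API or the interface of any workload manager; the node fields, the two check functions and the 1 GB threshold are all hypothetical.

```python
# Illustrative only: node attributes, checks and thresholds are hypothetical.

def disk_ok(node):
    # Assumed rule: at least 1 GB of free scratch space.
    return node["free_scratch_gb"] >= 1

def filesystem_mounted(node):
    # Assumed rule: the shared file system must be mounted.
    return node["shared_fs_mounted"]

def select_healthy_nodes(candidates, checks):
    """Split candidate nodes into those passing every check and those failing any."""
    healthy, unhealthy = [], []
    for node in candidates:
        if all(check(node) for check in checks):
            healthy.append(node["name"])
        else:
            unhealthy.append(node["name"])
    return healthy, unhealthy

nodes = [
    {"name": "node001", "free_scratch_gb": 40, "shared_fs_mounted": True},
    {"name": "node002", "free_scratch_gb": 0, "shared_fs_mounted": True},
    {"name": "node003", "free_scratch_gb": 25, "shared_fs_mounted": False},
]
healthy, unhealthy = select_healthy_nodes(nodes, [disk_ok, filesystem_mounted])
print("run job on:", healthy)
print("hold back:", unhealthy)
```

In this sketch the job would start only on node001, while node002 and node003 would be held back until their problems are resolved, so one bad node cannot repeatedly accept and crash jobs.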

Integrated Workload Managers

The following workload managers are integrated in Bright Cluster Manager:

  1. Grid Engine — Grid Engine is a powerful workload manager which includes both queuing and scheduling functionality. Both Open Grid Scheduler and Univa Grid Engine are integrated in Bright Cluster Manager.
  2. LSF — LSF (Load Sharing Facility) is a commercial, proprietary workload manager from Platform Computing.
  3. Maui Cluster Scheduler — Maui is a powerful open source job scheduler which can provide scheduling intelligence to TORQUE.
  4. Moab Workload Manager — Moab is a powerful commercial job scheduler from Adaptive Computing which can provide scheduling intelligence to TORQUE and other workload managers.
  5. openlava — openlava is an open source fork of Platform Lava, which is based on Platform LSF™ version 4.2.
  6. PBS Professional — Altair’s PBS Professional (Portable Batch System Professional) is a powerful commercial workload manager which includes both queuing and scheduling functionality. It is used in thousands of installations and supported by Altair in 20 countries worldwide.
  7. SLURM — SLURM (Simple Linux Utility For Resource Management) is an open source resource manager with a plug-in architecture, used in many large installations at the US National Labs. It includes both queuing and scheduling functionality.
  8. TORQUE Resource Manager — TORQUE (Terascale Open-Source Resource and QUEue Manager) is an open source, distributed resource manager originally based on OpenPBS. It has limited scheduling intelligence built in, which is why it is usually used in combination with the Maui or Moab schedulers.

Alternatively, you can install and configure other workload managers on a cluster running Bright Cluster Manager, but they will not have the integration benefits listed above.

Workload Management Features

All integrated workload managers offer at least the following features:

  1. Fairness policies — define what a cluster owner considers as a fair use of available resources.
  2. Advance reservation — guarantees the availability of a set of resources at a particular time.
  3. Job priority policies and configurations — determine the order in which jobs should be run to achieve a pre-defined fair-share policy.
  4. Quality of Service (QoS) support — allows jobs, users or groups to receive special treatment based on privileges and fairness policies.
  5. Multi-attribute fairshare — allows historical resource utilization information to be incorporated into job feasibility and priority decisions.
  6. Configurable node allocation policies — allow a site to specify how available resources should be allocated to each job.
  7. Multiple configurable backfill policies — allow a scheduler to make better use of available resources by running jobs out of order.
  8. System diagnostic support — provides commands for diagnosing system behavior.
  9. Allocation manager support and interface — manages resource allocations where a resource allocation grants a job the right to use a particular amount of resources (also known as allocation bank or CPU bank).
  10. Resource utilization tracking and statistics — provides extensive accounting facilities which allow resource usage to be tracked by resource (e.g., compute nodes), job, user, and other objects.
  11. Non-intrusive 'Test' modes — conduct scheduling cycles as they would run in normal or production mode, but without actually starting or modifying jobs.
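
To make the backfill idea in item 7 concrete, here is a small Python sketch of a simplified backfill rule. It is not the algorithm of any of the workload managers listed above: the `Job` class, the `backfill` function and the rule itself (a lower-priority job may start early only if it fits in the free nodes and finishes before the blocked top-priority job could start) are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    nodes: int    # number of nodes requested
    runtime: int  # requested wall time, in minutes

def backfill(free_nodes, running, queue):
    """Return the names of queued jobs that may start now.

    running: list of (nodes, minutes_until_finish) for jobs already running
    queue:   list of Job in priority order, highest priority first
    """
    started = []
    running = list(running)
    queue = list(queue)
    # Start jobs strictly in priority order for as long as they fit.
    while queue and queue[0].nodes <= free_nodes:
        job = queue.pop(0)
        free_nodes -= job.nodes
        running.append((job.nodes, job.runtime))
        started.append(job.name)
    if not queue:
        return started
    # The head job is blocked: find when enough nodes will be free for it.
    head = queue[0]
    available = free_nodes
    head_start = None
    for nodes, finish in sorted(running, key=lambda r: r[1]):
        available += nodes
        if available >= head.nodes:
            head_start = finish
            break
    if head_start is None:
        return started  # head job cannot be placed; skip backfilling in this sketch
    # Backfill: run lower-priority jobs that fit now and end before head_start.
    for job in queue[1:]:
        if job.nodes <= free_nodes and job.runtime <= head_start:
            free_nodes -= job.nodes
            started.append(job.name)
    return started

queue = [Job("big", 8, 120), Job("small", 2, 30), Job("long", 2, 90)]
print(backfill(free_nodes=2, running=[(6, 60)], queue=queue))
```

With two nodes free and a six-node job finishing in 60 minutes, the eight-node job "big" has to wait. "small" runs out of order because it ends before "big" could start, while "long" is held back because it would delay "big".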
 
 
© 2009–2014 Bright Computing, Inc. All rights reserved.