
Bright Cluster Manager - Automating Tasks with Thresholds & Actions on Linux Clusters


Cluster Management Automation

Cluster Management Automation lets cluster administrators set a threshold on any metric and define an action to be taken when that threshold is crossed. Any built-in or custom metric supported by Bright Cluster Manager® can be used, and any cluster management shell command, Linux command, or script can serve as the action.

Examples of Actions

Some examples of "actions" that can be configured with Bright Cluster Manager® include:

Examples of Rules

[Screenshot: the Bright Cluster Manager configuration wizard, which guides you through the steps of defining a rule.]

Some examples of "rules" that can be configured with Bright Cluster Manager include:

  • If the amount of free space in /home drops below 9.3 gigabytes, send an email to administrator@localhost.
  • If the number of running jobs exceeds 120, log an event in the GUI event viewer.
  • If the temperature of any node in the node category "Large SMP Nodes" exceeds 60 degrees Celsius, send an SMS text message to mobile phone number +1 123 123 1234 and shut down the offending node.
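Each rule above pairs a metric with a threshold and an action. The sketch below models that structure in Python to make the idea concrete. It is not Bright Cluster Manager's API: the `Rule` class, the metric names (`home_free_gb`, `running_jobs`, `node_temp_c`), and the `notify` helper are all hypothetical, and the actions only return a description string instead of sending email or SMS.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    metric: str                               # hypothetical metric name
    compare: Callable[[float, float], bool]   # e.g. operator.lt or operator.gt
    threshold: float
    action: Callable[[str, float], str]       # called when the rule fires


def notify(channel: str) -> Callable[[str, float], str]:
    """Build a stand-in action that only describes what would be done."""
    def act(metric: str, value: float) -> str:
        return f"{channel}: {metric}={value}"
    return act


# The three example rules from the list above, with assumed metric names.
RULES = [
    Rule("home_free_gb", operator.lt, 9.3, notify("email administrator@localhost")),
    Rule("running_jobs", operator.gt, 120, notify("log event")),
    Rule("node_temp_c", operator.gt, 60, notify("sms and shut down node")),
]


def evaluate(rules: list, sample: dict) -> list:
    """Run the action of every rule whose threshold is crossed by the sample."""
    fired = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is not None and rule.compare(value, rule.threshold):
            fired.append(rule.action(rule.metric, value))
    return fired


if __name__ == "__main__":
    sample = {"home_free_gb": 8.0, "running_jobs": 100, "node_temp_c": 61.5}
    for line in evaluate(RULES, sample):
        print(line)
```

Note that the first rule fires when the value falls below its threshold while the other two fire when the value rises above it, so the comparison is stored per rule.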

Automating these responses saves administrator time. For example, you can monitor the health of your cluster and act when hardware shows signs of imminent failure, or monitor cluster usage and act before the cluster runs out of resources.

A configuration wizard is available to guide you through the steps of defining a rule, which includes selecting a metric, defining a threshold and defining an action.

State Flapping

The Cluster Management Automation system is highly configurable. One example is its handling of so-called "state flapping", a situation in which a threshold is crossed repeatedly within a short time frame. This can happen, for example, when a CPU temperature fluctuates around a configured threshold, which could cause the system to send out many emails in quick succession. The system can detect this situation, and you can configure how it responds.
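One common way to detect flapping is to count how often a value crosses the threshold within a sliding time window and to suppress further actions once the count is too high. The sketch below illustrates that general technique only. It makes no claim about how Bright Cluster Manager implements flap detection, and the `FlapDetector` class and its parameters (`window_s`, `max_crossings`) are hypothetical.

```python
from collections import deque


class FlapDetector:
    """Suppress actions when a threshold is crossed too often in a time window."""

    def __init__(self, threshold: float, window_s: float = 300.0, max_crossings: int = 3):
        self.threshold = threshold
        self.window_s = window_s            # length of the sliding window, in seconds
        self.max_crossings = max_crossings  # crossings allowed inside the window
        self._above = None                  # last known state; None before first sample
        self._crossings = deque()           # timestamps of recent upward crossings

    def observe(self, t: float, value: float):
        """Return "fire", "suppressed", or None when no new crossing occurred."""
        above = value > self.threshold
        crossed_up = above and self._above is not True
        self._above = above
        if not crossed_up:
            return None
        self._crossings.append(t)
        # Forget crossings that have fallen out of the sliding window.
        while self._crossings and t - self._crossings[0] > self.window_s:
            self._crossings.popleft()
        if len(self._crossings) > self.max_crossings:
            return "suppressed"
        return "fire"


if __name__ == "__main__":
    detector = FlapDetector(threshold=60.0, window_s=300.0, max_crossings=2)
    samples = [(0, 61), (10, 59), (20, 61), (30, 59), (40, 62), (1000, 59), (1010, 61)]
    for t, temp in samples:
        print(t, temp, detector.observe(t, temp))
```

In this sketch a temperature hovering around 60 degrees triggers the action for the first two crossings, the third crossing inside the window is suppressed, and actions resume once the earlier crossings have aged out.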

 
 
© 2009–2013 Bright Computing, Inc. All rights reserved.