
Integrated Metrics

Bright’s management interface keeps an eye on the OpenStack software, the servers, the network, and the operating system, so you can rest assured your OpenStack environment is running as it should. From the physical to the virtual layer, we deliver monitoring capability through a unified, single-pane-of-glass management console. We monitor the entire stack, including:

  • Physical hardware
  • Physical operating systems
  • System-level services running on physical hardware (e.g. databases, load balancers)
  • Hypervisors
  • Virtualized hardware
  • Virtualized operating systems
  • Virtualized system-level services

A wide range of monitoring and “healthcheck” metrics is delivered via the Bright management console.  In addition to the broad range of cluster-level and device-level metrics managed by Bright’s core platform, Bright OpenStack also provides:

  • Hypervisor metrics (e.g. virtual network interfaces, virtual block devices, vCPU utilization)
  • Tenant / project metrics (e.g. number of VMs running, number of floating IPs in use)
  • Virtual machine metrics


Custom Metrics

In addition to the default metrics, you can easily add your own metrics by writing a custom metric collector script. This is a simple script that samples a value and presents it to Bright OpenStack in a consistent format. Examples of custom metrics include values read from an application or from a device such as a UPS, storage unit, firewall, tape robot, SAN switch, or KVM switch.
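As an illustration, a collector can be as small as a script that prints a single numeric sample to standard output. The sketch below reads a CPU temperature from the standard Linux sysfs interface; a real collector for a UPS or SAN switch would query the device instead. The one-number-on-stdout convention is an assumption made for this example; the Bright administrator manual documents the exact collector contract.

    #!/usr/bin/env python3
    # Minimal sketch of a custom metric collector: sample one value and
    # print it to stdout (an assumed convention for this example).
    # Here we read the first thermal zone's temperature via sysfs; a real
    # collector would query a UPS, SAN switch, or other device instead.

    def sample_temperature_celsius() -> float:
        # sysfs reports the temperature in millidegrees Celsius.
        with open("/sys/class/thermal/thermal_zone0/temp") as f:
            return int(f.read().strip()) / 1000.0

    if __name__ == "__main__":
        print(sample_temperature_celsius())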


Visualization with Graphs


All available metrics can be visualized using graphs. In the monitoring visualization window, multiple graphs can be shown simultaneously. A new graph is created by simply dragging a metric from the metrics tree into an empty graph area. Metrics can also be dragged into existing graph areas to allow for visual comparison between multiple metrics.

You can easily zoom in and out of graphs by dragging your mouse over an area of the graph. The monitoring system will then retrieve the required data automatically to rebuild the graph at a smaller or larger scale. Many features of the graphs can be customized. For example, graph line color and style, graph filling color and style, and graph transparency can all be configured.

All configurations of the monitoring visualization window can be saved for future use. So if you have built up an 8 x 6 matrix of 48 different graphs — each with its own customized color scheme — you can save this configuration and load it quickly later.


Configuration of the Monitoring System

The Bright monitoring system is fully configurable to match your needs and preferences. Some examples of configurable settings include:

  1. Which default and custom metrics to monitor. For example, you can stop certain metrics from being sampled altogether, or only stop them from being stored. The latter saves storage while still letting you visualize current values, and you can still define thresholds on metrics that are not stored.
  2. How often to sample each metric. For example, you may want to sample CPU temperature values every minute, but fan speed values only every 10 minutes.
  3. How long to keep metrics data. For example, you may not be interested in disk performance metrics older than 3 months, but you may be interested in cluster load values over the lifetime of the cluster.
  4. How to consolidate each metric over time. For example, you may wish to keep used swap space values of nodes in the node category "large memory nodes" for the lifetime of the system, whereby values from the last 30 days are not consolidated, values older than 30 days may be averaged per hour, and values older than 90 days may be averaged per day. A sketch of this tiered consolidation follows this list.
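To make the tiered scheme concrete, the sketch below shows one way such consolidation could be computed: raw samples younger than 30 days pass through unchanged, samples between 30 and 90 days old are averaged per hour, and older samples are averaged per day. The function and data layout are invented for illustration; Bright's actual consolidation runs inside its management daemon.

    from collections import defaultdict
    from datetime import datetime, timedelta

    # Hypothetical illustration of tiered consolidation. Samples are
    # (timestamp, value) pairs; each sample is assigned to a bucket whose
    # width depends on the sample's age, and each bucket is averaged.

    def consolidate(samples, now=None):
        now = now or datetime.utcnow()
        buckets = defaultdict(list)
        for ts, value in samples:
            age = now - ts
            if age <= timedelta(days=30):
                key = ts  # keep raw resolution
            elif age <= timedelta(days=90):
                key = ts.replace(minute=0, second=0, microsecond=0)  # hourly
            else:
                key = ts.replace(hour=0, minute=0, second=0, microsecond=0)  # daily
            buckets[key].append(value)
        # Raw samples form single-element buckets, so averaging leaves
        # them unchanged.
        return sorted((k, sum(v) / len(v)) for k, v in buckets.items())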


Monitoring Architecture

All monitoring data is either sampled locally by the cluster management daemon (CMDaemon) on each regular and head node, or it is sampled directly from the BMC through the IPMI or iLO interface. In both cases, sampling is optimized for minimal resource consumption. For example, the CMDaemon samples all metrics in one process, without forking additional processes, whereas sampling through the IPMI or iLO interface happens out-of-band.

The CMDaemon on the head node periodically collects the data from the CMDaemons on the other nodes and stores it as raw data in the raw database hosted on the head node. The data is subsequently consolidated into the consolidated database, which is also hosted on the head node.

When the cluster management GUI generates a graph or a Rack View, it requests the required data from the CMDaemon on the head node, which reads it from the consolidated database.
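The flow just described, with node daemons feeding a raw store on the head node, a consolidation pass, and a front end reading the consolidated result, can be pictured with a small, purely illustrative model. The table names, schema, and the stand-in fetch_samples() function below are invented for this sketch and are not CMDaemon's internal interfaces.

    import random
    import sqlite3
    import time

    def fetch_samples(node):
        # Stand-in for collecting samples from the CMDaemon on a node.
        return [("cpu_load", time.time(), random.random())]

    def collect_and_consolidate(nodes):
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE raw (node TEXT, metric TEXT, ts REAL, value REAL)")
        db.execute("CREATE TABLE consolidated (node TEXT, metric TEXT, hour INTEGER, avg REAL)")

        # Stage 1: the head node pulls raw samples from each node's daemon
        # and appends them to the raw store.
        for node in nodes:
            db.executemany("INSERT INTO raw VALUES (?, ?, ?, ?)",
                           [(node, m, ts, v) for m, ts, v in fetch_samples(node)])

        # Stage 2: roll the raw samples up into hourly averages in the
        # consolidated store.
        db.execute("""INSERT INTO consolidated
                      SELECT node, metric, CAST(ts / 3600 AS INTEGER), AVG(value)
                      FROM raw
                      GROUP BY node, metric, CAST(ts / 3600 AS INTEGER)""")
        db.commit()
        return db

    if __name__ == "__main__":
        db = collect_and_consolidate(["node001", "node002"])
        print(db.execute("SELECT * FROM consolidated").fetchall())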