Grafana Telemetry Dashboard

The ICE ClusterWare™ software enables data collection and monitoring via Telegraf, InfluxDB, and Grafana by default. The data collected includes standard metrics such as CPU, memory, and network usage. The ClusterWare platform provides initial Grafana dashboards for this data, including cluster monitoring, node monitoring, GPU monitoring, syslog data, and auditing data.

You can configure the Telegraf Plugins to enable additional data collection options. You can also enable syslog data collection on compute nodes via rsyslog.

Caution

Collecting additional metrics may require additional disk space on your head nodes or in your cluster storage. Be sure to plan ahead for disk consumption before enabling additional metrics. See Required and Recommended Components for details.
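
As a rough aid for the planning step above, the following sketch estimates how much disk space a set of metrics might consume over a retention period. It is a back-of-the-envelope calculation only, not a ClusterWare tool: the function name `estimate_storage_bytes`, the parameter names, and every number in the usage example (node count, series per node, collection interval, retention, bytes per stored point) are hypothetical placeholders. Substitute values measured on your own cluster.

```python
def estimate_storage_bytes(nodes, series_per_node, interval_s,
                           retention_days, bytes_per_point):
    """Estimate disk usage for time-series metrics.

    All inputs are assumptions supplied by the caller:
      nodes            -- number of nodes reporting metrics
      series_per_node  -- distinct metric series collected per node
      interval_s       -- seconds between samples
      retention_days   -- how long samples are kept
      bytes_per_point  -- average on-disk size of one stored sample
    """
    seconds_retained = retention_days * 86400
    # Whole samples stored per series over the retention window.
    points_per_series = seconds_retained // interval_s
    return nodes * series_per_node * points_per_series * bytes_per_point


# Hypothetical example values; replace with figures from your own cluster.
baseline = estimate_storage_bytes(100, 50, 10, 30, 3)
with_extra = estimate_storage_bytes(100, 80, 10, 30, 3)
print(f"baseline:   {baseline / 2**30:.2f} GiB")
print(f"with extra: {with_extra / 2**30:.2f} GiB")
```

Comparing the estimate before and after adding series gives a first approximation of the extra space that a new set of metrics would need. Actual usage depends on database compression and on the data itself, so treat the result as a starting point rather than a guarantee.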