Configure Sensu

The ICE ClusterWare™ software uses Sensu to run health checks on compute nodes that are configured to use the automated remediation service (ARS). This setup requires a Sensu backend server, which should run on an administrative node or virtual machine separate from your ClusterWare head node(s).

  1. On an administrative node, install the Sensu backend. See the Sensu documentation for requirements and installation steps: https://docs.sensu.io/sensu-go/latest/operations/deploy-sensu/install-sensu/.

  2. On the Sensu backend server, create a cli2mqtt.yaml file that defines the MQTT handler. The file should contain:

    ---
    type: Handler
    api_version: core/v2
    metadata:
      name: cli2mqtt
    spec:
      command: cli2mqtt.py
      type: pipe
      runtime_assets:
        - penguin-sensu-assets
      filters:
        - is_incident
      timeout: 5
    
  3. Download the penguin-sensu-assets-1.0.0.yaml file from the healthiso repo. If wget is not already installed, install it first with dnf install wget -y:

    wget http://<head node>/api/v1/repo/healthiso/content/penguin-sensu-assets-1.0.0.yaml
    
  4. Configure sensuctl to connect to the Sensu backend, then add the asset and handler definitions:

    sensuctl configure -n --username <username> --password <password> --namespace default --url <Sensu server URL>
    sensuctl create -f penguin-sensu-assets-1.0.0.yaml
    sensuctl create -f cli2mqtt.yaml
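
To confirm that both resources were registered, a check along the following lines could be run on the Sensu backend. This is a minimal sketch: verify_sensu_resources is a hypothetical helper name, and it assumes that sensuctl is already configured as shown above and that the asset is registered under the name penguin-sensu-assets, which is the name the handler references.

```shell
# Hypothetical helper: report whether the handler and asset from the
# previous commands exist in the currently configured namespace.
verify_sensu_resources() {
    if ! command -v sensuctl >/dev/null 2>&1; then
        echo "sensuctl not found in PATH" >&2
        return 127
    fi
    # Print each definition; stop at the first lookup that fails.
    sensuctl handler info cli2mqtt --format yaml || return 1
    sensuctl asset info penguin-sensu-assets --format yaml || return 1
}
```

A non-zero exit status from the helper suggests that one of the sensuctl create commands did not take effect.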
    
  5. Download the Sensu asset archive from the healthiso repo:

    dnf install wget -y
    wget http://<head node>/api/v1/repo/healthiso/content/penguin-sensu-assets-1.0.0.tar.gz
    
  6. Extract the default health check bundle into a new psa directory on the Sensu backend:

    mkdir psa
    cd psa
    tar -xvf ../penguin-sensu-assets-1.0.0.tar.gz
    
  7. Modify the /etc/mqttpublisher.conf file and add an entry to provide your MQTT password:

    mosquitto.pubpass = <MQTT password>
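
If this step is scripted, it may help to replace an existing entry rather than append a second one. The following is a sketch under stated assumptions: set_mqtt_pubpass is a hypothetical helper name, and the file is assumed to hold one "key = value" entry per line, as the entry above suggests.

```shell
# Hypothetical helper: set mosquitto.pubpass in a config file, replacing
# any existing entry instead of adding a duplicate.
set_mqtt_pubpass() {
    conf_file="$1"
    password="$2"
    tmp_file="$(mktemp)" || return 1
    # Copy every line except an existing mosquitto.pubpass entry.
    if [ -f "$conf_file" ]; then
        grep -v '^[ 	]*mosquitto\.pubpass[ 	]*=' "$conf_file" > "$tmp_file"
    fi
    printf 'mosquitto.pubpass = %s\n' "$password" >> "$tmp_file"
    # Write back through the original path so its owner and mode are kept.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Example (run as a user allowed to write the file):
#   set_mqtt_pubpass /etc/mqttpublisher.conf '<MQTT password>'
```

Because the file holds a credential, it is worth checking that its permissions do not make it world-readable.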
    
  8. Modify the defaults/check-bundle.yml file to match your desired configuration, then create the checks in Sensu:

    sensuctl create -f defaults/check-bundle.yml
    

    Tip

    Pay attention to which subscriptions each check is configured to use. The subscriptions map to the _ars_groups attribute set on compute nodes.
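
One way to review the subscriptions before creating the checks is to list the distinct names used in the bundle file and compare them with the _ars_groups values on your compute nodes. The sketch below uses a hypothetical helper, list_bundle_subscriptions, and assumes that each subscriptions key in the file is written as a block-style YAML list with one unquoted "- name" entry per line. Other YAML layouts would need a real YAML parser.

```shell
# Hypothetical helper: print the unique subscription names found under
# "subscriptions:" keys in a check bundle file.
list_bundle_subscriptions() {
    awk '
        # A bare "subscriptions:" key starts a list.
        /^[ \t]*subscriptions:[ \t]*$/ { in_list = 1; next }
        # Inside the list, print each "- name" entry without the dash.
        in_list && /^[ \t]*-[ \t]+[^ \t]/ {
            sub(/^[ \t]*-[ \t]+/, "")
            print
            next
        }
        # Any other line ends the current list.
        { in_list = 0 }
    ' "$1" | sort -u
}

# Example, run from the psa directory:
#   list_bundle_subscriptions defaults/check-bundle.yml
```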

  9. Install the state machine library, with the version shell variable set to the release you are installing:

    wget http://<head node>/api/v1/repo/healthiso/content/ars_state_machine-${version}-py3-none-any.whl
    pip3 install ars_state_machine-${version}-py3-none-any.whl
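
Because an unset version variable would silently produce a wrong file name, the download and install can be wrapped in a small function that requires both values explicitly. This is a sketch only: install_ars_state_machine is a hypothetical helper name, and the final pip3 show call assumes the installed package is named after the wheel file.

```shell
# Hypothetical helper: download and install the state machine wheel.
# Both arguments are required; neither has a default.
install_ars_state_machine() {
    if [ "$#" -ne 2 ]; then
        echo "usage: install_ars_state_machine <head node> <version>" >&2
        return 2
    fi
    head_node="$1"
    version="$2"
    wheel="ars_state_machine-${version}-py3-none-any.whl"
    wget "http://${head_node}/api/v1/repo/healthiso/content/${wheel}" || return 1
    pip3 install "$wheel" || return 1
    # Confirm that pip now reports the package as installed.
    pip3 show ars_state_machine
}
```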