cw-healthctl
NAME
cw-healthctl -- Query and modify health checks for the cluster.
USAGE
cw-healthctl [-h] [-v] [-q] [[-c | --config] CONFIG] [--base-url URL] [[-u | --user] USER[:PASSWD]] [--human | --json | --csv | --table] [--pretty | --no-pretty] [--show-uids] [[-i | --ids] CHECKS | -a | --all] {list,ls, create,mk, clone,cp, update,up, delete,rm}
OPTIONAL ARGUMENTS
- -h, --help
Print a usage message and exit. Trailing arguments are ignored; preceding arguments are parsed but have no effect.
- -v, --verbose
Increase verbosity.
- -q, --quiet
Decrease verbosity.
- -c, --config CONFIG
Specify a client configuration file CONFIG.
- --show-uids
Do not try to make the output more human readable.
- -a, --all
Interact with all health checks (default for list).
- -i, --ids CHECKS
A comma-separated list of health checks to query or modify. Values can include name, UID, or truncated UID.
- --reset-all
Revert health checks to the default definitions in the health repo.
ARGUMENTS TO OVERRIDE BASIC CONFIGURATION DETAILS
- --base-url URL
Specify the base URL of the ClusterWareAI REST API.
- -u, --user USER[:PASSWD]
Masquerade as user USER with optional colon-separated password PASSWD.
FORMATTING ARGUMENTS
- --human
Format the output for readability (default).
- --json
Format the output as JSON.
- --csv
Format the output as CSV.
- --table
Format the output as a table.
- --pretty
Indent JSON or XML output, and substitute human readable output for other formats.
- --no-pretty
Opposite of --pretty.
CHECK FIELDS
Each health check is described by the following fields:
- name
Required. Unique identifier for the check.
- command
Required. Shell command executed on each target node. The exit code is interpreted Nagios-style: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
- interval
Seconds between successive runs of the check. Must be a positive integer. If omitted on create, the cluster field default_health_interval is used.
- timeout
Seconds after which a running check is killed. Must be a positive integer. If omitted on create, the cluster field default_health_timeout is used.
- labels
List of label strings used to group checks and to target them via the %label selector syntax (see SELECTING CHECKS BY LABEL below).
- description
Free-form text describing the check.
- flap_thresholds
Dictionary controlling when a flapping check signals a failure. Supported keys are fail_streak (number of consecutive failures) and fail_percentage (percentage of failures in a rolling window).
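How the two keys combine is not spelled out here. As a rough illustration only, the sketch below assumes a check is reported as failing when either the most recent runs form a failure streak of at least fail_streak, or the share of failures in a rolling window reaches fail_percentage. The window length, the either-or combination, and the function name is_failing are all assumptions, not documented behavior.

```python
from typing import Dict, Sequence


def is_failing(results: Sequence[bool],
               thresholds: Dict[str, float],
               window: int = 10) -> bool:
    """Decide whether a result history crosses a flap threshold.

    results: outcomes ordered oldest to newest; True means the run failed.
    thresholds: may contain 'fail_streak' and/or 'fail_percentage'.
    window: hypothetical rolling-window length (not given in this page).
    """
    history = list(results)
    if not history:
        return False

    streak_limit = thresholds.get("fail_streak")
    if streak_limit is not None:
        # Count failures backwards from the newest result.
        streak = 0
        for failed in reversed(history):
            if not failed:
                break
            streak += 1
        if streak >= streak_limit:
            return True

    pct_limit = thresholds.get("fail_percentage")
    if pct_limit is not None:
        # Percentage of failures among the last `window` results.
        recent = history[-window:]
        if 100.0 * sum(recent) / len(recent) >= pct_limit:
            return True

    return False
```

Under these assumptions, a check that alternates between passing and failing never trips fail_streak=2 but does trip fail_percentage=50.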
ACTIONS
- clone (cp) [--content JSON | INI_FILE] [NAME=VALUE ...]
Copy health check(s) to new identifiers.
- --content JSON | INI_FILE
Overwrite fields in the cloned check.
- create (mk) [--content JSON | INI_FILE] [NAME=VALUE ...]
Add a health check.
- --content JSON | INI_FILE
Load this content into the database as a health check. The content may be JSON, an INI file, a YAML document, or a YAML stream containing multiple checks.
- delete (rm)
Delete health check(s).
- list (ls) [--long | --long-long] [--nodes [NODE ...]] [--show-labels]
Show information about health check(s).
- -l, --long
Show a subset of all optional information for each check.
- -L, --long-long
Show all optional information for each check.
- --nodes [NODE ...]
Instead of listing checks, list the checks assigned to the given node(s). Node specs may include ranges such as n[0-1]. An empty value means all nodes.
- --show-labels
When used with --nodes, annotate each check with its labels.
- update (up) [--content JSON | INI_FILE] [NAME=VALUE ...]
Modify health check fields.
- --content JSON | INI_FILE
Overwrite fields in the specified check(s).
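A JSON document for --content can be generated rather than written by hand. The sketch below builds one from the fields listed under CHECK FIELDS and enforces the two required fields and the positive-integer constraints before serializing. The helper name build_check, and the assumption that the JSON keys mirror the field names one-to-one, are not confirmed by this page.

```python
import json
from typing import Any, Dict, List, Optional


def build_check(name: str,
                command: str,
                interval: Optional[int] = None,
                timeout: Optional[int] = None,
                labels: Optional[List[str]] = None,
                description: Optional[str] = None) -> Dict[str, Any]:
    """Assemble a health check dictionary (hypothetical helper)."""
    if not name or not command:
        raise ValueError("name and command are required")

    check: Dict[str, Any] = {"name": name, "command": command}

    for key, value in (("interval", interval), ("timeout", timeout)):
        if value is None:
            continue  # omitted: the cluster default applies on create
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
        check[key] = value

    if labels:
        check["labels"] = list(labels)
    if description:
        check["description"] = description
    return check


# Assumed layout: one JSON object per check, keys named after the fields.
content = json.dumps(
    build_check("check_zombies", "check_zombie.py -w 10 -c 20",
                interval=30, timeout=10, labels=["cpu"])
)
```

The resulting string is the kind of value that could be supplied as the JSON argument of --content.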
SELECTING CHECKS BY LABEL
Values given to -i/--ids that begin with % are treated as label selectors
rather than check names. A bare %label matches any check carrying that
label; multiple %label tokens (or a quoted selector expression) are
flattened into an OR-set: a check matches if any of its labels appears in
the set. Boolean connectives (and, or, parentheses) are accepted for
syntactic compatibility with cw-nodectl --selector but do not restrict the
match.
For example, given checks check1 (labels: cpu) and check2 (labels:
gpu):
cw-healthctl -i %cpu ls # matches check1
cw-healthctl -i %gpu ls # matches check2
cw-healthctl -i '%cpu,%gpu' ls # matches check1 and check2
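The flattening behavior can be modeled in a few lines. The sketch below illustrates the semantics described above and is not the tool's implementation: it pulls every %label token out of a selector string, ignores connectives and parentheses, and matches a check if it carries any of the collected labels. The function names and the token pattern are assumptions.

```python
import re
from typing import Dict, List, Set


def selector_labels(selector: str) -> Set[str]:
    """Collect every %label token; connectives and parentheses are dropped."""
    return set(re.findall(r"%([A-Za-z0-9_.-]+)", selector))


def match_checks(selector: str, checks: Dict[str, List[str]]) -> List[str]:
    """Return names of checks carrying at least one selected label."""
    wanted = selector_labels(selector)
    return sorted(name for name, labels in checks.items()
                  if wanted & set(labels))


checks = {"check1": ["cpu"], "check2": ["gpu"]}
print(match_checks("%cpu", checks))             # ['check1']
print(match_checks("%cpu,%gpu", checks))        # ['check1', 'check2']
print(match_checks("(%cpu and %gpu)", checks))  # ['check1', 'check2']
```

The last call shows the consequence of the OR-set: even though no check carries both labels, the `and` does not narrow the result.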
EXAMPLES
cw-healthctl list
List all health checks.
cw-healthctl create name=check_zombies command='check_zombie.py -w 10 -c 20' interval=30 timeout=10 labels=cpu
Add a new check that runs every 30 seconds with a 10-second timeout and has the cpu label.
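The check_zombie.py script named in this example is not shown in this page. As a hypothetical stand-in, the sketch below shows how a check script could map a measured value onto the exit codes listed under CHECK FIELDS, with -w and -c as warning and critical thresholds. Treating a value strictly above a threshold as a breach, and counting zombies from `ps -A -o stat=` output, are assumptions about the script and about the platform's ps.

```python
import argparse
import subprocess

# Exit codes as listed under CHECK FIELDS.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(value: int, warn: int, crit: int) -> int:
    """Map a measurement to an exit code; a breach means value > threshold."""
    if value > crit:
        return CRITICAL
    if value > warn:
        return WARNING
    return OK


def count_zombies() -> int:
    """Count processes whose state starts with 'Z' (assumes ps supports 'stat')."""
    out = subprocess.run(["ps", "-A", "-o", "stat="],
                         capture_output=True, text=True, check=True).stdout
    return sum(1 for line in out.splitlines() if line.strip().startswith("Z"))


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Hypothetical zombie check")
    parser.add_argument("-w", type=int, required=True, help="warning threshold")
    parser.add_argument("-c", type=int, required=True, help="critical threshold")
    args = parser.parse_args(argv)
    try:
        zombies = count_zombies()
    except (OSError, subprocess.CalledProcessError):
        print("UNKNOWN: could not list processes")
        return UNKNOWN
    status = classify(zombies, args.w, args.c)
    print(f"{LABELS[status]}: {zombies} zombie processes")
    return status
```

Run as a script, the file would end with `sys.exit(main())` so that the return value becomes the process exit code.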
cw-healthctl create --content @checks.yaml
Create one or more health checks from a YAML file. The file may contain a single document, a list of checks, or a multi-document YAML stream.
cw-healthctl -i check_zombies update interval=60
Change the interval of check_zombies to 60 seconds.
cw-healthctl -i %cpu ls -L
Show full details for every check with the cpu label.
cw-healthctl ls --nodes n[0-3] --show-labels
List the health checks assigned to nodes n0 through n3, annotated with their labels.
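To make the range notation concrete, the sketch below expands a spec such as n[0-3] into individual node names. It handles a single bracket group holding numeric ranges or comma-separated values. Whether the tool accepts richer forms (zero padding, several bracket groups) is not stated here, and the function name expand_nodes is hypothetical.

```python
import re
from typing import List

# prefix[body]suffix, with exactly one bracket group.
_RANGE = re.compile(
    r"^(?P<prefix>[^\[\]]*)\[(?P<body>[^\[\]]+)\](?P<suffix>[^\[\]]*)$"
)


def expand_nodes(spec: str) -> List[str]:
    """Expand 'n[0-3]' into ['n0', 'n1', 'n2', 'n3']; plain names pass through."""
    match = _RANGE.match(spec)
    if match is None:
        return [spec]

    prefix, suffix = match.group("prefix"), match.group("suffix")
    names: List[str] = []
    for part in match.group("body").split(","):
        low, dash, high = part.partition("-")
        start = int(low)
        end = int(high) if dash else start
        if end < start:
            raise ValueError(f"descending range in {spec!r}")
        names.extend(f"{prefix}{i}{suffix}" for i in range(start, end + 1))
    return names


print(expand_nodes("n[0-3]"))  # ['n0', 'n1', 'n2', 'n3']
```

In an interactive shell, quoting the spec ('n[0-3]') keeps the shell from treating the brackets as a filename glob pattern.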
RETURN VALUES
Upon successful completion, cw-healthctl returns 0.
On failure, an error message is printed to stderr and
cw-healthctl returns 1.
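Scripts that call cw-healthctl can build on this convention. The following is a minimal sketch of a wrapper that relies only on the exit status and on the error text written to stderr; the names run_cli and CliError are hypothetical.

```python
import subprocess
from typing import Sequence


class CliError(RuntimeError):
    """Raised when the wrapped command exits with a non-zero status."""


def run_cli(argv: Sequence[str]) -> str:
    """Run a command and return its stdout, or raise with its stderr text."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    if proc.returncode != 0:
        raise CliError(
            f"{argv[0]} exited with {proc.returncode}: {proc.stderr.strip()}"
        )
    return proc.stdout


# Example call (requires cw-healthctl on PATH, so it is left commented out):
# listing = run_cli(["cw-healthctl", "--json", "list"])
```

Raising an exception on a non-zero status keeps callers from silently parsing empty output after a failed command.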