Health API

The endpoints of the HiveMQ Health API provide operational information about your HiveMQ broker components and extensions.

With the Health API, you can capture snapshots that show the current state of health for each node in your HiveMQ cluster. The well-structured information the API provides helps you quickly identify potential issues and maintain the smooth operation of your HiveMQ platform deployment.

Configuration

Example Health API configuration

<hivemq>
  <!-- ... -->
  <health-api>
    <enabled>true</enabled>
    <listeners>
      <http>
        <port>8889</port>
        <name>health-api-listener</name>
        <bind-address>127.0.0.1</bind-address>
      </http>
    </listeners>
  </health-api>
</hivemq>

Table 1. Health API configuration parameters
Parameter	Default Value	Required	Description
`enabled`	`false`		Enables or disables the use of the HiveMQ Health API. The Health API is disabled by default. To allow access to the Health API, set the `enabled` tag to `true`.
`listeners`			Configures one or more HTTP listeners to provide access to the HiveMQ Health API. `port`: The port on the local machine that listens for HiveMQ Health API requests. The default port for Health API HTTP listeners is `8889`. The `port` can be changed. `name`: Optional setting to define a name for the listener. Custom-defined listener names can be helpful when multiple listeners are in use. If no name is specified, HiveMQ uses the type of listener plus the port. For example, `http-listener-8889`. `bind-address`: The address on the local machine that accepts HiveMQ Health API requests. The default bind address is `127.0.0.1`. The bind address can be changed.
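
To illustrate how the defaults in the table apply, the following Python sketch reads a configuration like the example above and returns the effective listener settings. It is a minimal illustration, not part of HiveMQ: the function name `read_health_listeners` and the returned dictionary keys are hypothetical, and the sketch assumes the defaults listed in the table (`8889`, `127.0.0.1`, and a name built from the listener type plus the port).

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Defaults taken from the parameter table above.
DEFAULT_PORT = 8889
DEFAULT_BIND_ADDRESS = "127.0.0.1"


def read_health_listeners(config_xml: str) -> list[dict]:
    """Return the effective HTTP listener settings of the <health-api> block.

    Hypothetical helper for illustration; returns an empty list when the
    Health API is missing or disabled (disabled is the documented default).
    """
    root = ET.fromstring(config_xml)
    health_api = root.find("health-api")
    if health_api is None:
        return []
    enabled = health_api.findtext("enabled", default="false").strip().lower()
    if enabled != "true":
        return []
    listeners = []
    for http in health_api.findall("./listeners/http"):
        port = int(http.findtext("port", default=str(DEFAULT_PORT)))
        listeners.append({
            "port": port,
            "bind_address": http.findtext(
                "bind-address", default=DEFAULT_BIND_ADDRESS).strip(),
            # Unnamed listeners get "<type>-listener-<port>" per the table.
            "name": http.findtext(
                "name", default=f"http-listener-{port}").strip(),
        })
    return listeners
```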

When the Health API is enabled, multiple endpoints are exposed on each configured HTTP listener:

/api/v1/health/: An endpoint that reflects the system-wide health status of all components in the selected HiveMQ deployment.
/api/v1/health/<component-name>: An endpoint for an individual broker component.
/api/v1/health/extensions: An endpoint that reflects the health of all extensions in the selected HiveMQ deployment.
/api/v1/health/extensions/<extension-id>: An endpoint for an individual extension.

The Health API also provides two health group endpoints that assemble data from sets of components:

/api/v1/health/liveness: Liveness Check.
/api/v1/health/readiness: Readiness Check.

Health API endpoints are HTTP only and do not support TLS.
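
As a small illustration of the endpoint layout listed above, the following Python sketch builds the URL for any of the endpoints. The helper name `health_url` is hypothetical, and the default host and port are assumptions that match the example configuration; adjust them to your own listener.

```python
from urllib.parse import quote

# Common prefix of all Health API endpoints listed above.
BASE_PATH = "/api/v1/health"


def health_url(*segments: str, host: str = "127.0.0.1", port: int = 8889) -> str:
    """Build a Health API URL from path segments (hypothetical helper).

    health_url()                       -> system-wide health endpoint
    health_url("liveness")             -> liveness health group
    health_url("extensions", "<id>")   -> one extension
    """
    # Percent-encode each segment so unusual IDs cannot break the path.
    path = "/".join(quote(segment, safe="") for segment in segments)
    # Plain http: the endpoints do not support TLS.
    return f"http://{host}:{port}{BASE_PATH}/{path}"
```

For example, `health_url("readiness")` yields the readiness endpoint on the default listener.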

HTTP response

When called, each endpoint of the Health API returns an HTTP status code and a human-readable JSON response body with additional information.

The HTTP status code reflects the overall health of the selected health component or health group.
The response body contains a JSON payload with structured information.

Example response body JSON

{
  "status": "<status>",
  "details": {
    "<key>": "<value>"
  },
  "components": {
    "<component-name>": {
      "status": "<status>",
      "components": {
        "<component-name>": {
          "status": "<status>",
          "details": {
            "<key>": "<value>"
          }
        }
      }
    }
  }
}

Based on user feedback and continued development, upcoming versions of the HiveMQ Health API JSON payload are expected to include additional information about the health of individual components. These future additions can change the JSON payload.

Every health response includes a mandatory status and can optionally include additional details.

Health components can have nested components that follow the same structure as the parent component.
The status of the parent component is the aggregated status of all associated child components.
This aggregated status of the top component determines the HTTP status code of the response.
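
Because nested components repeat the structure of their parent, a response can be walked recursively. The following Python sketch is one possible way to list every nested component that does not report `UP`. The function name `non_up_components` is hypothetical, and the sketch relies only on the `status` and `components` keys shown in the example response body.

```python
from __future__ import annotations


def non_up_components(payload: dict, path: str = "") -> list[tuple[str, str]]:
    """Collect (path, status) for every nested component that is not UP."""
    found = []
    for name, child in payload.get("components", {}).items():
        child_path = f"{path}/{name}" if path else name
        if child.get("status") != "UP":
            found.append((child_path, child.get("status")))
        # Child components follow the same structure, so recurse into them.
        found.extend(non_up_components(child, child_path))
    return found
```

This kind of walk is intended for troubleshooting only; the shape of the payload can change between versions.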

Table 2. Status values and HTTP status code mapping
Status	HTTP status code	Description
`UP`	`200`	The component is healthy.
`UNKNOWN`	`200`	The health status of the component is unknown.
`DEGRADED`	`200`	The component is in a degraded state. Immediate attention is required to prevent the component from progressing into a `DOWN` state.
`DEGRADED_SERVICE`	`200`	The service is degraded. Immediate attention is required to prevent the component from progressing into an `OUT_OF_SERVICE` state. For more information, see Readiness Check.
`DOWN`	`503`	The component is not healthy.
`OUT_OF_SERVICE`	`503`	The service is not available. For more information, see Readiness Check.

When you set up automated monitoring and operations, we recommend that you rely on the HTTP status code only, because the JSON payload can change in future versions.
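
A monitoring check that follows this recommendation could look like the following Python sketch. It is an illustration under stated assumptions: the names `HTTP_CODE_BY_STATUS`, `is_passing`, and `probe` are hypothetical, the mapping is copied from the table above, and treating an unreachable endpoint as failing is a design choice of the sketch rather than documented behavior.

```python
import urllib.error
import urllib.request

# Status-to-code mapping copied from the table above.
HTTP_CODE_BY_STATUS = {
    "UP": 200,
    "UNKNOWN": 200,
    "DEGRADED": 200,
    "DEGRADED_SERVICE": 200,
    "DOWN": 503,
    "OUT_OF_SERVICE": 503,
}


def is_passing(http_code: int) -> bool:
    """Treat 200 as passing and every other code (such as 503) as failing."""
    return http_code == 200


def probe(url: str, timeout: float = 2.0) -> bool:
    """Check one Health API endpoint using only its HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_passing(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises for 503 responses; the code is still available.
        return is_passing(error.code)
    except OSError:
        # Assumption of this sketch: unreachable or timed out counts as failing.
        return False
```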

System Health

The system health endpoint provides an aggregated status of all available health components and health groups.

Example JSON response for a healthy System component

{
  "status": "UP",
  "components": {
    "<component-name>": {
      "status": "UP"
    },
    "livenessState": {
      "status": "UP"
    },
    "readinessState": {
      "status": "UP"
    }
  },
  "groups": [ "liveness", "readiness" ]
}

Example JSON response for an unhealthy System component

{
  "status": "DOWN",
  "components": {
    "<component-name>": {
      "status": "DOWN"
    },
    "livenessState": {
      "status": "UP"
    },
    "readinessState": {
      "status": "UP"
    }
  },
  "groups": [ "liveness", "readiness" ]
}

Health Components

Health API components help you assess the health and status of various aspects of your HiveMQ deployment.

Info

The Info health component provides general information about the HiveMQ platform, such as the HiveMQ version, the current log level, and the epoch timestamp of the node start.

Example JSON response body for a healthy Info component

{
  "status": "UP",
  "details": {
    "cpuCount": 32,
    "logLevel": "INFO",
    "startedAt": 1693479102563,
    "version": "4.19.0"
  }
}

Cluster Service

The Cluster health component provides information about the connection state of your HiveMQ Cluster.

Once inter-broker communication is fully established, the node has successfully joined the cluster, and no leave replications are in progress, the cluster status reports UP. When the cluster is in an UP state, changes to the cluster topology are safe.
While a node leave replication is in progress, the cluster reports a DEGRADED health status. A DEGRADED health status automatically sets the readiness status of the node to DEGRADED_SERVICE. To avoid possible data loss, do not change the cluster topology while the cluster is in a DEGRADED state.

The information the Cluster health component provides can help you debug node synchronization. For example, you can detect a node that is stuck in the join process or detect a network split.

Example JSON response for a healthy Cluster component

{
  "status": "UP",
  "details": {
    "clusterNodes": [
      "C3P0X",
      "R2D2Y"
    ],
    "clusterSize": 2,
    "isLeaveReplicationInProgress": false,
    "nodeId": "C3P0X",
    "nodeState": "RUNNING"
  }
}

Example JSON response for a degraded Cluster component

{
  "status": "DEGRADED",
  "details": {
    "clusterNodes": [
      "C3P0X",
      "R2D2Y"
    ],
    "clusterSize": 2,
    "isLeaveReplicationInProgress": true,
    "nodeId": "C3P0X",
    "nodeState": "RUNNING"
  }
}

Example JSON response for an unhealthy Cluster component

{
  "status": "DOWN",
  "details": {
    "clusterNodes": [
      "C3P0X",
      "R2D2Y"
    ],
    "clusterSize": 2,
    "isLeaveReplicationInProgress": false,
    "nodeId": "C3P0X",
    "nodeState": "JOINING"
  }
}
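
The guidance above on topology changes can be expressed as a simple guard. The following Python sketch is a hedged illustration: the function name `topology_change_is_safe` is hypothetical, and the field names are taken from the example responses above, which can change in future versions.

```python
def topology_change_is_safe(cluster: dict) -> bool:
    """Return True only when the Cluster component reports UP.

    Also checks isLeaveReplicationInProgress as a second safeguard, because
    a running leave replication is what puts the cluster into DEGRADED.
    """
    details = cluster.get("details", {})
    return (
        cluster.get("status") == "UP"
        and not details.get("isLeaveReplicationInProgress", False)
    )
```

An automation script could call this guard on the Cluster component response before it adds or removes nodes, and wait while the guard returns `False`.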

MQTT

The MQTT health component provides information about the MQTT listeners and their connection state.
The information the MQTT component provides is useful to ensure that all configured listeners are correctly started and ready to accept traffic.
Failure reasons are provided in the details of each listener component.
If a TLS-related failure occurs, the listener reports a DEGRADED health status. The failure also sets the readiness status to DEGRADED_SERVICE. In this state, the TLS connections on the listener can still work because the listener continues to use the previously loaded keystore and truststore. As long as the old certificates remain valid, the TLS listener can continue to function.
However, if the old certificates become invalid or the node is restarted, the listener can fail. Immediate action is recommended to ensure stable operation of the HiveMQ cluster.

Example JSON response for a healthy MQTT component

{
  "status": "UP",
  "components": {
    "tcp-listener-1883": {
      "status": "UP",
      "details": {
        "bindAddress": "0.0.0.0",
        "isProxyProtocolSupported": false,
        "isRunning": true,
        "port": 1883,
        "type": "TCP Listener"
      }
    },
    "tls-tcp-listener-8883": {
      "status": "UP",
      "details": {
        "bindAddress": "0.0.0.0",
        "isProxyProtocolSupported": false,
        "isRunning": true,
        "port": 8883,
        "type": "TCP Listener with TLS"
      }
    }
  }
}

Example JSON response for a degraded MQTT component

{
  "status": "DEGRADED",
  "components": {
    "tcp-listener-1883": {
      "status": "UP",
      "details": {
        "bindAddress": "0.0.0.0",
        "isProxyProtocolSupported": false,
        "isRunning": true,
        "port": 1883,
        "type": "TCP Listener"
      }
    },
    "tls-tcp-listener-8883": {
      "status": "DEGRADED",
      "details": {
        "bindAddress": "0.0.0.0",
        "isProxyProtocolSupported": false,
        "isRunning": true,
        "lastTlsFailure": "com.hivemq.security.exception.SslException: Not able to open or read KeyStore '/usr/lib/jvm/11/jre/lib/security/cacerts/keystore.jks' with type 'JKS'",
        "port": 8883,
        "type": "TCP Listener with TLS"
      }
    }
  }
}

Example JSON response for an unhealthy MQTT component

{
  "status": "DOWN",
  "components": {
    "tcp-listener-1883": {
      "status": "UP",
      "details": {
        "bindAddress": "0.0.0.0",
        "isProxyProtocolSupported": false,
        "isRunning": true,
        "port": 1883,
        "type": "TCP Listener"
      }
    },
    "tls-tcp-listener-8883": {
      "status": "DOWN",
      "details": {
        "bindAddress": "0.0.0.0",
        "isProxyProtocolSupported": false,
        "isRunning": false,
        "lastFailure": "java.io.IOException: Failed to bind to /0.0.0.0:8883",
        "port": 8883,
        "type": "TCP Listener with TLS"
      }
    }
  }
}
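
To show how the failure reasons in the listener details can be used, the following Python sketch extracts the reason for every listener that is not `UP`. The function name `listener_failures` is hypothetical, and the `lastFailure` and `lastTlsFailure` keys are taken from the example responses above; other keys might exist.

```python
from __future__ import annotations


def listener_failures(mqtt: dict) -> dict[str, str]:
    """Map the name of each non-UP listener to its reported failure reason."""
    failures = {}
    for name, listener in mqtt.get("components", {}).items():
        if listener.get("status") == "UP":
            continue
        details = listener.get("details", {})
        # TLS problems and general failures use different keys in the examples.
        reason = (
            details.get("lastTlsFailure")
            or details.get("lastFailure")
            or "no reason reported"
        )
        failures[name] = reason
    return failures
```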

Control Center

The Control Center health component provides information about the Control Center. The details show the current Control Center configuration, the state of the Jetty server connector, and failure reasons in the details of each listener component (if applicable).

Example JSON response for a healthy Control Center component

{
  "status": "UP",
  "details": {
    "enabled": true,
    "defaultLoginMechanismEnabled": true,
    "maxSessionIdleTime": 14400
  },
  "components": {
    "control-center-http-listener-8080": {
      "status": "UP",
      "details": {
        "bindAddress": "0.0.0.0",
        "isConnectorFailed": false,
        "isConnectorOpen": true,
        "isConnectorRunning": true,
        "port": 8080
      }
    }
  }
}

Example JSON response for an unhealthy Control Center component

{
  "status": "DOWN",
  "details": {
    "enabled": true,
    "defaultLoginMechanismEnabled": true,
    "maxSessionIdleTime": 14400
  },
  "components": {
    "control-center-https-listener-8443": {
      "status": "DOWN",
      "details": {
        "bindAddress": "0.0.0.0",
        "isConnectorFailed": false,
        "isConnectorOpen": true,
        "isConnectorRunning": true,
        "lastTlsFailure": "java.io.IOException: keystore password was incorrect",
        "port": 8443
      }
    }
  }
}

REST API

The REST API health component provides information about the HiveMQ REST API.
The details show your current REST API configuration, the state of the Jetty server connector, and failure reasons in the details of each listener component (if applicable).

Example JSON response for a healthy REST API component

{
  "status": "UP",
  "details": {
    "authenticationEnabled": false,
    "enabled": true
  },
  "components": {
    "http-listener": {
      "status": "UP",
      "details": {
        "bindAddress": "127.0.0.1",
        "isConnectorFailed": false,
        "isConnectorOpen": true,
        "isConnectorRunning": true,
        "port": 8888
      }
    },
    "https-listener": {
      "status": "UP",
      "details": {
        "bindAddress": "127.0.0.1",
        "isConnectorFailed": false,
        "isConnectorOpen": true,
        "isConnectorRunning": true,
        "port": 8443
      }
    }
  }
}

Example JSON response for an unhealthy REST API component

{
  "status": "DOWN",
  "details": {
    "authenticationEnabled": false,
    "enabled": true
  },
  "components": {
    "http-listener": {
      "status": "DOWN",
      "details": {
        "bindAddress": "127.0.0.1",
        "isConnectorFailed": true,
        "isConnectorOpen": false,
        "isConnectorRunning": false,
        "lastFailure": "java.io.IOException: Failed to bind to /0.0.0.0:8888",
        "port": 8888
      }
    },
    "https-listener": {
      "status": "UP",
      "details": {
        "bindAddress": "127.0.0.1",
        "isConnectorFailed": false,
        "isConnectorOpen": true,
        "isConnectorRunning": true,
        "port": 8443
      }
    }
  }
}

Extensions

The Extensions health component provides information about your configured custom and HiveMQ Enterprise extensions.

The details provide the extension metadata and failure information. For example, the details include the reason string when an extension fails to start or when the trial mode of a HiveMQ Enterprise Extension expires.

Example JSON response for a healthy Extension component

{
  "status": "UP",
  "details": {
    "author": "HiveMQ",
    "enabled": true,
    "isEnterprise": false,
    "name": "Allow All Extension",
    "priority": 0,
    "startPriority": 1000,
    "startedAt": 1693479104646,
    "version": "1.0.0"
  }
}

Example JSON response for an unhealthy Extension component

{
  "status": "DOWN",
  "details": {
    "author": "HiveMQ",
    "enabled": false,
    "isEnterprise": true,
    "isTrial": true,
    "isTrialExpired": false,
    "lastStartupFailure": "Error in tracing configuration",
    "name": "HiveMQ Enterprise Distributed Tracing Extension",
    "priority": 1000,
    "startPriority": 1000,
    "startedAt": 1693585188163,
    "version": "4.19.0"
  }
}
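
The failure information in the extension details can be turned into a short message for alerts or logs. The following Python sketch is an illustration only: the function name `extension_problem` is hypothetical, and the keys `name`, `isTrialExpired`, and `lastStartupFailure` are taken from the example responses above.

```python
from typing import Optional


def extension_problem(extension: dict) -> Optional[str]:
    """Describe why an extension is DOWN, or return None if it is not DOWN."""
    if extension.get("status") != "DOWN":
        return None
    details = extension.get("details", {})
    name = details.get("name", "unknown extension")
    # An expired trial is reported before any startup failure reason.
    if details.get("isTrialExpired"):
        return f"{name}: trial expired"
    reason = details.get("lastStartupFailure", "no failure reason reported")
    return f"{name}: {reason}"
```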

Health Groups

It is sometimes useful to organize health components into groups that can be used for different purposes.
Health groups show the aggregated state of selected health components.
The HiveMQ liveness and readiness health groups are useful for common use cases such as liveness and readiness probes for Kubernetes containers.

Liveness Check

The liveness health group checks whether the deployed HiveMQ broker is currently operational, responsive, and reachable.

Example liveness check JSON response for a running and responsive HiveMQ broker

{ "status": "UP" }

Readiness Check

The Readiness health group checks whether the HiveMQ node is currently available to receive and process MQTT messages.

The Readiness health group aggregates the state of the Cluster and MQTT health components.

If the Cluster component and the MQTT component are both healthy, the readiness check returns the status UP and the node can accept traffic.

If one of the components in the Readiness health group is degraded, the readiness check returns the status DEGRADED_SERVICE.
The degraded service status indicates that the HiveMQ node is still operational, but might fail over time or on the next restart. Immediate action is recommended to ensure stable operation of the HiveMQ cluster.

If one of the components in the Readiness health group is not healthy, the readiness check returns the status OUT_OF_SERVICE.
The out-of-service status indicates that the HiveMQ node is currently unable to accept traffic.
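
The rules above can be summarized in a few lines of Python. This is a sketch of the described behavior, not the HiveMQ implementation: the function name `readiness_status` is hypothetical, and the precedence of `DOWN` over `DEGRADED` when both occur is an assumption, as is rejecting any other component status.

```python
def readiness_status(cluster_status: str, mqtt_status: str) -> str:
    """Derive the readiness status from the Cluster and MQTT statuses."""
    statuses = {cluster_status, mqtt_status}
    # Assumption: an unhealthy component outweighs a degraded one.
    if "DOWN" in statuses:
        return "OUT_OF_SERVICE"
    if "DEGRADED" in statuses:
        return "DEGRADED_SERVICE"
    if statuses == {"UP"}:
        return "UP"
    # Other statuses are not covered by the rules described above.
    raise ValueError(f"unhandled component status in {sorted(statuses)}")
```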

Example readiness check JSON response for a node that is ready to accept MQTT traffic

{
  "status": "UP",
  "components": {
    "cluster": {
      "status": "UP",
      "details": {
      }
    },
    "mqtt": {
      "status": "UP",
      "components": {
      }
    }
  }
}

Example readiness check JSON response for a node that is currently unable to accept MQTT traffic

{
  "status": "OUT_OF_SERVICE",
  "components": {
    "cluster": {
      "status": "UP",
      "details": {
      }
    },
    "mqtt": {
      "status": "DOWN",
      "components": {
      }
    }
  }
}