Shared Subscriptions, which were originally a HiveMQ-specific feature and became part of the standard with MQTT 5, allow MQTT clients to share the same subscription on the broker. With "standard" MQTT subscriptions, each subscribing client receives its own copy of the message. With shared subscriptions, the clients that share the same subscription receive messages in an alternating fashion. This mechanism is sometimes called "client load balancing", but this user guide sticks to the Shared Subscription terminology.
Clients can subscribe to a shared subscription with standard MQTT mechanisms. The topic structure for shared subscriptions is the following:
$share/GROUPID/TOPIC
The shared subscription consists of three parts:
A static shared subscription identifier (“$share”)
A group identifier
The concrete topic subscriptions (may include wildcards)
A concrete example of such a subscription would be $share/my-shared-subscribers/myhome/groundfloor/+/temperature.
It’s important to understand that only one subscriber per group identifier will receive the message. So if multiple MQTT clients, sharing an identical group identifier, subscribe to the same topic, HiveMQ will distribute the message among them in an alternating fashion.
HiveMQ supports a shared subscription syntax that separates the shared subscription segments with a slash ( / ).
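The three-part structure described above can be illustrated with a small parser. This is a minimal sketch, not HiveMQ code: the names `parse_shared_subscription` and `SharedSubscription` are hypothetical, and the check that the group identifier contains no wildcards follows the MQTT 5 rule for share names.

```python
from typing import NamedTuple, Optional

SHARE_PREFIX = "$share"  # static shared subscription identifier


class SharedSubscription(NamedTuple):
    group: str          # group identifier
    topic_filter: str   # concrete topic subscription, may include wildcards


def parse_shared_subscription(subscription: str) -> Optional[SharedSubscription]:
    """Split '$share/GROUPID/TOPIC' into its parts; return None if it is not shared."""
    # Split at most twice so that slashes inside the topic filter are preserved.
    parts = subscription.split("/", 2)
    if len(parts) != 3 or parts[0] != SHARE_PREFIX:
        return None
    _, group, topic_filter = parts
    if not group or not topic_filter:
        return None
    # MQTT 5 does not allow wildcards in the share name (group identifier).
    if "+" in group or "#" in group:
        return None
    return SharedSubscription(group, topic_filter)
```

For the example subscription above, the parser yields the group `my-shared-subscribers` and the topic filter `myhome/groundfloor/+/temperature`. A regular subscription such as `myhome/groundfloor/+/temperature` is reported as not shared.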
Shared subscriptions have many use cases and are most useful in high-scalability scenarios. Among the most popular use cases are:
Client Load Balancing for MQTT clients which can’t handle the load on subscribed topics on their own.
Worker (backend) applications which ingest MQTT streams and need to be scaled horizontally.
Intra-cluster node traffic should be relieved by optimizing subscriber node-locality for incoming publishes.
QoS 1 and 2 are used for their delivery semantics but Ordered Topic guarantees are not needed.
There are hot topics with higher message rate than other topics in the system and these topics become the scalability bottleneck.
With standard publish / subscribe mechanisms, every subscriber gets its own copy of every message that matches the subscribed topic. When using shared subscriptions, each subscription group, which can be conceptually imagined as a virtual client, acts as a proxy for multiple real subscribers at once. HiveMQ then selects one subscriber of the group and delivers the message to it. The following picture demonstrates the principle:
There can be an arbitrary number of Shared Subscription Groups in a HiveMQ deployment. So for example the following scenario would be possible:
In this example there are two different groups with 2 subscribing clients in each shared subscription group. Both groups have the same subscription but have different group identifiers. When a publisher sends a message with a matching topic, one (and only one) client of each group receives the message.
It’s possible that different clients have different subscriptions for the same group identifier. In this case HiveMQ filters by matching subscribers per group and then distributes the message to one of the found clients. While technically possible, this can cause lots of confusion in understanding the message flow in your system. Our recommendation is to go with identical client subscriptions per shared subscription group.
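The delivery rules described so far can be condensed into a toy dispatcher: regular subscribers each get a copy, while every shared subscription group contributes exactly one recipient, chosen among the group members whose topic filter matches. This is an illustrative sketch only. The functions `topic_matches` and `recipients` are hypothetical, the wildcard matching is simplified, subscriptions are assumed to be well formed, and the default choice of the first matching member merely stands in for the broker's real selection strategy.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT matching: '+' matches one level, '#' matches the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


def recipients(
    subscriptions: Sequence[Tuple[str, str]],
    topic: str,
    choose: Callable[[List[str]], str] = lambda members: members[0],
) -> List[str]:
    """Return the client ids that receive a message published to `topic`."""
    direct: List[str] = []
    groups: Dict[str, List[str]] = defaultdict(list)
    for client_id, subscription in subscriptions:
        if subscription.startswith("$share/"):
            # Assumes the form '$share/GROUPID/TOPIC'.
            _, group, topic_filter = subscription.split("/", 2)
            if topic_matches(topic_filter, topic):
                groups[group].append(client_id)  # candidate within its group
        elif topic_matches(subscription, topic):
            direct.append(client_id)  # regular subscriber: always gets a copy
    # One (and only one) matching member per group receives the message.
    return direct + [choose(members) for members in groups.values()]
```

With two groups that subscribe to the same topic filter, a matching publish reaches one client of each group plus every regular subscriber. If members of one group use different filters, only the members whose filter matches are candidates, which is exactly why mixed subscriptions within a group make the message flow harder to follow.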
In HiveMQ Single Node Deployments, the distribution mode of messages for a shared subscription group is round-robin. This guarantees that the load is distributed evenly across all active subscribers in the same shared subscription group.
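Round-robin selection within one group can be sketched as follows. The class `RoundRobinGroup` is a hypothetical illustration of the principle, not HiveMQ's implementation, and it ignores clients joining or leaving the group.

```python
from typing import List, Sequence


class RoundRobinGroup:
    """Hands out group members in a fixed rotating order."""

    def __init__(self, members: Sequence[str]) -> None:
        if not members:
            raise ValueError("a shared subscription group needs at least one member")
        self._members: List[str] = list(members)
        self._next = 0  # index of the member that receives the next message

    def select(self) -> str:
        member = self._members[self._next]
        # Advance and wrap around so every member is used equally often.
        self._next = (self._next + 1) % len(self._members)
        return member
```

For a group with the members `c1`, `c2` and `c3`, successive calls to `select()` return `c1`, `c2`, `c3`, `c1`, and so on, so each active subscriber receives the same share of messages.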
Shared Subscriptions are designed to relieve cluster traffic and latency dramatically for high scalability deployments. In fact, Shared Subscriptions are the recommended way to connect horizontally scaling backend systems with HiveMQ if the backend systems need to ingest data via MQTT.
While Single Node deployments guarantee round-robin behaviour for messages, these guarantees are not in place for cluster deployments. In HiveMQ cluster deployments, messages are distributed via a probabilistic algorithm. If a PUBLISH is received on a specific node and one or more shared subscribers are available on the same node, these local shared subscribers have a higher probability of receiving the message. A small percentage of messages will still hit other cluster nodes, and these cluster nodes distribute the message among their shared subscribers.
It’s worth noting that in cluster deployments no round-robin algorithm is used for distribution, even among subscribers on the same node. The messages are distributed randomly.
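The idea of a node-local preference can be sketched as a weighted random choice. This is only an illustration of the principle. The function `select_subscriber` and the parameter `local_weight` are hypothetical, and the value 9.0 is an arbitrary assumption, since the actual probabilities HiveMQ uses are not stated here.

```python
import random
from typing import Sequence, Tuple


def select_subscriber(
    members: Sequence[Tuple[str, str]],
    receiving_node: str,
    local_weight: float = 9.0,  # assumed value, for illustration only
    rng=random,
) -> str:
    """Pick one (client_id, node_id) member, favouring the receiving node."""
    # Subscribers connected to the node that received the PUBLISH get a
    # higher weight; remote subscribers keep a small chance of selection.
    weights = [
        local_weight if node_id == receiving_node else 1.0
        for _, node_id in members
    ]
    client_id, _ = rng.choices(members, weights=weights, k=1)[0]
    return client_id
```

Because the choice is random rather than rotating, two consecutive messages may go to the same subscriber, even when several subscribers of the group are connected to the same node.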
It is currently not feasible to guarantee QoS 2 in shared subscriptions, as assumptions about client state would be required. Shared subscriptions with QoS 2 will be downgraded to QoS 1.
It’s highly recommended that all shared subscribers for a group subscribe with the same Quality of Service level to avoid complex situations which are hard to debug.
Members in a shared subscription group can subscribe with different QoS levels, though. When a client is selected by the Shared Subscription algorithm, the QoS level will be evaluated and the message will be sent with the correct QoS level.
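The QoS evaluation for a selected subscriber can be sketched as below. This assumes the usual MQTT rule that a message is delivered with the lower of the publish QoS and the subscription QoS, combined with the QoS 2 downgrade for shared subscriptions described above. The function `effective_qos` is a hypothetical helper, not part of any HiveMQ API.

```python
def effective_qos(publish_qos: int, subscription_qos: int, shared: bool = True) -> int:
    """Return the QoS level used to deliver a message to the selected subscriber."""
    for qos in (publish_qos, subscription_qos):
        if qos not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {qos}")
    # Shared subscriptions requested with QoS 2 are downgraded to QoS 1.
    granted = min(subscription_qos, 1) if shared else subscription_qos
    # Delivery never exceeds the QoS the message was published with.
    return min(publish_qos, granted)
```

A message published with QoS 2 therefore reaches a shared subscriber that requested QoS 2 with QoS 1, and a subscriber that requested QoS 0 with QoS 0. Mixed levels within one group mean that the delivery guarantee of a message depends on which member happens to be selected.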
If possible, always use the same QoS level for a shared subscription group.