Leveraging Behavior Model on MQTT Communication to Optimize IoT Deployments
MQTT stands out as a highly versatile protocol designed for efficient, lightweight communication in IoT applications. Rooted in the publish-subscribe model, it inherently decouples clients from one another. Nevertheless, data producers and consumers that work with MQTT often depend on a predefined set of guidelines. For instance, 1) a device comprising multiple MQTT connections must adhere to a specific initialization sequence to be recognized as online; 2) a predetermined sequence of messages is necessary before data producers are ready for full functionality; and 3) clients are expected to use resources properly. This article introduces a behavior model based on a finite state machine and closes with a real-world example that can already be used today.
The Need for Behavior Policies on MQTT Communication
Since MQTT allows implementing any of these communication schemes flexibly, data consumers usually implement some control logic to validate against them. However, this approach has drawbacks: data is transferred unnecessarily, and the respective control logic has to be implemented in all data consumers. Even when data producers are implemented according to agreed standards, checks are still required to catch inadvertently faulty deployments. Applying behavior policies removes the need to check on the consumer side.
The Model
Regardless of the MQTT version, the MQTT protocol implements a defined state machine to handle client connections. For example, a CONNECT packet is initially required to establish a client-to-broker connection. Once the connection has been established, a certain state is reached and further packets can be sent. Subsequently, the client can send a SUBSCRIBE packet to subscribe to an MQTT topic. After that, the client sends a PUBLISH packet to publish a payload to the broker. Eventually, the client sends a DISCONNECT packet to disconnect from the broker, and the final state is reached.
The following diagram shows the client behavior we just described. Just keep in mind that this demonstrates the progress over time when a client is connecting to a broker rather than reflecting the actual implementation.
A finite state machine can be specified briefly as a finite set of states and a finite set of transitions, each connecting two states. In the example above, the following states and transitions are shown:
States: Connected, Subscribed, Published, Start, End
Transitions:
Start -> CONNECT -> Connected
Connected -> SUBSCRIBE -> Subscribed
Subscribed -> PUBLISH -> Published
Published -> DISCONNECT -> End
Note: The above states and transitions are simple examples of how the client is interacting with an MQTT broker. HiveMQ’s broker implementation is much more flexible and highly scalable, but for simplicity’s sake, the state machine illustrates a typical scenario.
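The states and transitions listed above can be written down as a small lookup table. The following is a minimal Python sketch of that simplified lifecycle only; the names TRANSITIONS and run are illustrative and do not reflect how a broker is implemented.

```python
# Transition table for the simplified client lifecycle described above.
# Keys are (current state, MQTT packet type); values are the next state.
TRANSITIONS = {
    ("Start", "CONNECT"): "Connected",
    ("Connected", "SUBSCRIBE"): "Subscribed",
    ("Subscribed", "PUBLISH"): "Published",
    ("Published", "DISCONNECT"): "End",
}


def run(packets, state="Start"):
    """Feed packet types through the table and return the visited states.

    Raises ValueError for a packet with no transition from the current state.
    """
    visited = [state]
    for packet in packets:
        key = (state, packet)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state} on {packet}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


print(run(["CONNECT", "SUBSCRIBE", "PUBLISH", "DISCONNECT"]))
# ['Start', 'Connected', 'Subscribed', 'Published', 'End']
```

A packet that arrives out of order, such as a PUBLISH directly after CONNECT, has no entry in the table and is reported as an error in this sketch.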
Purpose-driven Protocol
Most use cases define their purpose-driven protocol on top of these states and transitions provided by the MQTT protocol, mostly on top of MQTT payloads with actual data. An example: a device is considered entirely online when the following MQTT packets are sent in this specific order: CONNECT, SUBSCRIBE, PUBLISH+, DISCONNECT, where PUBLISH+ means that at least one PUBLISH packet is sent. The essence is that the SUBSCRIBE packet must come before the PUBLISH messages for the device to be considered adequately initialized.
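Because the required order CONNECT, SUBSCRIBE, PUBLISH+, DISCONNECT is a regular pattern, one way to check a recorded packet sequence is a regular expression. The sketch below assumes the packet types are available as a list of strings; the one-letter encoding and the function name is_properly_initialized are illustrative choices, not part of MQTT or HiveMQ.

```python
import re

# One letter per packet type so the sequence can be matched as a string.
CODES = {"CONNECT": "C", "SUBSCRIBE": "S", "PUBLISH": "P", "DISCONNECT": "D"}

# CONNECT, SUBSCRIBE, one or more PUBLISH, DISCONNECT.
ONLINE_PATTERN = re.compile(r"CSP+D")


def is_properly_initialized(packets):
    """Return True if the packet types follow CONNECT, SUBSCRIBE, PUBLISH+, DISCONNECT."""
    try:
        encoded = "".join(CODES[p] for p in packets)
    except KeyError:
        return False  # unknown packet type: not part of this protocol
    return ONLINE_PATTERN.fullmatch(encoded) is not None


print(is_properly_initialized(["CONNECT", "SUBSCRIBE", "PUBLISH", "PUBLISH", "DISCONNECT"]))  # True
print(is_properly_initialized(["CONNECT", "PUBLISH", "SUBSCRIBE", "DISCONNECT"]))  # False
```

This only works after the fact on a complete sequence. Checking packets as they arrive calls for a state machine, which is what the next section introduces.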
To validate this kind of sequence, a purpose-driven protocol is defined that requires exactly this sequence. Other scenarios for a purpose-driven protocol: 1) a client is valid only if it defines a Last Will, and 2) no duplicate payloads may be sent within an hour.
Generally, this client behavior can be validated using custom code implementation as an extra microservice or by implementing a HiveMQ Extension.
Behavior Models as Purpose-driven Protocol Checker
In this section, we introduce Behavior Models. A behavior model is a finite state machine that specifies the purpose-driven protocol on top of any MQTT packets. The state machine has the following properties:
States:
Initial state: The state in which a behavior model starts as soon as a client initiates the connection with the MQTT broker.
Terminal state: This state has two subtypes. Success indicates that a client has passed the behavior model successfully, and Failed indicates otherwise.
Intermediate state: All other states, which lie between the initial and terminal states.
Transitions:
A certain event may cause a state transition. Events include at least MQTT packets such as CONNECT and PUBLISH, and may also be more complex, such as time or even external triggers.
A transition consists of a start state, a conditional event, and a target state.
A conditional event consists of either an event alone, such as an MQTT packet (CONNECT or PUBLISH), or an event combined with a condition that returns true or false.
Actions:
A transition may have additional behavior to execute, e.g., modifying payloads before they reach any consumer, or building custom metrics.
Memory:
The state machine also has a limited memory to store and load data, with the operations store(var, value) and load(var), where load returns the stored value.
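The properties above map naturally onto a few small data structures. The following Python sketch is one possible encoding under stated assumptions: the class names Memory and Transition, the method fires, and the callable signatures are hypothetical and are not taken from HiveMQ. Only store and load come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


class Memory:
    """Limited key-value memory with the store/load operations from the text."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}

    def store(self, var: str, value: Any) -> None:
        self._values[var] = value

    def load(self, var: str) -> Any:
        return self._values[var]


@dataclass
class Transition:
    start: str   # state the transition leaves
    event: str   # e.g. an MQTT packet type such as "PUBLISH"
    target: str  # state the transition enters
    # Optional condition on the event; None means the event alone suffices.
    condition: Optional[Callable[[Memory, Any], bool]] = None
    # Optional action executed while taking the transition.
    action: Optional[Callable[[Memory, Any], None]] = None

    def fires(self, state: str, event: str, memory: Memory, payload: Any) -> bool:
        """True if this transition applies to the given state and event."""
        if state != self.start or event != self.event:
            return False
        return self.condition is None or self.condition(memory, payload)


# Usage: a transition whose action initializes a variable in memory.
memory = Memory()
connect = Transition("Start", "CONNECT", "Connected",
                     action=lambda mem, _payload: mem.store("counter", 0))
if connect.fires("Start", "CONNECT", memory, None):
    connect.action(memory, None)
print(memory.load("counter"))  # 0
```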
Example: DUPLICATE_COUNTER
Since the formalism might be a bit too dry, let’s consider an example: define a behavior model called DUPLICATE_COUNTER, which counts how often two consecutive payloads are identical. This behavior model could be modeled as follows:
States:
The model has three states, called Start, Connected, and End.
Start is an initial state, Connected is an intermediate state, and End is a successful terminal state.
Transitions & Actions:
Transition from Start to Connected: triggered once a client initiates a connection via the MQTT CONNECT packet. While executing the transition, a variable named counter is initialized to 0 and stored in the memory.
Loop from Connected to Connected: triggered by the conditional event in which a client sends a payload via the MQTT PUBLISH packet and the condition isIdentical is true. Consequently, the action that increments the variable counter by one is executed.
Transition from Connected to End: triggered when a client sends an MQTT DISCONNECT packet. The state machine ends in the successful terminal state.
The counter increases every time two identical messages are published consecutively. Note that the function isIdentical is not further introduced here for the sake of simplicity.
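To make the example concrete, here is a minimal Python sketch of DUPLICATE_COUNTER. Two points are assumptions, because the text leaves them open: isIdentical is taken to mean "equal to the previously published payload", and a PUBLISH with a different payload simply leaves the model in Connected. The class and method names are illustrative.

```python
class DuplicateCounter:
    """Sketch of the DUPLICATE_COUNTER behavior model."""

    def __init__(self):
        self.state = "Start"
        self.memory = {}

    def _is_identical(self, payload):
        # Assumed definition: equal to the previously published payload.
        return "previous" in self.memory and self.memory["previous"] == payload

    def on_event(self, event, payload=None):
        if self.state == "Start" and event == "CONNECT":
            self.memory["counter"] = 0         # action: initialize counter
            self.state = "Connected"
        elif self.state == "Connected" and event == "PUBLISH":
            if self._is_identical(payload):    # condition on the loop
                self.memory["counter"] += 1    # action: increment counter
            self.memory["previous"] = payload  # bookkeeping for the next check
        elif self.state == "Connected" and event == "DISCONNECT":
            self.state = "End"                 # successful terminal state
        else:
            raise ValueError(f"unexpected {event} in state {self.state}")
        return self.state


model = DuplicateCounter()
model.on_event("CONNECT")
for payload in ["a", "a", "b", "b", "b"]:
    model.on_event("PUBLISH", payload)
model.on_event("DISCONNECT")
print(model.state, model.memory["counter"])  # End 3
```

The five payloads contain three consecutive identical pairs (a/a, b/b, b/b), so the counter ends at 3.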
The formalism of behavior models allows us to check whether the usage of the MQTT protocol is well-defined. A model may have two outcomes: either the MQTT client implements the state machine correctly (the behavior model ends up in a successful terminal state), or the MQTT client misbehaves (the behavior model ends up in the failed terminal state). Misbehavior in particular can then be handled, e.g., by first making these clients visible in large-scale IoT deployments or by correcting their behavior with additional logic in the state transitions. The latter case is especially interesting for clients that are hard to fix because they cannot be updated.
Implementation in Data Hub
With HiveMQ Platform version 4.20, we made the HiveMQ Data Hub generally available. Alongside data policies, it implements the new behavior policies with pre-defined Data Hub behavior models.
A behavior policy instantiates a behavior model checker for selectable client connections. Each MQTT packet received in the broker is passed to the checker to determine the state transitions of the instantiated behavior model. Moreover, state transitions can cause further actions. Even disconnecting a client or dropping a message is possible when a client misbehaves.
The general view of the architecture is shown below. MQTT packets from a connected client are processed by the broker and checked by the instantiated Behavior Model Checker, which we call the policy engine. The policy engine returns the further action to be taken, for example, drop the message, disconnect the client, or just log a message for further inspection.
As a consequence, each client connection carries some further information regarding the instantiated behavior model’s state and its variables. Remember, we created a counter variable in the example above.
However, an essential aspect is to make client behavior visible. For this purpose, the current state of each client connection can be requested via the REST API — even state variables persisted in the memory are returned for further debugging.
Publish.duplicate
Consider the following behavior model, which is actually available in HiveMQ Data Hub and is similar to the example model presented above. Its intention is to track duplicate messages in practice, using two terminal states.
States | Type | Description |
---|---|---|
Initial | Initial, Non-Terminal | The starting point of the model which is entered as soon as a client is matched by the policy |
Connected | Intermediate, Non-Terminal | The state models that a client has successfully connected to the broker |
NotDuplicated | Intermediate, Non-Terminal | Indicates that either the client has sent its first message or two consecutive messages are different. |
Duplicated | Intermediate, Non-Terminal | Indicates that the client has sent a message which is equal to the previous one. |
Violated | Failure, Terminal | When a client has sent two equal consecutive messages at any point in time and disconnects, the state Violated is the terminal state. |
Disconnected | Success, Terminal | When a client has always sent different consecutive messages and disconnects, the state Disconnected is the terminal state. |
Below, the state machine of the behavior model is shown.
As you can see in the state machine, there is one state indicating whether the client sends duplicate messages: the Duplicated state.
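The state descriptions in the table can be turned into a small simulation. The Python sketch below infers the transitions from those descriptions only. It is not HiveMQ's implementation, and the class name PublishDuplicateModel and its attributes are hypothetical.

```python
class PublishDuplicateModel:
    """Sketch of the Publish.duplicate states, inferred from the table above."""

    TERMINAL = {"Violated", "Disconnected"}

    def __init__(self):
        self.state = "Initial"
        self.previous = None         # last published payload
        self.seen_duplicate = False  # whether Duplicated was ever entered

    def on_event(self, event, payload=None):
        if self.state in self.TERMINAL:
            raise ValueError("model already reached a terminal state")
        if event == "CONNECT" and self.state == "Initial":
            self.state = "Connected"
        elif event == "PUBLISH" and self.state != "Initial":
            # The first message after connecting can never be a duplicate.
            if self.state != "Connected" and payload == self.previous:
                self.state = "Duplicated"
                self.seen_duplicate = True
            else:
                self.state = "NotDuplicated"
            self.previous = payload
        elif event == "DISCONNECT" and self.state != "Initial":
            # Failure if a duplicate occurred at any point, success otherwise.
            self.state = "Violated" if self.seen_duplicate else "Disconnected"
        else:
            raise ValueError(f"unexpected {event} in state {self.state}")
        return self.state


model = PublishDuplicateModel()
model.on_event("CONNECT")
print([model.on_event("PUBLISH", p) for p in [123, 123, 124]])
# ['NotDuplicated', 'Duplicated', 'NotDuplicated']
print(model.on_event("DISCONNECT"))  # Violated
```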
The listing below shows the complete behavior policy ready to be used:
The policy instantiates the behavior model for each connecting client that matches the configuration in the matching field. In this example, the Publish.duplicate behavior model is used. Next, an additional action is defined in the onTransitions field. In this particular case, every time the MQTT client sends a PUBLISH packet that moves the state machine from Any (the wildcard state) into the Duplicated state, the pipeline is executed. Here, the pre-defined Mqtt.drop function is executed, which means the incoming MQTT PUBLISH packet is dropped. Depending on the version of the MQTT client, a reason string is eventually provided to the sending client.
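As a rough illustration of the structure just described, the policy can be assembled as a Python dictionary and serialized to JSON. This is a hedged sketch, not the original listing: only matching, onTransitions, Publish.duplicate, Any, Duplicated, and Mqtt.drop are named in the text. All other keys and values (such as clientIdRegex, fromState, toState, Mqtt.OnInboundPublish, functionId, reasonString, and the ids) are assumptions that should be checked against the HiveMQ Data Hub documentation before use.

```python
import json

# Hypothetical reconstruction of a behavior policy. Field names marked
# "assumed" are not confirmed by the article text.
policy = {
    "id": "drop-duplicates",                  # hypothetical policy id
    "matching": {"clientIdRegex": ".*"},      # assumed: match every client
    "behavior": {"id": "Publish.duplicate"},  # behavior model named in the text
    "onTransitions": [
        {
            "fromState": "Any.*",             # assumed spelling of the wildcard state
            "toState": "Duplicated",
            "Mqtt.OnInboundPublish": {        # assumed event key for PUBLISH packets
                "pipeline": [
                    {
                        "id": "drop-duplicate",     # hypothetical step id
                        "functionId": "Mqtt.drop",  # function named in the text
                        "arguments": {
                            # assumed argument carrying the reason string
                            "reasonString": "Duplicate message dropped."
                        },
                    }
                ]
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```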
This behavior policy is ready to be used in your IoT deployments with HiveMQ Data Hub to avoid message duplicates on the consumer side.
You may also use the HiveMQ Control Center to create the policy above. Please see the screenshot below showing the relevant part of the behavior policy to create it in your HiveMQ Control Center.
Showcase
In this section, we demonstrate how clients behave when the behavior policy is registered in HiveMQ Data Hub. The animated illustration below shows three parts:
The top-left console shows an MQTT client publishing data to the HiveMQ Broker using the MQTT CLI. The client connects to the broker with the clientId “testclient”. Once a connection is established, messages are published to the topic “test” in the following order:
{ "temperature": 123 }
{ "temperature": 123 }
{ "temperature": 124 }
{ "temperature": 124 }
{ "temperature": 123 }
The bottom-left console shows an MQTT client that subscribes to the wildcard topic to consume all data. As you can see in the output, the client only consumes distinct consecutive messages:
{ "temperature": 123 }
{ "temperature": 124 }
{ "temperature": 123 }
The right console shows the state requested from the REST API for the client “testclient”. Every time the client sends a duplicate message, the behavior model moves into the Duplicated state. The response also shows further information.
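The effect seen by the subscriber can be reproduced offline with a few lines of Python. This is a sketch of the observable behavior only, assuming each payload is compared with the one the client published immediately before it. The function name is illustrative and nothing here talks to a broker.

```python
def forward_without_consecutive_duplicates(payloads):
    """Return the payloads a subscriber would see if duplicates are dropped."""
    delivered = []
    previous = object()  # sentinel that never equals a real payload
    for payload in payloads:
        if payload != previous:
            delivered.append(payload)
        previous = payload
    return delivered


published = [
    {"temperature": 123},
    {"temperature": 123},
    {"temperature": 124},
    {"temperature": 124},
    {"temperature": 123},
]
print(forward_without_consecutive_duplicates(published))
# [{'temperature': 123}, {'temperature': 124}, {'temperature': 123}]
```

The last message is delivered even though its value was seen earlier, because only consecutive duplicates are dropped.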
Practical Implications
The behavior model Publish.duplicate generally offers advantages in several dimensions, as listed below:
Improves Efficiency: Reducing unnecessary data transmission enhances network efficiency, ensuring that only relevant, timely data is communicated.
Lowers Operational Costs: Less data transmission means lower bandwidth usage and potentially reduced costs associated with data storage and processing.
Enhances System Performance: Systems and networks are less burdened, leading to improved performance and reliability.
Facilitates Better Data Management: With more streamlined data flows, it becomes easier to manage, analyze, and leverage data effectively.
Supports Scalability: Efficient data transmission is crucial for scalability, especially in growing IoT networks where the number of devices and data points can exponentially increase.
Conclusion
Integrating behavior models into MQTT communication is essential for optimizing IoT deployments. The introduction of finite state machines ensures systematic validation of MQTT clients, enforcing predefined sequences and standards. The DUPLICATE_COUNTER behavior model exemplifies practical application, tracking conditions like duplicate message occurrences.
As the IoT landscape advances, leveraging behavior models in MQTT emerges as a proactive strategy for ensuring reliability, efficiency, and cost-effectiveness in large-scale IoT scenarios, marking a significant stride in achieving well-defined and optimized MQTT-based IoT deployments.
Stefan Frehse
Stefan Frehse is Senior Engineering Manager at HiveMQ. He earned a Ph.D. in Computer Science from the University of Bremen and has worked in software engineering and in c-level management positions for 10 years. He has written many academic papers and spoken on topics including formal verification of fault tolerant systems, debugging and synthesis of reversible logic.