
Managing IoT Device State Within MQTT

by Magnus McCune

As IoT solutions scale, knowing the current state of client devices and services stops being a feature and becomes a necessity for efficiency and responsiveness. A common yet flawed approach to this recurring architectural challenge is to build an external system, some form of dedicated ‘device state service’, for storing and querying that information. While functional, this method introduces its own complexities and dependencies, making it less than ideal. Fortunately, the flexible and open nature of the MQTT protocol, together with fully compliant MQTT brokers like HiveMQ, provides a reliable and scalable way to implement device state discovery with no external services required.

Identifying the Antipattern in IoT Device Connectivity

Let’s first consider the commonly applied solution of a Device State Service, an external function that might retain the current state of devices in a persistent database and make available an API for querying that state. While initially this seems like a reasonable and functional solution to a common challenge when working with distributed IoT solutions, it should be considered an antipattern.

Antipatterns are frequently used solutions to common architectural challenges that appear to be effective but can lead to negative consequences.

The overhead of maintaining separate systems, higher risk of data inconsistency, and increased latency — not to mention introducing a new dependency — can impede the effectiveness of IoT solutions. 

This solution might work well during development and testing but fail to scale with production workloads.

Proposing an MQTT-centric Solution

Where possible, we look to identify MQTT-native techniques to solve these recurring architectural challenges and document proven patterns that operate at scale. Implementing device state by leveraging the existing capabilities of MQTT assures us that the solution will scale along with our broker. 

It’s worth noting that specifications built on top of MQTT, such as Eclipse’s Sparkplug specification for Industrial IoT, have considerations built into them for addressing device state (see Sparkplug Session State Management). As these specifications are a collection of design & architectural choices defined on top of MQTT, it is possible to pick elements that we need for other projects without having to adopt the complete specification. HiveMQ has an excellent whitepaper on building a specification on top of MQTT that highlights this approach.

A simple yet highly effective and scalable approach to implementing device state in MQTT is to combine a device-specific topic, retained messages, and the Last Will and Testament feature of MQTT. When a device connects to the broker, it includes a Will: a message with a payload indicating an "offline" status, which the broker publishes to a device-specific topic if the device disconnects ungracefully.

After connecting, the device publishes a retained message to that same device-specific topic, with a payload that indicates an "online" status. The message indicating the "online" status is retained by the broker and pushed to any subscribers of that topic, including those that may subscribe in the future (Diagram 1). If the device were to disconnect ungracefully, the broker would then publish the Will message to the device-specific topic, superseding the previously retained message, and pushing it to any subscribers (Diagram 2).
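The client side of this pattern can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the topic layout `devices/{device_id}/state` and the helper names are hypothetical, and the client object is only assumed to expose `will_set` and `publish` methods with signatures like those of the Eclipse Paho Python client. A small recording stand-in is used so the sketch runs without a broker.

```python
STATE_TOPIC = "devices/{device_id}/state"  # hypothetical topic layout


def state_topic(device_id: str) -> str:
    """Build the device-specific state topic."""
    return STATE_TOPIC.format(device_id=device_id)


def register_will(client, device_id: str) -> None:
    """Call before connecting: the Will travels with the connection request."""
    client.will_set(state_topic(device_id), payload="offline", qos=1, retain=True)


def announce_online(client, device_id: str) -> None:
    """Call once the broker has acknowledged the connection."""
    client.publish(state_topic(device_id), payload="online", qos=1, retain=True)


class RecordingClient:
    """Stand-in (assumption) that records calls so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def will_set(self, topic, payload=None, qos=0, retain=False):
        self.calls.append(("will_set", topic, payload, qos, retain))

    def publish(self, topic, payload=None, qos=0, retain=False):
        self.calls.append(("publish", topic, payload, qos, retain))


client = RecordingClient()
register_will(client, "device-42")
# A real client would connect here and wait for the broker's acknowledgement.
announce_online(client, "device-42")
```

Both messages use the same topic and set the retain flag, so whichever was published last is the one a late subscriber receives.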


A Scalable Solution to Seamless IoT Device State Management

Let’s take a more detailed look at how this solution works by investigating each component.

Device-Specific Topics & Wildcard Subscriptions

For a solution like this to work, and as a general best practice, it is recommended to design topic structures that include the device’s unique identifier at some level of the topic hierarchy. This enables many use cases and design patterns that are critical to building a scalable IoT solution. Whether for state management, as this blog post covers, or for granular authorization, device commands, or over-the-air firmware updates, being able to uniquely identify a device from the topic hierarchy is valuable. Examples might include using the Vehicle Identification Number in the topic structure of a connected car solution (cc/v1/uniqueVIN/state) or the Edge Node Descriptor of Sparkplug in IIoT settings, which is a combination of the Group ID and the Edge Node ID (spBv1.0/groupID/NBIRTH/eonNodeID).

To increase the effectiveness of this technique, wildcard subscriptions can be used if a given service needs to be aware of the current state of all devices in the hierarchy. In our earlier connected car example, a subscriber to the topic cc/v1/+/state would have the state changes of ALL vehicles pushed to it. 
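To make the wildcard behavior concrete, here is a small sketch of how a topic filter is compared with a topic name, level by level. The function name `topic_matches` is made up for illustration; a broker performs this matching itself. The sketch also ignores the special handling of topics that begin with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level of the filter,
            # and it matches the parent level plus everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            # "+" matches exactly one level; anything else must match literally.
            return False
    return len(filter_levels) == len(topic_levels)


# The connected car filter matches the state topic of any vehicle...
print(topic_matches("cc/v1/+/state", "cc/v1/VIN123/state"))
# ...but not other topics published by the same vehicle.
print(topic_matches("cc/v1/+/state", "cc/v1/VIN123/telemetry"))
```

Because `+` matches exactly one level, the device identifier has to sit at a fixed depth in the hierarchy for a filter like cc/v1/+/state to cover every vehicle.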

MQTT Retained Messages

In order to ensure that the device’s state is available to new subscribers, even if they weren’t subscribed at the time of the change in state, we use retained messages in conjunction with our device-specific topics. The MQTT protocol only allows for one message to be retained per topic, which ensures that only the most recently updated state is held. Any clients that subscribe to that device’s topic, or a wildcard topic that contains it — even after the initial message has been published — will receive the retained message. The device itself is responsible for publishing its “online” status after it makes a successful connection while the “offline” status is handled by the broker, through the Will mechanism.
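The retained-message behavior described above can be modeled with a toy in-memory store. This is only a sketch of the semantics, not how any real broker is implemented; the class `RetainedStore` and its methods are hypothetical, and the filter matching only supports the single-level `+` wildcard.

```python
def _matches(topic_filter: str, topic: str) -> bool:
    """Simplified matching: literal levels and the single-level '+' wildcard."""
    f, t = topic_filter.split("/"), topic.split("/")
    return len(f) == len(t) and all(a in ("+", b) for a, b in zip(f, t))


class RetainedStore:
    """Toy model of a broker's retained messages: one payload per topic."""

    def __init__(self):
        self._retained = {}

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if not retain:
            return  # non-retained messages are not stored for later subscribers
        if payload == "":
            # A retained message with an empty payload clears the topic.
            self._retained.pop(topic, None)
        else:
            # A newer retained message replaces the previous one.
            self._retained[topic] = payload

    def subscribe(self, topic_filter: str) -> dict:
        """Return what a new subscriber would receive immediately."""
        return {t: p for t, p in self._retained.items() if _matches(topic_filter, t)}


store = RetainedStore()
store.publish("cc/v1/VIN1/state", "online", retain=True)
store.publish("cc/v1/VIN2/state", "online", retain=True)
store.publish("cc/v1/VIN1/state", "offline", retain=True)  # supersedes "online"

# A subscriber that arrives late still sees the latest state of every vehicle.
print(store.subscribe("cc/v1/+/state"))
```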

MQTT Will Message

To update the status of a device that has been disconnected, perhaps due to network failure, broker action, or other ungraceful disconnection, we rely on the Will mechanism of MQTT, also known as Last Will and Testament (LWT). As part of the initial CONNECT packet, a device can set the Will flag and specify the topic for the Will message, its QoS level, whether it should be retained, and its payload. In the event of an ungraceful disconnection, the broker must publish the Will message to the defined topic, from where it is pushed to all subscribers. Because the Will message is retained, its payload indicating the “offline” status of the device supersedes any previous state. If the device reconnects, it publishes its “online” status, which in turn supersedes the Will message.
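The broker's side of the Will mechanism can be illustrated with a small simulation. This is a simplified model under stated assumptions: `TinyBroker` and `Will` are hypothetical names, a disconnect is reduced to a graceful or ungraceful flag, and features such as Will delay or session expiry are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Will:
    topic: str
    payload: str
    retain: bool = True


class TinyBroker:
    """Toy broker: stores Wills per client and retained payloads per topic."""

    def __init__(self):
        self.retained = {}
        self.wills = {}

    def connect(self, client_id: str, will: Optional[Will] = None) -> None:
        if will is not None:
            self.wills[client_id] = will  # the Will arrives with the connection

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            self.retained[topic] = payload

    def disconnect(self, client_id: str, graceful: bool) -> None:
        will = self.wills.pop(client_id, None)
        # A clean DISCONNECT discards the Will; only an ungraceful
        # disconnection causes the broker to publish it.
        if will is not None and not graceful:
            self.publish(will.topic, will.payload, will.retain)


broker = TinyBroker()
topic = "cc/v1/VIN1/state"

broker.connect("VIN1", Will(topic, "offline"))
broker.publish(topic, "online", retain=True)
broker.disconnect("VIN1", graceful=False)  # e.g. network failure
state_after_failure = broker.retained[topic]

broker.connect("VIN1", Will(topic, "offline"))
broker.publish(topic, "online", retain=True)  # reconnect supersedes the Will
broker.disconnect("VIN1", graceful=True)
state_after_clean_disconnect = broker.retained[topic]
```

Note the second case: after a clean disconnect the Will is discarded, so the retained state remains "online" unless the device publishes its own "offline" status before disconnecting.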

Further Enhancements & Advanced Use Cases

Thus far, examples have used a simplistic “online” or “offline” status as the payload for the state messages. However, these could be enhanced to implement additional functionality. A mechanism to include a timestamp with the Will payload could provide some useful context. Similarly, adding a reason code for the ungraceful disconnection could help in troubleshooting. Adding a mechanism for the device to publish a graceful disconnect message prior to sending its DISCONNECT notification would distinguish between planned and unplanned disconnections. Each of these would require some client-side code or a broker-side extension to implement, as we have done with the HiveMQ Sparkplug Aware extension.  

An enhanced "offline" status message might look like this:

{
  "status": "offline",
  "reason": "client DISCONNECT",
  "timestamp": 1704388031
}
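A payload like this could be produced and consumed with a couple of small helpers. The sketch below is one possible shape, assuming JSON encoding and the three field names shown above; the function names `offline_payload` and `parse_state` are made up for illustration.

```python
import json
import time
from typing import Optional


def offline_payload(reason: str, timestamp: Optional[int] = None) -> str:
    """Serialize an enhanced "offline" status as JSON."""
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    return json.dumps({"status": "offline", "reason": reason, "timestamp": timestamp})


def parse_state(payload: str) -> dict:
    """Decode a state payload and reject unknown status values."""
    state = json.loads(payload)
    if state.get("status") not in ("online", "offline"):
        raise ValueError(f"unexpected status: {state.get('status')!r}")
    return state


message = offline_payload("client DISCONNECT", timestamp=1704388031)
state = parse_state(message)
print(state["status"], state["reason"], state["timestamp"])
```

Keep in mind that a Will payload is fixed when the client connects, so a timestamp generated on the device at that moment reflects the connection time, not the time of the disconnect. Recording the actual disconnect time is one reason a broker-side extension may be needed.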

Similarly, the “online” status payload could include additional information about the device, such as a firmware version, model number, or even a payload schema. The Sparkplug B specification is a fantastic example of these features being implemented in a robust and scalable manner. You can read more about how they are implemented with Sparkplug here:

HiveMQ Sparkplug Essentials - Session State Management  

HiveMQ Sparkplug Essentials - Payload Structures  

HiveMQ Sparkplug Essentials - Operational Behavior  

Another advanced use case arises when multiple HiveMQ clusters are in use and the state of devices must be replicated across broker clusters. With the MQTT-native solution described here, this replication is possible without additional complication, thanks to offerings like the HiveMQ Enterprise Bridge Extension. Publishing an identifier for the connected cluster as part of the “online” status payload could further enhance this use case.

Conclusion

Adopting an MQTT-centric approach for managing device states offers a more integrated, near-real-time, and scalable solution compared to external systems. This strategy not only simplifies the architecture but also enhances the reliability and responsiveness of IoT systems. We encourage IoT developers and architects to explore this approach in their solutions, and we welcome any feedback or questions on this topic.

Magnus McCune

Magnus is a Senior IoT Solutions Architect on the Professional Services team at HiveMQ. He is a passionate technologist with a proven background solving complex business and technical challenges through the design, implementation and operationalization of cloud and edge technologies. His expertise extends to network, cloud, & infrastructure architecture, cloud-native solutions design and large-scale automation projects.
