Managing IoT Device State Within MQTT

by Magnus McCune
14 min read

As IoT solutions scale, knowing the current state of client devices and services stops being a nice-to-have feature and becomes a necessity for efficiency and responsiveness. A common yet flawed approach to this recurring architectural challenge is to build an external system, some form of dedicated ‘device state service’, to store and query that information. While functional, this method introduces its own complexities and dependencies, making it less than ideal. Fortunately, the flexible and open nature of the MQTT protocol, together with fully compliant MQTT brokers like HiveMQ, provides a reliable and scalable way to implement device state discovery with no external services required.

Identifying the Antipattern in IoT Device Connectivity

Let’s first consider the commonly applied solution of a Device State Service, an external function that might retain the current state of devices in a persistent database and make available an API for querying that state. While initially this seems like a reasonable and functional solution to a common challenge when working with distributed IoT solutions, it should be considered an antipattern.

Antipatterns are frequently used solutions to common architectural challenges that appear to be effective but can lead to negative consequences.

Maintaining a separate system adds operational overhead, a higher risk of data inconsistency, increased latency, and a new dependency, all of which can impede the effectiveness of IoT solutions.

This solution might work well during development and testing but fail to scale with production workloads.

Proposing an MQTT-centric Solution

Where possible, we look to identify MQTT-native techniques to solve these recurring architectural challenges and document proven patterns that operate at scale. Implementing device state by leveraging the existing capabilities of MQTT assures us that the solution will scale along with our broker. 

It’s worth noting that specifications built on top of MQTT, such as Eclipse’s Sparkplug specification for Industrial IoT, have considerations built into them for addressing device state (see Sparkplug Session State Management). As these specifications are a collection of design & architectural choices defined on top of MQTT, it is possible to pick elements that we need for other projects without having to adopt the complete specification. HiveMQ has an excellent whitepaper on building a specification on top of MQTT that highlights this approach.

A simple yet highly effective and scalable approach to implementing device state in MQTT is to combine a device-specific topic, retained messages, and the Last Will and Testament feature of MQTT. When a device connects to the broker, it registers a Will: a message with a payload indicating an "offline" status, which the broker publishes to a device-specific topic if the device disconnects ungracefully. The Will should be flagged as retained so that it can replace the retained "online" status described next.

After connecting, the device publishes a retained message to that same device-specific topic, with a payload that indicates an "online" status. The message indicating the "online" status is retained by the broker and pushed to any subscribers of that topic, including those that may subscribe in the future (Diagram 1). If the device were to disconnect ungracefully, the broker would then publish the Will message to the device-specific topic, superseding the previously retained message, and pushing it to any subscribers (Diagram 2).

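[Diagram 1: a device connects with a Will and publishes a retained "online" status. Diagram 2: after an ungraceful disconnect, the broker publishes the Will with the "offline" status.]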

A Scalable Solution to Seamless IoT Device State Management

Let’s take a more detailed look at how this solution works by investigating each component.

Device-Specific Topics & Wildcard Subscriptions

For a solution like this to work, and as a general best practice, design topic structures that include the device’s unique identifier at some level of the topic hierarchy. This enables many use cases and design patterns that are critical to building a scalable IoT solution: state management (the subject of this post), granular authorization, device commands, Over-The-Air firmware updates, and more all benefit from being able to identify a device by its topic. Examples include using the Vehicle Identification Number in the topic structure of a connected car solution (cc/v1/uniqueVIN/state) or the Edge Node Descriptor of Sparkplug in IIoT settings, which is a combination of the Group ID and the EoN Node ID (spBv1.0/groupID/NBIRTH/eonNodeID).

To increase the effectiveness of this technique, a service that needs to know the current state of every device in the hierarchy can use a wildcard subscription. In our earlier connected car example, a subscriber to the topic filter cc/v1/+/state would have the state changes of all vehicles pushed to it.
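To make the wildcard behavior concrete, here is a minimal sketch of how a topic filter is compared against a concrete topic, level by level. The function name topic_matches is hypothetical, and the sketch deliberately ignores details such as topics that begin with $; a real broker or client library performs this matching for you.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name.

    Simplified sketch: handles '+' (single level) and '#' (multi level),
    but not special cases such as topics beginning with '$'.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches everything from here down.
            return True
        if index >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if level != "+" and level != topic_levels[index]:
            # A literal level must match exactly; '+' matches any one level.
            return False

    # Without '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# Every vehicle's state topic matches the single-level wildcard filter.
matches_state = topic_matches("cc/v1/+/state", "cc/v1/VIN123/state")
```

Because + matches exactly one level, the same filter does not match cc/v1/VIN123/telemetry or any deeper topic under a vehicle's state level.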

MQTT Retained Messages

To ensure that the device’s state is available to new subscribers, even if they weren’t subscribed at the time of the state change, we use retained messages in conjunction with our device-specific topics. The MQTT protocol allows only one retained message per topic, which ensures that only the most recently published state is held. Any client that subscribes to that device’s topic, or to a wildcard topic filter that matches it, receives the retained message, even if it subscribes long after the message was published. The device itself is responsible for publishing its “online” status after it makes a successful connection, while the “offline” status is handled by the broker through the Will mechanism.
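The one-retained-message-per-topic rule can be illustrated with a toy, in-memory model. This is only a sketch of the behavior, not how any particular broker is implemented: the RetainedStore class and its method names are hypothetical, and it handles exact topics only.

```python
from typing import Dict, Optional


class RetainedStore:
    """Toy model of a broker's retained-message table (exact topics only)."""

    def __init__(self) -> None:
        self._retained: Dict[str, bytes] = {}

    def publish(self, topic: str, payload: bytes, retain: bool) -> None:
        if not retain:
            # Non-retained messages are delivered but never stored.
            return
        if payload == b"":
            # A retained message with an empty payload clears the topic.
            self._retained.pop(topic, None)
        else:
            # One retained message per topic: the newest replaces the old.
            self._retained[topic] = payload

    def subscribe(self, topic: str) -> Optional[bytes]:
        """Return what a brand-new subscriber would receive immediately."""
        return self._retained.get(topic)


store = RetainedStore()
store.publish("cc/v1/VIN123/state", b"online", retain=True)
store.publish("cc/v1/VIN123/state", b"offline", retain=True)

# A client that subscribes only now still learns the latest state.
late_subscriber_sees = store.subscribe("cc/v1/VIN123/state")
```

The late subscriber receives only the most recent value, which is exactly the property that makes retained messages suitable for representing current state rather than history.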

MQTT Will Message

To update the status of a device that has been disconnected, perhaps due to network failure, broker action, or another ungraceful disconnection, we rely on the Will mechanism of MQTT, also known as Last Will and Testament (LWT). As part of the initial CONNECT packet, a device can set the Will flag and supply the Will topic, its QoS level, whether it should be retained, and the payload. In the event of an ungraceful disconnection, the broker must publish the Will message to the defined topic, from where it is pushed to all subscribers. Provided the Will was flagged as retained, its “offline” payload then supersedes any previous state. If the device reconnects, it publishes its “online” status, which in turn supersedes the Will message.
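The full lifecycle can be walked through with a small in-memory stand-in for the broker. This is a sketch for illustration only: ToyBroker, Will, and the method names are hypothetical, no network traffic is involved, and the model covers nothing beyond retained state and stored Wills.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Will:
    topic: str
    payload: str
    retain: bool = True


class ToyBroker:
    """In-memory stand-in for a broker; models retained state and Wills only."""

    def __init__(self) -> None:
        self.retained: Dict[str, str] = {}
        self._wills: Dict[str, Will] = {}

    def connect(self, client_id: str, will: Optional[Will] = None) -> None:
        # The Will arrives with CONNECT and is stored by the broker.
        if will is not None:
            self._wills[client_id] = will

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            self.retained[topic] = payload

    def connection_lost(self, client_id: str) -> None:
        # Ungraceful disconnect: the broker publishes the stored Will.
        will = self._wills.pop(client_id, None)
        if will is not None:
            self.publish(will.topic, will.payload, retain=will.retain)

    def disconnect(self, client_id: str) -> None:
        # Normal DISCONNECT: the Will is discarded without being published.
        self._wills.pop(client_id, None)


topic = "cc/v1/VIN123/state"
broker = ToyBroker()

broker.connect("VIN123", Will(topic, "offline"))
broker.publish(topic, "online", retain=True)
state_while_connected = broker.retained[topic]

broker.connection_lost("VIN123")
state_after_drop = broker.retained[topic]
```

Note the disconnect method: after a normal DISCONNECT the Will is never published, so in this model the retained "online" status would remain in place unless the device publishes its own "offline" status first.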

Further Enhancements & Advanced Use Cases

Thus far, the examples have used a simple “online” or “offline” status as the payload for the state messages. However, these could be enhanced to implement additional functionality. Including a timestamp with the Will payload could provide useful context. Similarly, adding a reason code for the ungraceful disconnection could help in troubleshooting. Having the device publish a graceful disconnect message before it sends its DISCONNECT packet would distinguish planned from unplanned disconnections. Each of these would require some client-side code or a broker-side extension to implement, as we have done with the HiveMQ Sparkplug Aware extension.

An enhanced "offline" status message might look like this:

{
  "status": "offline",
  "reason": "client DISCONNECT",
  "timestamp": 1704388031
}
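On the client side, a payload of this shape could be produced with a small helper like the sketch below. The function name build_offline_payload is hypothetical, and the field names simply mirror the example above. Keep in mind that a Will payload is fixed when the device connects, so a helper like this is best suited to the message a device publishes itself just before a graceful disconnect; stamping the actual time of an ungraceful disconnect would need broker-side support.

```python
import json
import time
from typing import Optional


def build_offline_payload(reason: str, timestamp: Optional[int] = None) -> str:
    """Serialize an enhanced "offline" status message as JSON.

    If no timestamp is supplied, the current Unix time (in seconds) is used.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return json.dumps(
        {"status": "offline", "reason": reason, "timestamp": timestamp}
    )


# Reproduce the example message with a fixed timestamp.
payload = build_offline_payload("client DISCONNECT", timestamp=1704388031)
```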

Similarly, the “online” status payload could include additional information about the device, such as a firmware version, model number, or even a payload schema. The Sparkplug B specification is a fantastic example of these features being implemented in a robust and scalable manner. You can read more about how they are implemented with Sparkplug here:

HiveMQ Sparkplug Essentials - Session State Management  

HiveMQ Sparkplug Essentials - Payload Structures  

HiveMQ Sparkplug Essentials - Operational Behavior  

Another advanced use case arises when multiple HiveMQ clusters are in use and the state of devices must be replicated across them. With the MQTT-native solution described here, this replication is possible without additional complication, thanks to offerings like the HiveMQ Enterprise Bridge Extension. Publishing an identifier for the connected cluster as part of the “online” status payload could further enhance this use case.
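As a rough illustration of an enriched “online” payload, the sketch below adds device details and an optional cluster identifier. Everything here is an assumption made for the example: the function name build_online_payload, the field names firmware, model, and cluster, and the sample values are not taken from HiveMQ or Sparkplug.

```python
import json
from typing import Dict, Optional


def build_online_payload(
    firmware: str, model: str, cluster_id: Optional[str] = None
) -> str:
    """Serialize an enhanced "online" status message as JSON."""
    body: Dict[str, str] = {
        "status": "online",
        "firmware": firmware,
        "model": model,
    }
    if cluster_id is not None:
        # Hypothetical field naming the cluster the device is connected to.
        body["cluster"] = cluster_id
    return json.dumps(body)


# Sample values are placeholders, not real product identifiers.
online_payload = build_online_payload("1.4.2", "TX-100", cluster_id="eu-west")
```

Leaving the cluster field out when it is not known keeps single-cluster deployments from carrying an empty value in every status message.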

Conclusion

Adopting an MQTT-centric approach for managing device states offers a more integrated, near-real-time, and scalable solution compared to external systems. This strategy not only simplifies the architecture but also enhances the reliability and responsiveness of IoT systems. We encourage IoT developers and architects to explore this approach in their solutions, and we welcome any feedback or questions on this topic.

Magnus McCune

Magnus is a Senior IoT Solutions Architect on the Professional Services team at HiveMQ. He is a passionate technologist with a proven background solving complex business and technical challenges through the design, implementation and operationalization of cloud and edge technologies. His expertise extends to network, cloud, & infrastructure architecture, cloud-native solutions design and large-scale automation projects.
