Exploring MQTT Sparkplug Session State Management in IIoT Communication

by Kudzai Manditereza

In the last blog, Understanding MQTT Sparkplug Topic Namespace in IIoT Architectures, we explored the challenges of disparate MQTT topic formats in IIoT and how Sparkplug standardizes the MQTT topic namespace. This post explains the core concepts of MQTT Sparkplug session state management and why they matter in IIoT systems.

In IIoT ecosystems, multiple devices and applications often work together to achieve a specific task or sequence of operations. The synchronization between the devices becomes possible only if there’s a real-time mechanism to track each device’s state or status. For example, if a sensor goes offline without a mechanism in place to notify other systems on the Industrial IoT (IIoT) network, some machines might continue operating under potentially damaging conditions, leading to wear or equipment failure.

By managing the state of devices, network participants can quickly determine if a device is active, if it’s waiting for a particular message, or if it has faced a disruption. This information is invaluable in IIoT systems, where state awareness across multiple devices is essential for coordinated operations.

In the MQTT protocol, there’s a concept of continuous session awareness. However, its native capabilities are not only underutilized, but they also fall short in addressing the nuanced requirements of IIoT applications. Let’s understand why.

How MQTT Sparkplug Enhances Session State Management in MQTT

MQTT provides a feature known as the “Last Will and Testament” (LWT), which is a message that’s sent by the MQTT broker if it detects that a client has disconnected unexpectedly. However, this mechanism can be insufficient in complex IIoT systems where more comprehensive metadata about the disconnection or device failure might be required. In addition, MQTT doesn’t have a built-in method for devices to announce their available metrics or to accept remote configurations.

One of the primary goals of the MQTT Sparkplug specification is to harness MQTT’s inherent continuous session awareness feature for real-time SCADA/IIoT applications. Sparkplug builds on the LWT concept with what it calls a “death certificate”: a DEATH message published on a standardized topic that identifies exactly which edge node or device has gone offline, allowing other participants in the network to take more informed actions in case of a disconnection.
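As a rough illustration of how a death certificate rides on top of LWT, the sketch below builds the will message an edge node might register when it connects. The `WillMessage` class, the `build_ndeath_will` helper, and the `Plant1` and `Line4Gateway` identifiers are hypothetical names chosen for this example, and JSON stands in for the Protocol Buffers encoding that Sparkplug B payloads actually use.

```python
import json
from dataclasses import dataclass

NAMESPACE = "spBv1.0"  # Sparkplug B topic namespace


@dataclass(frozen=True)
class WillMessage:
    """The will fields an MQTT client hands to the broker when it connects."""
    topic: str
    payload: bytes
    qos: int
    retain: bool


def build_ndeath_will(group_id: str, edge_node_id: str, bd_seq: int) -> WillMessage:
    """Build the death certificate an edge node registers as its will."""
    topic = f"{NAMESPACE}/{group_id}/NDEATH/{edge_node_id}"
    # Assumption: JSON is used here only for readability; a real Sparkplug B
    # payload is protobuf-encoded.
    body = {"metrics": [{"name": "bdSeq", "value": bd_seq}]}
    return WillMessage(topic=topic, payload=json.dumps(body).encode(), qos=1, retain=False)


will = build_ndeath_will("Plant1", "Line4Gateway", bd_seq=0)
print(will.topic)
```

With a client library such as Eclipse Paho, these four fields would typically be passed to the client's will-setting call (for example `will_set` in the Python client) before connecting, so that the broker publishes the NDEATH on the node's behalf if the connection is lost.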

Additionally, when Sparkplug-enabled devices connect to an MQTT network, they announce their available metrics with what is called a “birth certificate,” allowing other devices and applications to be aware of the kind of data they can request or expect. Moreover, Sparkplug also allows for remote configuration of devices, making it easier to adjust device settings in response to changing operational requirements.

As a result, Sparkplug introduces stateful awareness for all nodes in the MQTT network. An application that connects to the MQTT broker can establish the current state of all relevant devices from their birth and death certificates, ensuring full state awareness. Let’s look at these concepts in detail.

Core Concepts of Sparkplug Session State Management

Birth Messages for Devices (DBIRTH) and Edge Nodes (NBIRTH)

Before diving into the specifics of DBIRTH and NBIRTH, it’s crucial to understand the broader concept of BIRTH messages in the Sparkplug ecosystem. BIRTH messages serve as announcements. When a device or node comes online and begins its operation in the network, it sends out a BIRTH message. This message communicates its status, capabilities, and metadata, ensuring that other devices or nodes in the network are aware of its presence.

DBIRTH: Device BIRTH Messages

Introduction of a Device: The primary role of the DBIRTH message is to signal the introduction of a new device within the Sparkplug network. It acts as the device’s initial handshake, letting the network know that the device is now online and operational.

Communicating Metrics and Properties: Alongside announcing the device’s operational status, DBIRTH messages also carry a payload that communicates the device’s metrics, properties, and configurations. This can include data like device type, version, operational parameters, and other crucial metadata.

Session Initialization: The sending of a DBIRTH message also indicates the beginning of a session for the device. It sets the stage for the subsequent DDATA messages that the device will send during its operation, establishing a context for those messages.

Figure: Birth certificates in MQTT Sparkplug

NBIRTH: Edge Node BIRTH Messages

Announcing Edge Node Activation: Just as DBIRTH messages do for devices, NBIRTH messages announce an edge node’s activation. These are crucial in hierarchical structures where edge nodes might manage or communicate with multiple devices.

Edge Node Capabilities and Metadata: The NBIRTH message communicates not just the operational status but also the capabilities of the edge node. This can include information regarding the software it’s running, its version, configurations, and other metadata that might be essential for the devices or applications communicating with it.

Re-establishing State After Disconnection: If an edge node becomes disconnected and then reconnects, the NBIRTH message plays a crucial role in re-establishing its state within the network. By sending an NBIRTH message upon reconnection, the node ensures that any state or configuration changes that occurred during the disconnection are broadcast and synchronized across the network.
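The birth sequence described above can be sketched as a small helper that produces one NBIRTH for the edge node followed by one DBIRTH per attached device. The `birth_messages` function, the metric names, and the identifiers are assumptions made for illustration, and plain dictionaries stand in for real protobuf-encoded Sparkplug payloads.

```python
from typing import Any, Dict, List, Tuple

NAMESPACE = "spBv1.0"  # Sparkplug B topic namespace

Message = Tuple[str, Dict[str, Any]]


def birth_messages(
    group_id: str,
    edge_node_id: str,
    node_metrics: Dict[str, Any],
    devices: Dict[str, Dict[str, Any]],
) -> List[Message]:
    """Return (topic, payload) pairs: the NBIRTH first, then one DBIRTH per device."""
    messages: List[Message] = [
        (f"{NAMESPACE}/{group_id}/NBIRTH/{edge_node_id}", {"metrics": dict(node_metrics)})
    ]
    for device_id, metrics in devices.items():
        topic = f"{NAMESPACE}/{group_id}/DBIRTH/{edge_node_id}/{device_id}"
        # Each DBIRTH lists every metric the device will later report in DDATA.
        messages.append((topic, {"metrics": dict(metrics)}))
    return messages


births = birth_messages(
    "Plant1",
    "Line4Gateway",
    node_metrics={"Firmware Version": "1.2.0"},
    devices={"Filler": {"Temperature": 21.5, "Running": True}},
)
for topic, payload in births:
    print(topic, payload)
```

Publishing the node's birth before any device birth mirrors the hierarchy: consumers learn about the edge node first and can then attach each device to it.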

Death Messages for Devices (DDEATH) and Edge Nodes (NDEATH)

Among the MQTT state management features introduced by Sparkplug, DEATH messages for devices (DDEATH) and edge nodes (NDEATH) play a crucial role in managing the lifecycle and session of these entities. Broadly, DEATH messages are the direct opposite of BIRTH messages. While BIRTH messages announce the online status and capabilities of devices or edge nodes, DEATH messages signal their unavailability or offline status.

DDEATH: Device DEATH Messages

Signaling Device Unavailability: At its core, the DDEATH message indicates that a device is no longer available within the Sparkplug network. This could be due to intentional shutdowns, malfunctions, or network disconnections.

Contextual Payload: The DDEATH message, while primarily indicating device unavailability, may also carry a payload. This payload can provide context about the reason for the device’s disconnection, such as error codes, last known metrics, or any other relevant metadata.

Session Termination: When a DDEATH message is sent, it effectively communicates the end of a device’s session. It helps ensure that other entities in the network don’t await data or communication from the now-unavailable device.

Figure: Death certificates in MQTT Sparkplug

NDEATH: Edge Node DEATH Messages

Announcing Edge Node Unavailability: Analogous to the DDEATH message for devices, the NDEATH message announces the offline status or unavailability of an edge node within the network.

Edge Node Metadata: Similar to the device counterpart, the NDEATH message can provide insights into the reason for the node’s disconnection. This might include data about the last known state, error messages, or other pertinent information.

Impact on Associated Devices: Given that edge nodes often manage or facilitate communication for multiple devices, an NDEATH message has broader implications. Every device behind the node becomes unreachable at the same moment, so consuming applications should treat those devices as offline as soon as the NDEATH arrives, without waiting for individual DDEATH messages.

The Interplay of DDEATH and NDEATH

In Sparkplug implementations, the relationship between DDEATH and NDEATH is deeply intertwined due to the hierarchical nature of edge nodes and devices. A DDEATH message is published by the edge node on behalf of a single device it can no longer reach, while the node itself stays online. An NDEATH message means the edge node itself is gone, which implicitly takes every device that relied on it for communication offline, since the node can no longer report on them. Understanding this interplay helps in anticipating potential data communication disruptions in the network.
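One way a consuming application might model this hierarchy is sketched below: a DDEATH affects a single device, whereas an NDEATH takes the edge node and every device under it offline. The `HostStateTracker` class and the identifiers are hypothetical, and the sketch looks only at topics, ignoring payloads entirely.

```python
from typing import Dict, Tuple

NodeKey = Tuple[str, str]          # (group_id, edge_node_id)
DeviceKey = Tuple[NodeKey, str]    # (node key, device_id)


class HostStateTracker:
    """Tracks online/offline state from Sparkplug BIRTH and DEATH topics."""

    def __init__(self) -> None:
        self.nodes: Dict[NodeKey, bool] = {}
        self.devices: Dict[DeviceKey, bool] = {}

    def handle(self, topic: str) -> None:
        parts = topic.split("/")
        group_id, msg_type, node_id = parts[1], parts[2], parts[3]
        node_key = (group_id, node_id)
        if msg_type == "NBIRTH":
            self.nodes[node_key] = True
        elif msg_type == "DBIRTH":
            self.devices[(node_key, parts[4])] = True
        elif msg_type == "DDEATH":
            self.devices[(node_key, parts[4])] = False
        elif msg_type == "NDEATH":
            self.nodes[node_key] = False
            # The node can no longer speak for its devices, so all go offline.
            for device_key in self.devices:
                if device_key[0] == node_key:
                    self.devices[device_key] = False


tracker = HostStateTracker()
for t in [
    "spBv1.0/Plant1/NBIRTH/Line4Gateway",
    "spBv1.0/Plant1/DBIRTH/Line4Gateway/Filler",
    "spBv1.0/Plant1/DBIRTH/Line4Gateway/Capper",
    "spBv1.0/Plant1/NDEATH/Line4Gateway",
]:
    tracker.handle(t)
print(tracker.devices)
```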

How DATA Messages Complement BIRTH and DEATH Messages

While BIRTH and DEATH messages offer crucial insights into the lifecycle of devices and edge nodes, it’s the DATA messages that provide the continuous stream of operational data. Let’s look at how DATA messages complement the BIRTH and DEATH messages for continuous session awareness.

Setting the Context with BIRTH: The BIRTH messages (DBIRTH for devices and NBIRTH for edge nodes) introduce an entity to the network, signaling its operational readiness and detailing its capabilities. Once this introduction is made, DATA messages (DDATA and NDATA) take over, providing a consistent data stream. In essence, BIRTH messages set the stage and context, and DATA messages fill it with ongoing performance.

Ensuring Continuity: DATA messages ensure that there’s an uninterrupted flow of information from devices and edge nodes. While BIRTH messages might be sent only once during a device’s or node’s session, DATA messages are frequent, ensuring the network is continuously updated.

Predicting and Reacting to DEATH: An abrupt cessation or irregularity in DATA messages can often be a precursor to DEATH messages. For instance, if a device that regularly transmits DDATA messages suddenly stops, it might signal potential issues, prompting network administrators or automated systems to anticipate a forthcoming DDEATH message.

Completing the Communication Cycle: While BIRTH messages initiate and DEATH messages conclude a session, DATA messages are the heartbeat in between. Together, these messages ensure a complete, transparent, and efficient communication cycle for devices and edge nodes in the Sparkplug network.
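The BIRTH, DATA, DEATH cycle can be pictured as a two-state machine. The sketch below is a simplified model under the assumption that the message type alone drives the session state. The `Session` enum and `next_state` function are names invented for this example.

```python
from enum import Enum


class Session(Enum):
    OFFLINE = "offline"
    ONLINE = "online"


def next_state(state: Session, message_type: str) -> Session:
    """Advance a session given a Sparkplug message type such as 'DBIRTH'."""
    kind = message_type[1:]  # drop the leading N or D: BIRTH, DATA, or DEATH
    if kind == "BIRTH":
        return Session.ONLINE
    if kind == "DEATH":
        return Session.OFFLINE
    if kind == "DATA":
        if state is Session.OFFLINE:
            # DATA has no context without a preceding BIRTH.
            raise ValueError(f"{message_type} received before a BIRTH message")
        return state
    raise ValueError(f"unsupported message type: {message_type}")


state = Session.OFFLINE
for message_type in ["DBIRTH", "DDATA", "DDATA", "DDEATH"]:
    state = next_state(state, message_type)
print(state)
```

Raising an error on out-of-context DATA keeps the sketch short. A real consumer might instead ask the edge node to resend its BIRTH messages.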

Automatic Retransmission of BIRTH Messages Upon Reconnection

The Rebirth Concept: The term “rebirth” in the Sparkplug context aptly describes the process where a device or edge node, after a disconnection, announces its return to the network. This is achieved through the automatic retransmission of BIRTH messages.

Restoring Network Context: When a device or node gets disconnected, its last-known state might become outdated or irrelevant. Upon reconnection, it’s vital for the entity to communicate its current state and capabilities, ensuring that the network context is restored. The automatic transmission of BIRTH messages facilitates this, providing an updated snapshot of the entity’s status.

Handling Network Fluctuations: In real-world scenarios, devices and nodes might experience frequent network fluctuations. The rebirth mechanism ensures that every reconnection is treated with a fresh context, minimizing the risk of working with stale or outdated data.

Complementing DEATH Messages: If the network was informed of an entity’s disconnection through DEATH messages or the LWT feature, the subsequent BIRTH message upon reconnection complements this, signaling the restoration of the session. This cycle ensures that the network is always updated about the state of its entities.
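Sparkplug pairs each NBIRTH with the NDEATH registered for the same connection through a birth/death sequence number (the `bdSeq` metric). The sketch below shows how a consumer might use that pairing to ignore a death certificate that belongs to an earlier session. The `EdgeNodeConnection` and `HostView` classes are hypothetical, and the wrap at 256 is a simplifying assumption of this example.

```python
from typing import Optional


class EdgeNodeConnection:
    """Hands out the bdSeq shared by the will NDEATH and NBIRTH of each connect."""

    def __init__(self) -> None:
        self._next_bd_seq = 0

    def connect(self) -> int:
        bd_seq = self._next_bd_seq
        self._next_bd_seq = (self._next_bd_seq + 1) % 256  # assumed 0-255 range
        return bd_seq


class HostView:
    """A consumer's view of one edge node's session."""

    def __init__(self) -> None:
        self.online = False
        self.current_bd_seq: Optional[int] = None

    def on_nbirth(self, bd_seq: int) -> None:
        self.online = True
        self.current_bd_seq = bd_seq

    def on_ndeath(self, bd_seq: int) -> bool:
        """Apply the NDEATH only if it matches the current session."""
        if bd_seq != self.current_bd_seq:
            return False  # stale death certificate from an earlier session
        self.online = False
        return True


node = EdgeNodeConnection()
host = HostView()
first = node.connect()      # first session
host.on_nbirth(first)
second = node.connect()     # quick reconnect after a network drop
host.on_nbirth(second)
applied = host.on_ndeath(first)  # late NDEATH from the first session
print(applied, host.online)
```

In this scenario the late NDEATH from the first session is ignored, so the host keeps treating the node as online after its rebirth.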

Conclusion

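In summary, Sparkplug’s BIRTH, DATA, and DEATH messages give every participant in an MQTT network a shared, current view of which edge nodes and devices are online. This state awareness makes coordinated operations possible and improves the reliability of IIoT networks.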

In Part 6 of this series, Breaking Down MQTT Sparkplug Payload Structures in IIoT Messaging, we’ll look at how devices can efficiently exchange information to bolster reliability and integrity.

If you are working on creating more efficient IIoT network communications to improve your operations and have questions about how MQTT Sparkplug can help, please contact us to discuss your project.

Kudzai Manditereza

Kudzai is a tech influencer and electronic engineer based in Germany. As a Developer Advocate at HiveMQ, he helps developers and architects adopt MQTT and HiveMQ for their IIoT projects. Kudzai runs a popular YouTube channel focused on IIoT and Smart Manufacturing technologies and he has been recognized as one of the Top 100 global influencers talking about Industry 4.0 online.
