Mitigating Downtime in Enterprise IoT When Writing to a Database

by Ryan Dussiaume

Enterprise IoT applications rely on a steady flow of data from devices to backend systems, but infrastructure doesn't always cooperate. Hardware failures, software issues, and network congestion can all interrupt the data's path to long-term storage.

What Makes HiveMQ's Database Extensions So Reliable?

The HiveMQ Database Extensions were built with these challenges in mind. By leveraging the Quality of Service (QoS), persistent sessions and queuing features of the MQTT protocol, along with the replication and extension framework features of HiveMQ's masterless cluster architecture, these extensions preserve messages for later delivery when the database or a HiveMQ cluster node goes offline. 

This blog post describes the features of the HiveMQ broker, its extension framework, and the database extensions themselves, and shows how those features work together to prevent message loss when either the database or a HiveMQ cluster node becomes unavailable.

At the time of writing, the capabilities described in this blog post are available in the following database extensions:

Also note that these features are only used if the messages have been sent to the broker with a QoS level of at least 1. QoS 0 messages are not stored for later delivery, in keeping with the definition of QoS 0 as "at most once" delivery (also known as "fire and forget").
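The QoS rule above can be written as a one-line check. This is only an illustrative sketch: the function name `is_stored_for_later_delivery` is hypothetical and is not part of any HiveMQ API. It relies solely on the MQTT definition of QoS levels 0, 1, and 2.

```python
def is_stored_for_later_delivery(qos: int) -> bool:
    """Hypothetical helper: True if a message at this QoS level is preserved.

    MQTT defines QoS 0 (at most once), 1 (at least once), and 2 (exactly once).
    Only QoS 1 and 2 messages are stored for later delivery.
    """
    if qos not in (0, 1, 2):
        raise ValueError(f"invalid MQTT QoS level: {qos}")
    return qos >= 1
```

A device that publishes telemetry with QoS 0 would therefore not benefit from the protections described in the rest of this post.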

During normal operations, HiveMQ database extensions process messages that are typically published to the cluster by IoT devices. The extension framework distributes messages to the database extension instances on each cluster node for processing, ensuring that each message is processed only once. The extension instance then writes each of these messages to a database.

This process includes the following features in support of reliability and scalability:

Batching

The extension receives individual messages from the extension framework, but rather than writing each one immediately, it holds them until it has collected a certain number of messages or a certain amount of time has passed. The extension then processes all of the messages it has collected at once. This makes the extension more efficient because it performs a single database write for multiple messages, reducing the overall processing time.
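The size-or-time batching described above can be sketched in a few lines. This is a generic illustration, not the extension's actual code: the class name `Batcher`, its parameters, and the injected `write_batch` callback are all assumptions made for the example.

```python
import time
from typing import Callable, List, Optional


class Batcher:
    """Hypothetical batcher: flushes on a size threshold or an age threshold."""

    def __init__(self, write_batch: Callable[[List[str]], None],
                 max_size: int = 3, max_age_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._write_batch = write_batch   # stands in for one database write
        self._max_size = max_size
        self._max_age_s = max_age_s
        self._clock = clock
        self._buffer: List[str] = []
        self._first_at: Optional[float] = None  # arrival time of oldest message

    def add(self, message: str) -> None:
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(message)
        if len(self._buffer) >= self._max_size:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes a partial batch that has waited too long."""
        if self._buffer and self._clock() - self._first_at >= self._max_age_s:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._first_at = None
        self._write_batch(batch)   # one write for the whole batch
```

With `max_size=3`, three calls to `add` produce one call to `write_batch`, while a lone message is written once `tick` sees that it has waited `max_age_s` seconds.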

In-Flight Messages

When the extension framework distributes a message to an extension for processing, the framework protects that message against a failure in the extension by storing it as an "in-flight" message on the broker. In-flight messages are also copied to other nodes in the cluster in a process called replication, which protects these messages against node failure. When the database extension successfully writes messages to the database, it notifies the extension framework, which in turn triggers deletion of the in-flight messages on all nodes.
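The in-flight lifecycle (store, replicate, delete on success) can be modeled with a toy cluster. The names `Node`, `Cluster`, `dispatch`, and `acknowledge` are hypothetical, and the sketch assumes every node holds a replica, which simplifies whatever replication strategy the broker actually uses.

```python
from typing import Dict, List


class Node:
    """Hypothetical cluster node holding its copy of the in-flight messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.in_flight: Dict[str, str] = {}   # message id -> payload


class Cluster:
    """Toy model: assumes each in-flight message is replicated to every node."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def dispatch(self, msg_id: str, payload: str) -> None:
        # Record the message as in-flight on all nodes before it is processed.
        for node in self.nodes:
            node.in_flight[msg_id] = payload

    def acknowledge(self, msg_ids: List[str]) -> None:
        # Called after a successful database write: delete on all nodes.
        for node in self.nodes:
            for msg_id in msg_ids:
                node.in_flight.pop(msg_id, None)
```

Until `acknowledge` is called, every node in this model still holds the message, so losing any single node does not lose the data.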

Queued Messages

If a message can't be processed immediately by one of the extension instances, it will be queued for later processing. Queued messages are also replicated in the cluster. In this manner, messages are protected against loss when the database extensions are overloaded, or when they are blocked from processing additional messages, such as when the database is unreachable. If a HiveMQ cluster node leaves the cluster, and the extension instance that was processing the message was on that node, the extension framework will send the in-flight messages for that extension instance back into the queue for eventual processing by an extension instance on another node.
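The hand-off between the queue and the in-flight store when a node leaves can be sketched as follows. `Dispatcher` and its methods are hypothetical names, and returning the messages to the front of the queue is an assumption made here to preserve ordering; the post only states that they go back into the queue.

```python
from collections import deque
from typing import Deque, Dict, List


class Dispatcher:
    """Hypothetical model of queued and in-flight messages per node."""

    def __init__(self) -> None:
        self.queue: Deque[str] = deque()
        self.in_flight: Dict[str, List[str]] = {}   # node name -> messages

    def enqueue(self, message: str) -> None:
        self.queue.append(message)

    def assign(self, node: str, count: int) -> List[str]:
        """Move up to `count` messages from the queue to a node's in-flight set."""
        batch = [self.queue.popleft() for _ in range(min(count, len(self.queue)))]
        self.in_flight.setdefault(node, []).extend(batch)
        return batch

    def node_left(self, node: str) -> None:
        """Return the departed node's in-flight messages to the queue.

        Assumption: they are placed at the front, in their original order.
        """
        lost = self.in_flight.pop(node, [])
        self.queue.extendleft(reversed(lost))
```

After `node_left("node-a")`, an extension instance on another node can pick up the same messages with `assign`.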

Retry Mechanism

If a database extension can't write to the database because the database has become unavailable, it retries repeatedly until it can re-establish the connection and write its current batch. During this time, the batched messages remain securely stored as in-flight messages on the broker because the extension does not notify the broker that it has processed the messages until the database write has occurred.
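The key property of the retry mechanism is the ordering: acknowledge only after the write succeeds. A minimal sketch of that ordering follows. The function `write_with_retry`, the injected `write` and `acknowledge` callbacks, and the fixed delay between attempts are all assumptions for illustration; the post does not describe the extension's actual retry timing.

```python
import time
from typing import Callable, List


def write_with_retry(batch: List[str],
                     write: Callable[[List[str]], None],
                     acknowledge: Callable[[List[str]], None],
                     delay_s: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Hypothetical retry loop; returns the number of write attempts made."""
    attempts = 0
    while True:
        attempts += 1
        try:
            write(batch)
        except ConnectionError:
            # Database still unreachable: the batch stays in-flight, try again.
            sleep(delay_s)
            continue
        # Only now is the broker told that the batch has been processed.
        acknowledge(batch)
        return attempts
```

Because `acknowledge` is never reached while `write` keeps failing, the in-flight copies on the broker are not deleted during the outage.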

[Figure: HiveMQ cluster and database extension writing IoT device data to a database]

How HiveMQ Prevents Data Loss During Outage Scenarios

The following table summarizes how these features are used together to prevent message loss during downtime events.

| Downtime Event | Unreachable Database | Unreachable Cluster Node |
|---|---|---|
| Potential Cause(s) | Hardware failures, network congestion, network or security configuration issues, software configuration issues, underprovisioned hardware, software errors, etc. | Hardware failures, network congestion, network or security configuration issues, software configuration issues, underprovisioned hardware, software errors, etc. |
| Actions (During downtime) | 1. Retry mechanism starts on the extension | 1. Replicated in-flight messages for the current batch of the unreachable extension instance are sent to the queue |
| | 2. In-flight messages remain stored and replicated in the broker cluster | 2. Other cluster nodes process the queued messages if there is the capacity to do so |
| | 3. New messages are queued and the queue is replicated in the broker cluster | 3. If there isn't enough capacity, the queue will start to build up |
| Actions (After downtime) | 1. Final retry succeeds | 1. Node returns to cluster |
| | 2. Extension notifies broker of success | 2. Starts processing either new or queued messages |
| | 3. In-flight messages are deleted | 3. Any queue buildup is drawn down in a first-in, first-out manner |
| | 4. Extension batches new messages received from the queue | 4. If the extension instances are processing the queue, new messages will be queued |
| | 5. Batched messages are stored and replicated as in-flight messages on the broker | |
| | 6. Extension writes new batch to the database, etc. | |
| Result | No message loss* | No message loss* |

* Note that message loss can occur if the queue has insufficient storage capacity or reaches its configured limit of messages.
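The caveat about queue limits can be illustrated with a bounded queue. The class `BoundedQueue` is hypothetical, and rejecting the newest message when the queue is full is an assumption made for this sketch; the post does not say which messages are lost once the limit is reached.

```python
from collections import deque
from typing import Deque


class BoundedQueue:
    """Hypothetical queue with a configured message limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.items: Deque[str] = deque()
        self.dropped = 0   # messages lost because the queue was full

    def offer(self, message: str) -> bool:
        """Queue the message, or count it as lost if the limit is reached."""
        if len(self.items) >= self.limit:
            self.dropped += 1   # assumption: the incoming message is rejected
            return False
        self.items.append(message)
        return True
```

In this model, an outage that lasts long enough for the backlog to exceed `limit` starts losing messages, which is why queue capacity should be sized for the longest downtime you expect to tolerate.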

Conclusion

This blog post described how HiveMQ's database extensions are engineered to provide robust and reliable IoT data storage. By combining the MQTT protocol's QoS, persistent sessions, and queuing with HiveMQ's masterless cluster architecture, these extensions prevent message loss in the face of infrastructure challenges like unreachable databases or cluster nodes. Features such as batching, in-flight message protection, queued message handling, and a resilient retry mechanism work in concert so that critical IoT data is preserved and eventually delivered, provided the queue has sufficient capacity, making HiveMQ a dependable solution for enterprise IoT applications. Contact us to learn more about HiveMQ and the Enterprise Database Extensions.

Ryan Dussiaume

Ryan Dussiaume, a Solutions Engineer at HiveMQ, combines his software development expertise with a passion for staying at the forefront of technology. Proficient in IIoT, MQTT, and UNS, Ryan is dedicated to guiding companies on their Industry 4.0 transformation, leveraging his experience with middleware and cloud native technologies to benefit individuals, teams, and businesses.
