Building a Robust MQTT Architecture and UNS for Scalability

by HiveMQ Team
29 min read

In today's fast-paced industrial environments, building a robust MQTT architecture is crucial for ensuring seamless communication across multiple sites. MQTT offers a lightweight and efficient messaging protocol that is vital for handling the extensive data flows typical of large-scale operations. 

The HiveMQ Community team hosted an event titled CONNACK, where Jean-Romain Bardet, Co-Founder at Scorp-io, gave a talk on ‘Exploring Real-World MQTT-Based IIoT Architectures.’ In this talk, Jean-Romain covered the challenges and solutions involved in scaling an MQTT architecture to dozens of industrial sites, drawing on real-world applications of Unified Namespace (UNS) and Kafka and on practical experience to illustrate key strategies for success.

UNS is like asking your kids to clean their room. Some kids will just throw all the toys into a big box, but some will sort each type of toy into its own small box so they can find them faster later. UNS is exactly the same. You can organize your data however you want, but you need to think about how you'll need to access it faster. So, clean your data based on your use case and the needs of the end-user. Otherwise, it's like cleaning someone else's room without knowing how they want to get their toys.

Jean-Romain Bardet, Co-Founder at Scorp-io

Exploring Real-World MQTT-Based IIoT Architectures

Transcript of the Video

Jean Romain Bardet: Hi guys. I'm Jean Romain from Scorp-io. I'm the CEO and co-founder of Scorp-io, and what we do at Scorp-io is pretty much monitoring and control in the cloud. So, today, I'm going to talk about building a robust MQTT architecture to scale across dozens of sites. We have taken a real use case from last year, and we used our own software, which is based on MQTT.

So, I'm going to try to help you understand what we did and what main issues we faced building this kind of architecture, because of course, our software is not across a dozen sites, it's across hundreds of sites. So we had a lot of issues at scale. So, I will try to give you as many tips as I can. Let's go. So, I am going to talk about a Belgian company, which is one of our main customers at the moment. They produce lime, and for people who don't know about lime, producing lime is quite simple, to be honest.

Producing lime is the result of heating limestone in a big oven like this one. It's quite simple to produce, but you need a lot of energy to do it. So they face many issues. They have hundreds of sites across 25 countries. They have small sites with hundreds of data points and big sites with thousands of data points. They have multiple PLCs, as you may know, and they have multiple on-premise software packages as well as different machines. They have a lot of stuff. These machines produce data using a lot of different protocols: MQTT, for example, but also Modbus, OPC UA, OPC DA, LoRaWAN, and other protocols they installed in the last decade.

They also use other software like ERP, but it's complicated to get data from it, and they mostly use REST APIs to do so. So the customer came with a simple demand: "We have 5,000 people in our company and we want to bring the right data to the right person at the right moment." That was the challenge. The main issue they faced was that their automation team spent like 50 percent of its time downloading data from the sites and sharing it with colleagues. So it was like 50 percent of their time wasted getting the data and putting it in the hands of the right guy.

So that was the main demand: to bring the data to the right person at the right moment. And to be honest, it's quite challenging because you have to face multiple challenges. The first one is the result of the architecture. It's not the architecture itself, of course, but getting a smooth user experience in the user interface at the end of the project, because you can have the best architecture possible, but if in the end the user experience is poor, the software won't be used. That's it. It's over.

So, for me, it's not the architecture itself, but more about user experience. And of course, if you have the best architecture possible, you will also have the best user experience possible. So that was the first challenge.

The second challenge was also about the real-time data. Vincent talked about it, but you know, the customers have to deal with real-time data and not lose any data. That's why you need a robust architecture. You don't want to lose any data at scale. I mean, for one edge device to the cloud, it's quite simple, but when you have hundreds of sites, it's more complicated. You have to deal with store and forward, you have to deal with scalability at historian level, you have to deal with scalability at data streaming level, and you have to deal with transforming your data. When you do all these functionalities, you don't want to lose any data. So you have to take care of sustainability and scalability. 

The fourth is maintenance and security. At this kind of big company, they don't want problems with security. So the first thing you put in place at the company is the best possible security, because the more edge devices you put in the company, the larger and larger the hole you're in gets if you don't take care of it from the start.

Maintenance, of course, I will talk about a bit later. But last but not least is people and training. When you develop software for a big company like this, you have to onboard people, because otherwise they get frustrated because they don't know how to use it.

People are key in this kind of project. And last, I guess it's our society. The industry is like our society now: they want it now. They want the best architecture possible with the best user experience possible, but "I want it now, and I want all my sites in one year, please." So, that's how it works and you have to deal with it. You have to take the best decisions possible to make it happen.

So you have to start by thinking about your edge devices. I mean, edge devices are the key to your architecture. And to build this kind of architecture, you have to think at a global level. That means you have to first think about security, of course, but the second is versioning. Versioning and security go together. And for this particular problem, you have different tech stacks. One I liked very much was balena.io. If you don't know balena.io, it's quite impressive to use, because you centralize your operating systems into one platform and you can push updates whenever you want.

And believe me, if you end up doing it one by one, you're going to waste a lot of time configuring your firewalls.

Okay. We have talked about security. We have talked about version management, which is not only about your operating system. It's also about your application, your edge application, the way you're going to get data from the PLC, for example, to your cloud platform.

You also need to take care of versioning and how you're going to deploy to hundreds of devices when you find something better, or you fix a bug, or you just fix a security issue. For this kind of use case, we use our own software, of course, but if you want to do it yourself with open source software, I think you should use FlowFuse. FlowFuse is the way you centralize your application, your edge application, into one platform, and you can deploy it at the same time to any of your devices.

The third challenge is UNS. So we talked a lot about it before. Here is an example of UNS we have implemented in this kind of use case. So you have, of course, the geographical division. Then there is also the machine, the time series sensors, and also the ERP information. So, for me, to be honest, UNS is like asking your kids to clean their room. Some kids will just throw all the toys into a big box, but some will sort each type of toy into its own small box so they can find them faster later. UNS is exactly the same. You can organize your data however you want, but you need to think about how you'll need to access it faster. So, clean your data based on your use case and the needs of the end-user. Otherwise, it's like cleaning someone else's room without knowing how they want to get their toys.

So we made it like this and it was quite based on the customer's needs. We studied a lot about what they needed. The customers have business knowledge. As a software editor, we don't have this kind of business knowledge. We just know how to build stuff, but we don't know how they are going to use it in a real use case.
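As a rough illustration of the hierarchy described above (geographical division, then machine, then sensor or ERP data), here is a minimal sketch of how such a UNS path could map onto MQTT topics. The level names, the example values, and the helper names `UnsPath`, `site_filter`, and `topic_matches` are assumptions made for illustration; the talk does not give the customer's actual topic layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnsPath:
    """One data point in a hypothetical UNS hierarchy (level names are assumed)."""
    country: str
    site: str
    machine: str
    source: str  # e.g. "sensors" for time series, "erp" for ERP information
    name: str

    def topic(self) -> str:
        parts = [self.country, self.site, self.machine, self.source, self.name]
        for part in parts:
            # "/", "+" and "#" are reserved in MQTT topic names and filters.
            if not part or any(ch in part for ch in "/+#"):
                raise ValueError(f"invalid topic level: {part!r}")
        return "/".join(parts)


def site_filter(country: str, site: str) -> str:
    """Subscription filter covering everything published under one site."""
    return f"{country}/{site}/#"


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT filter matching ('+' = one level, '#' = the rest)."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, level in enumerate(f_levels):
        if level == "#":
            return True
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


if __name__ == "__main__":
    point = UnsPath("be", "site-01", "kiln-2", "sensors", "temperature")
    print(point.topic())
    print(topic_matches(site_filter("be", "site-01"), point.topic()))
```

Putting the geography first means a site engineer can subscribe to one site with a single wildcard filter, which is the "sort the toys by how you will fetch them" idea from the talk.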

So UNS is very important because it's gonna make your data work together on the cloud. So for me, it's better to do it on the edge, of course. And UNS is not only about cleaning your data. There are two notions which are very important. It's sampling and hysteresis, because, believe me, you don't want to use useless data on the cloud. To get rid of useless data, you have to fix sampling and hysteresis in the best way possible. Otherwise, you're gonna be flooded with useless data and useless data is a mess at the end. I will talk a bit more about store and forward because it's also key when you talk about distributed architecture. 
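To make the sampling and hysteresis idea concrete, here is a minimal sketch of an edge-side filter. It assumes "sampling" means a minimum interval between published values and "hysteresis" means a deadband (a value is only published when it has moved by at least a threshold since the last published value). The class name `DeadbandFilter` and its parameters are hypothetical, not taken from the speaker's software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadbandFilter:
    """Decides whether a reading is worth sending to the cloud (illustrative)."""
    min_interval_s: float  # sampling: minimum time between published values
    deadband: float        # hysteresis: minimum change since last published value
    _last_time: Optional[float] = None
    _last_value: Optional[float] = None

    def should_publish(self, timestamp_s: float, value: float) -> bool:
        if self._last_time is None or self._last_value is None:
            # Always publish the first reading so the cloud has a baseline.
            self._last_time, self._last_value = timestamp_s, value
            return True
        if timestamp_s - self._last_time < self.min_interval_s:
            return False  # too soon after the last published value
        if abs(value - self._last_value) < self.deadband:
            return False  # change is smaller than the deadband
        self._last_time, self._last_value = timestamp_s, value
        return True


if __name__ == "__main__":
    f = DeadbandFilter(min_interval_s=1.0, deadband=0.5)
    readings = [(0.0, 20.0), (0.5, 25.0), (1.0, 20.2), (2.0, 20.6), (3.0, 20.7)]
    for t, v in readings:
        print(t, v, f.should_publish(t, v))
```

With these settings only two of the five readings are published: the first one, and the one that moved 0.6 away from the last published value.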

And the last one is monitoring. You need to monitor your edge device. Without your edge device, the top won't work. You have to monitor your edge device the same way you monitor the machine. Otherwise, it won't work at all. I guess that's it. 
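One simple way to monitor edge devices is a heartbeat watchdog: each device reports in periodically, and anything silent for too long is flagged. The sketch below assumes that approach. The talk does not say how Scorp-io monitors its devices, so the `EdgeWatchdog` class and its timeout are purely illustrative.

```python
from typing import Dict, List


class EdgeWatchdog:
    """Tracks the last heartbeat per edge device and flags silent ones."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, device_id: str, now_s: float) -> None:
        # Record the time at which this device last reported in.
        self._last_seen[device_id] = now_s

    def offline(self, now_s: float) -> List[str]:
        # A device is considered offline once it has been silent for longer
        # than the timeout.
        return sorted(
            device_id
            for device_id, seen in self._last_seen.items()
            if now_s - seen > self.timeout_s
        )


if __name__ == "__main__":
    watchdog = EdgeWatchdog(timeout_s=60.0)
    watchdog.heartbeat("edge-a", now_s=0.0)
    watchdog.heartbeat("edge-b", now_s=50.0)
    print(watchdog.offline(now_s=70.0))
```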

Let me explain with an example of an architecture we made for an edge device. 

Architecture diagram from the talk – Exploring Real-World MQTT-Based IIoT Architectures. Image source: Event Presentation | Image credit: Jean Romain Bardet

On the left, you have level-one edge protocols, which are pretty much classical. In the middle, you have a UNS that cleans the data. As I said, it cleans the data because there is a template for each machine you're going to connect. You make it in the cloud and you push it to your edge device, along with the sampling and hysteresis settings, which are very, very, very important, as I said before.

Some people say, MQTT is not good for store and forward technology. I mean, if your broker is not on the edge device, I don't get how you can do store and forward with MQTT.

So, for me, MQTT goes with Kafka.

Architecture diagram from the talk – Exploring Real-World MQTT-Based IIoT Architectures. Image source: Event Presentation | Image credit: Jean Romain Bardet

Kafka is kind of complicated, but it's very, very powerful. If your backend services are cities and MQTT is the road, then Kafka is a highway. You know, you have to pay for it, but you can go faster in the third lane and also slower in the first lane. So, I think they go together, to be honest. So, you have Kafka in the middle of the pub/sub. That means if you break this link, everything goes to Kafka and waits, then synchronizes when you get the link back. It does work well, because we did it. Across the hundreds of devices we have, we depend on the bandwidth and other stuff, and it works well. It was one of the best decisions we made, because we had tried things like SQL server or other databases on the edge device, and it's quite complicated to make that work. And I'm also an MQTT Sparkplug B maximalist. So, to publish the data to the broker, which is a scalable broker like HiveMQ, we use MQTT Sparkplug B.
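The store-and-forward behavior described above (messages wait while the link is down, then synchronize when it comes back) can be sketched in a few lines. This is a simplified stand-in, not the speaker's implementation: an in-memory queue plays the role that Kafka plays on the edge device, and the `send` callback is a hypothetical function that returns `False` when the uplink is unavailable.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Message = Tuple[str, bytes]


class StoreAndForward:
    """Buffers messages in order and forwards them once the link is up."""

    def __init__(self, send: Callable[[str, bytes], bool],
                 max_buffered: int = 10_000) -> None:
        self._send = send
        # In-memory only and bounded: when full, the oldest message is dropped.
        # A durable log on disk (the role Kafka plays in the talk) avoids that.
        self._buffer: Deque[Message] = deque(maxlen=max_buffered)

    def publish(self, topic: str, payload: bytes) -> None:
        # Always enqueue first so ordering is preserved across outages.
        self._buffer.append((topic, payload))
        self.flush()

    def flush(self) -> int:
        """Send buffered messages oldest-first; stop at the first failure."""
        sent = 0
        while self._buffer:
            topic, payload = self._buffer[0]
            if not self._send(topic, payload):
                break  # link is down, keep the message for the next attempt
            self._buffer.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._buffer)


if __name__ == "__main__":
    link = {"up": False}
    delivered = []

    def send(topic: str, payload: bytes) -> bool:
        if not link["up"]:
            return False
        delivered.append((topic, payload))
        return True

    buffer = StoreAndForward(send)
    buffer.publish("be/site-01/kiln-2/sensors/temperature", b"905.2")
    link["up"] = True
    buffer.publish("be/site-01/kiln-2/sensors/temperature", b"906.0")
    print(delivered, buffer.pending)
```

A message is removed from the buffer only after `send` reports success, so an outage in the middle of a flush does not drop or reorder data.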

I have a ton of stories about MQTT Sparkplug B, but I don't know if I have the time or not. Anyway, why MQTT Sparkplug B? Because, of course, the birth and death mechanism is very powerful. Also because MQTT is light on bandwidth, of course. And the single source of truth is very powerful, of course. But on top you have a scalable MQTT broker.
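For readers new to Sparkplug B, the birth and death mechanism shows up directly in the topic namespace, which follows the pattern `spBv1.0/<group_id>/<message_type>/<edge_node_id>[/<device_id>]`. The helper below is a small sketch that builds such topics. The function name `sparkplug_topic` and the example identifiers are assumptions, and payload encoding (Sparkplug B uses Protocol Buffers) is left out.

```python
from typing import Optional

# Node-level messages are addressed to an edge node; device-level messages
# additionally carry a device ID as the last topic level.
NODE_TYPES = {"NBIRTH", "NDEATH", "NDATA", "NCMD"}
DEVICE_TYPES = {"DBIRTH", "DDEATH", "DDATA", "DCMD"}


def sparkplug_topic(group_id: str, message_type: str, edge_node_id: str,
                    device_id: Optional[str] = None) -> str:
    """Build a Sparkplug B topic: spBv1.0/group/type/edge_node[/device]."""
    base = f"spBv1.0/{group_id}/{message_type}/{edge_node_id}"
    if message_type in NODE_TYPES:
        if device_id is not None:
            raise ValueError(f"{message_type} is node-level, no device_id allowed")
        return base
    if message_type in DEVICE_TYPES:
        if device_id is None:
            raise ValueError(f"{message_type} is device-level, device_id required")
        return f"{base}/{device_id}"
    raise ValueError(f"unknown Sparkplug message type: {message_type!r}")


if __name__ == "__main__":
    # Birth certificate published when the edge node comes online.
    print(sparkplug_topic("lime", "NBIRTH", "site-01"))
    # Death certificate; the edge node registers it as its MQTT Will message,
    # so the broker publishes it if the connection is lost unexpectedly.
    print(sparkplug_topic("lime", "NDEATH", "site-01"))
    # Regular data from one device behind the edge node.
    print(sparkplug_topic("lime", "DDATA", "site-01", "kiln-2"))
```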

Why did we make it scalable? Because when you build software like we built, you think about thousands of devices connected. We don't want to be capped at like 100 devices and that's it, and then have to create a new broker with a new address, etc., etc. So we went with a scalable broker.

And on top, you have, on the left, of course, the MQTT broker, and Kafka in the middle. Kafka is like the backbone of our cloud solution. That means you get data from each microservice with Kafka. So you get standard data from it. We also have tons of connectors which are not MQTT, obviously: it could be REST API connectors, SQL and JDBC connectors, and any connector which connects to Kafka.

Kafka is like a toolbox, so it's quite powerful. Every message goes through Kafka and on to each microservice. We have real-time data streaming microservices, historian and data services of course, and transform microservices, because even if you deal with clean data, you want to transform it sometimes: you want to add alarms, you want to transform data to build use cases on it. So you need a microservice to transform the data before sending it to the historian and to the real-time data streaming. For the scalable historian DB, of course, there are multiple technologies. It depends on the skills of the team. If your team is more about SQL, you can choose TimescaleDB, which is great. But if you prefer InfluxDB, you can go with it. It's up to your team. We're good at it because TimescaleDB is a French database provider, which is kind of great. So don't bother benchmarking too much, because these technologies are very good.
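As an example of what a transform step might do before data reaches the historian, here is a minimal sketch that enriches each record with an alarm state based on per-topic thresholds. The record shape (`topic`, `value`, `ts`), the `AlarmRule` type, and the threshold values are all hypothetical. In the architecture described, a function like this would sit between a Kafka consumer and a Kafka producer, which are omitted here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AlarmRule:
    """Optional high/low limits for one topic (values are illustrative)."""
    high: Optional[float] = None
    low: Optional[float] = None


def transform(record: dict, rules: Dict[str, AlarmRule]) -> dict:
    """Return a copy of the record with an 'alarm' field added."""
    enriched = dict(record)  # never mutate the incoming message
    rule = rules.get(record["topic"])
    alarm = None
    if rule is not None:
        if rule.high is not None and record["value"] > rule.high:
            alarm = "HIGH"
        elif rule.low is not None and record["value"] < rule.low:
            alarm = "LOW"
    enriched["alarm"] = alarm
    return enriched


if __name__ == "__main__":
    rules = {"be/site-01/kiln-2/sensors/temperature": AlarmRule(high=950.0, low=800.0)}
    reading = {"topic": "be/site-01/kiln-2/sensors/temperature", "value": 962.5, "ts": 1700000000}
    print(transform(reading, rules))
```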

So in this kind of architecture, you have to think: when a microservice goes down, how am I going to deal with the messages? If something goes wrong and you lose data, your customers are going to come to you and say, "I have no data during this time. What happened?" So, you are facing scalability issues and you have to work on them. So, to scale across thousands of sites, we use Kubernetes, which is a bit like Kafka: it's expensive at first, but very, very powerful. Basically, it makes your microservices resilient. That means, for example, if I take this microservice here, it scales with the load on it. So if there are a lot of people working on it, it's going to scale up until the load is handled. Then it's going to scale down once the load is gone. Even if it fails, another pod is going to pop up and it will work fine. This makes your architecture sustainable and scalable. Otherwise, you just have trouble with losing data. Kafka is very sustainable because if it doesn't deliver a message, the message stays in the queue and it's fine.
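The scale-up and scale-down behavior described above matches what the Kubernetes Horizontal Pod Autoscaler (HPA) does. The talk does not say which autoscaling mechanism Scorp-io uses, so treat the following as an illustration of the documented HPA rule, `desired = ceil(current_replicas * current_metric / target_metric)`, and not as their configuration. The function name, the replica bounds, and the metric values are assumptions.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Replica count following the HPA scaling rule, clamped to [min, max]."""
    ratio = current_metric / target_metric
    # Within the tolerance band the replica count is left alone, which
    # avoids constantly scaling up and down around the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Load is well above target (90% CPU against a 50% target): scale up.
    print(desired_replicas(2, current_metric=90.0, target_metric=50.0))
    # Load has dropped (20% against 50%): scale down.
    print(desired_replicas(4, current_metric=20.0, target_metric=50.0))
```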

The results: after eight months, we had deployed 10 sites, which was great, to be honest. After 12 months, 25 sites were deployed, and this year we aim to deploy all sites across the 25 countries where they operate. At the moment, we have 100 active users, which is a lot for a monitoring and control system. And I'm quite proud of it: when I check on the software, we see that there are like 10 or 20 people from the customer company working on it. It means it is working well, and we have very, very good feedback.

And there are two use cases here. The first one was monitoring air pollution. It's quite a simple use case. But after a few months, they made their own use case, which is the energy consumption of each cycle of the industrial ovens. I value this use case because you deal with millions of dollars' worth of capabilities and functionalities. Getting this use case is really, really great. What I mean is they started small and now they are thinking big, because they have the tool to do it. So when you provide the right tool with the right user experience, you can expect the right use case.

Thank you, Kudzai, for letting me talk, and thank you for letting me talk at this kind of event. Sorry for my English, it's not that good. I did my best, and if you want to talk to me, I would be proud to explain it better in French.

Conclusion

The journey to effectively scaling an MQTT architecture across various industrial locations involves navigating a complex landscape of technical challenges and strategic decisions. The insights provided here, based on a practical case study, offer valuable lessons on enhancing system robustness, improving data handling, and ensuring security at scale. As industrial needs continue to evolve, so too must the architectures that support them. We hope this exploration aids in your efforts to implement more resilient and efficient MQTT systems in your own operations.

HiveMQ Team

The HiveMQ team loves writing about MQTT, Sparkplug, Unified Namespace (UNS), Industrial IoT protocols, IoT Data Streaming, how to deploy our platform, and more. We focus on industries ranging from energy, to transportation and logistics, to automotive manufacturing. Our experts are here to help, contact us with any questions.
