
Building Industrial Digital Twins on AWS Using MQTT Sparkplug

by Kudzai Manditereza

A Digital Twin is a digital representation of a physical asset, process or system that helps industrial companies optimize production processes and prevent asset failures by predicting them in advance. Moreover, industrial companies want to use Digital Twins to improve operational planning by getting insights into what parts may need repair and being able to monitor an entire fleet of assets, simulate future scenarios and make comparisons to improve the overall efficiency of their equipment.

In this article, I will cover the fundamentals of Digital Twins, challenges in building them, how MQTT Sparkplug simplifies their creation and management, and then provide an architecture for Sparkplug integration into AWS Cloud. To provide context, I’m going to use an example of an energy company using Digital Twins to manage a fleet of Wind Turbines through virtualisation and generating prediction analytics.

Understanding Digital Twins

Due to the youthfulness of its application in industry, it’s easy to get bogged down by the debate on what a Digital Twin is and what it isn’t. However, when stripped of the finer details, a Digital Twin can be achieved by creating a digital model of a physical asset, using that model to create a digital instance or copy of the real physical asset, and then updating the digital copy with real-time information to virtually represent the current state of its physical counterpart. Taken a step further, the real-time and historical data of a digital twin can be used to build machine learning models that could help to accurately predict what will happen to the real-world physical asset of that digital twin.

It is against this backdrop that I will discuss how MQTT Sparkplug enables the development of a Digital Twin solution. But before we go any further, let us clarify what a physical asset is as it relates to Digital Twins. Using our example of a Wind Turbine, a physical asset could be a component of a Wind Turbine System, such as its gearbox. It could also be the whole Wind Turbine System itself, or a Wind Farm with multiple Wind Turbine Systems. For the purposes of this article, we will assume that our physical asset is the Wind Turbine System.

Figure: Turbine System Properties

The diagram above illustrates the elements of a physical Wind Turbine System. On it, you can see that a physical Wind Turbine System is composed of properties that can be classified as either static or dynamic. Static properties are those that contain information about the Wind Turbine that never or rarely changes; for example, the firmware version might get updated once in months or years.

On the other hand, dynamic properties are those that contain Wind Turbine information that is constantly being generated as telemetry data, sometimes with a resolution as small as a few milliseconds. Further, dynamic properties would typically have their own nested static properties as well, for example, the Turbine Speed parameter could have properties such as units of measurement (RPM), and high_speed and low_speed thresholds.

So to build a Digital Twin for a Wind Turbine System, you should first create a digital model, or in simple terms a data structure, that represents both the static and dynamic properties of the real physical Wind Turbine in an abstract manner. As both Wind Turbine 1 and Wind Turbine 2 on our Wind Farm are of the same type, you would then use the same resulting digital model to create concrete representations of them and start updating their properties with real sensor data from their physical counterparts.
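To make the model-versus-instance distinction concrete, here is a minimal Python sketch. It is only an illustration: the class names, the serial numbers, the firmware version string and the threshold values are all assumptions chosen for the example, not part of any real turbine or platform.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DynamicProperty:
    """A telemetry value together with its own nested static metadata."""
    units: str
    low_threshold: float
    high_threshold: float
    value: float = 0.0


@dataclass
class WindTurbine:
    """Digital model: the abstract shape shared by every turbine of this type."""
    # Static properties: never or rarely change.
    serial_number: str
    firmware_version: str
    # Dynamic properties: continuously updated from telemetry.
    dynamic: Dict[str, DynamicProperty] = field(default_factory=dict)

    def update(self, name: str, value: float) -> None:
        """Apply a new sensor reading to one dynamic property."""
        self.dynamic[name].value = value


def new_turbine(serial_number: str, firmware_version: str) -> WindTurbine:
    """Create one Digital Twin instance from the shared model."""
    return WindTurbine(
        serial_number=serial_number,
        firmware_version=firmware_version,
        dynamic={
            # Hypothetical thresholds, for illustration only.
            "turbine_speed": DynamicProperty(
                units="RPM", low_threshold=5.0, high_threshold=20.0
            ),
        },
    )


# Two twins built from the same model, updated independently.
turbine_1 = new_turbine("WT-0001", "1.4.2")
turbine_2 = new_turbine("WT-0002", "1.4.2")
turbine_1.update("turbine_speed", 14.5)
```

Both instances share one structure, so any consumer that understands the model can read either twin without turbine-specific logic.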

Challenges in Building Digital Twins

More often than not, control units for things such as Wind Turbines and other industrial assets are centred around traditional control devices like Programmable Logic Controllers (PLCs) and Remote Terminal Units (RTUs). As you may be aware, a data model on such device platforms is represented as a simple Tag, i.e. a variable name and its value. And this won’t be sufficient to represent a complex asset such as a Wind Turbine in its entirety. The closest to complex system data modelling capabilities on Control System environments are what are called User Defined Types (UDTs), but they are not useful outside of the vendor’s development environment.

What is required is a platform for creating the digital models and corresponding Digital Twin instances, ideally at the source of the data, in such a manner that real-time data about the Digital Twins can be exchanged and contextually understood across different platforms, both in the OT and IT domains, as that would allow for the virtualisation and analysis of the physical asset’s behaviour in its wholeness.

You could try to create the digital model and the twins on an IT platform, send the data from OT as discrete tags, and map it to the Digital Twin instances. But you would be effectively sending the data modelling challenge down the line into the hands of IT personnel, when it could be dealt with by OT personnel such as Industrial Automation Engineers. All you want to do in the IT/Cloud domain is to get the data, visualise it, and build machine learning models out of it to start generating prediction analytics for your Digital Twins.

But that’s just part of the story. Today’s control systems also present a challenge to the creation of Digital Twins because they consist of components that expose their data through a wide range of connectivity interfaces, from register-based technologies such as Modbus, Profibus and DeviceNet to service- and middleware-oriented ones like OPC UA and MQTT. This necessitates the inclusion of a gateway that acts as a data concentrator, gathering discrete tags from the various data sources in order to build up a digital model of a physical asset, in our case that of a Wind Turbine System.

Moreover, there are other challenges such as:

  • How to build a Digital Twin that is highly responsive to changes in the physical domain and vice-versa?

  • How to build a system that is always aware of the current state of the physical asset without having to continuously interrogate it?

  • How to automatically discover new Wind Turbine models and their Digital Twins as they get plugged into the system?

  • And last but certainly not least, how do you ensure that your Digital Twin solution keeps scaling as you include more and more Wind Turbines?

Using MQTT Sparkplug to Build Digital Twins

In this section, I’ll discuss how MQTT Sparkplug helps you build a Digital Twin solution by overcoming the stated challenges.

MQTT Sparkplug is an interoperability protocol specification that gives MQTT clients a framework to seamlessly integrate data from industrial sources within the MQTT infrastructure in a bi-directional and interoperable way. Key to building Industrial Digital Twins is its definition of an MQTT Topic Namespace, a Payload Representation and Session State Management.

Even better, a Sparkplug solution is built around an event-based, publish-subscribe architectural model that uses Report-By-Exception for communication, meaning that your Digital Twin instances get updated only when a change in the dynamic properties is detected. Firstly, this saves computational and network resources such as CPU, memory, power and bandwidth. Secondly, it results in a highly responsive system in which anomalies picked up by the analytics system can be addressed in real time.
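Report-By-Exception can be sketched in a few lines of Python. This is a hedged illustration rather than how any particular Sparkplug client implements it: the `ReportByException` class is hypothetical, and the deadband (a minimum change required before publishing) is an assumption added for the example, since the text above only says that updates are sent when a change is detected.

```python
from typing import Callable, Dict, List, Tuple


class ReportByException:
    """Forward a reading only when it differs from the last published one."""

    def __init__(self, publish: Callable[[str, float], None], deadband: float = 0.0):
        self._publish = publish
        self._deadband = deadband  # assumption: ignore changes this small or smaller
        self._last_sent: Dict[str, float] = {}

    def sample(self, name: str, value: float) -> bool:
        """Return True if the reading was published, False if it was suppressed."""
        last = self._last_sent.get(name)
        if last is not None and abs(value - last) <= self._deadband:
            return False
        self._last_sent[name] = value
        self._publish(name, value)
        return True


# Stand-in for an MQTT publish call: collect what would have been sent.
published: List[Tuple[str, float]] = []
rbe = ReportByException(lambda n, v: published.append((n, v)), deadband=0.5)

for reading in [12.0, 12.1, 12.3, 13.0, 13.0, 11.9]:
    rbe.sample("turbine_speed", reading)

# Six samples were taken, but only three crossed the deadband and were published.
```

With the deadband set to 0.0, the class reduces to plain change detection: identical consecutive readings are dropped and everything else is sent.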

Further, due to the underlying MQTT infrastructure, a Sparkplug-based Digital Twin solution can scale to support millions of physical assets, which means that you can keep adding more assets with no disruptions. What’s more, MQTT Sparkplug’s definition of MQTT Session State Management ensures that your Digital Twin solution is always aware of the status of all your physical assets at any given time.
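As a rough sketch of how a consumer might use session state, the snippet below tracks which edge nodes are online from Sparkplug birth and death messages. In Sparkplug, an edge node announces itself with an NBIRTH message and registers an NDEATH message as its MQTT Last Will, which the broker delivers if the connection is lost. The `SessionStateTracker` class and the node name `Gateway1` are hypothetical, and sequence-number validation is deliberately left out.

```python
from typing import Dict


class SessionStateTracker:
    """Track edge node availability from Sparkplug NBIRTH/NDEATH messages."""

    def __init__(self) -> None:
        self._online: Dict[str, bool] = {}

    def on_message(self, message_type: str, edge_node_id: str) -> None:
        if message_type == "NBIRTH":
            # The node has (re)connected and published its full state.
            self._online[edge_node_id] = True
        elif message_type == "NDEATH":
            # Delivered by the broker as the node's Last Will, or sent on a clean shutdown.
            self._online[edge_node_id] = False
        # Data and command messages do not change the session state here.

    def is_online(self, edge_node_id: str) -> bool:
        # Nodes never seen before are treated as offline.
        return self._online.get(edge_node_id, False)


tracker = SessionStateTracker()
tracker.on_message("NBIRTH", "Gateway1")
tracker.on_message("NDATA", "Gateway1")
online_after_birth = tracker.is_online("Gateway1")
tracker.on_message("NDEATH", "Gateway1")
online_after_death = tracker.is_online("Gateway1")
```

Because the broker pushes the death message, the consumer learns about a lost connection without polling the asset.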

A typical Sparkplug network is shown below.

Figure: SCADA IIoT Host Sparkplug Enabled Chart

Because Sparkplug defines a payload format that may contain one or more metrics that have key-value pairs of data, it can be used for holding properties of a digital model of our Wind Turbine. Further, Sparkplug allows for each metric to include properties associated with it, and these can be used to include key-value pairs for representing a metric’s metadata, such as units of measurement for the Turbine Speed property.
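The shape of such a payload can be illustrated with plain Python dictionaries. Note the assumptions: Sparkplug B payloads are encoded with Protocol Buffers on the wire, so the JSON printed here is only a readable stand-in, and the helper functions, metric names and values are hypothetical.

```python
import json
from typing import Any, Dict, List, Optional


def build_metric(name: str, datatype: str, value: Any,
                 properties: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """One metric: a named, typed value with optional metadata properties."""
    metric: Dict[str, Any] = {"name": name, "datatype": datatype, "value": value}
    if properties:
        metric["properties"] = properties
    return metric


def build_payload(metrics: List[Dict[str, Any]], seq: int,
                  timestamp_ms: int) -> Dict[str, Any]:
    """A payload: a timestamp, a sequence number and a list of metrics."""
    return {"timestamp": timestamp_ms, "seq": seq, "metrics": metrics}


payload = build_payload(
    metrics=[
        # Static property of the turbine.
        build_metric("firmware_version", "String", "1.4.2"),
        # Dynamic property carrying its own metadata.
        build_metric(
            "turbine_speed", "Float", 14.5,
            properties={"units": "RPM", "low_speed": 5.0, "high_speed": 20.0},
        ),
    ],
    seq=0,
    timestamp_ms=1700000000000,  # fixed example timestamp
)

print(json.dumps(payload, indent=2))
```

The static and dynamic properties of the Wind Turbine both travel as metrics, while each metric's metadata rides along in its property set.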

But the good news is, your engineering experience with Sparkplug will rarely involve dealing with the underlying implementation details. Rather, Sparkplug allows you to use configuration tools on platforms to build digital models based on its MQTT payload definition. For example, in Inductive Automation’s Ignition platform, you can use the graphical user interface and native User Defined Types (UDTs) to build your digital model, and the platform takes care of mapping the UDT model to your Sparkplug MQTT payload structure. Below is an example of what our Wind Turbine digital model definition would look like on the Ignition platform.

Figure: Wind Turbine Digital Model on the Ignition platform

And it doesn’t end there. Once you have created your Wind Turbine digital model on a platform like Ignition, you can then go on to configure instances of your UDTs for each physically existing Wind Turbine in operation, which essentially become the Digital Twins that you can start updating with real-time data.

What’s more, Sparkplug allows you to broadcast your digital model and its corresponding digital twins from the source of the data at the edge to your MQTT network, so that other applications connected to the same Sparkplug-compliant MQTT broker, such as the HiveMQ Broker, can automatically discover their existence and where they originate within your infrastructure based on the Topic Namespace.

Simply stated, your Digital Twin consumer application goes from not knowing anything about a Wind Turbine, to receiving its abstract digital representation, followed by concrete implementations of all Wind Turbines based on that abstract representation, as soon as they are configured at the edge.
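Discovery through the Topic Namespace can be sketched as follows. Sparkplug B topics take the form `spBv1.0/<group_id>/<message_type>/<edge_node_id>[/<device_id>]`; the parser below assumes that layout and ignores host application STATE topics, which are structured differently. The group, node and device names are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Set, Tuple


class SparkplugTopic(NamedTuple):
    namespace: str
    group_id: str
    message_type: str
    edge_node_id: str
    device_id: Optional[str]


def parse_topic(topic: str) -> SparkplugTopic:
    """Split a Sparkplug B topic into its namespace elements."""
    parts = topic.split("/")
    if len(parts) not in (4, 5) or parts[0] != "spBv1.0":
        raise ValueError(f"not a Sparkplug B node or device topic: {topic!r}")
    device_id = parts[4] if len(parts) == 5 else None
    return SparkplugTopic(parts[0], parts[1], parts[2], parts[3], device_id)


# (group_id, edge_node_id) -> device ids announced under that edge node
discovered: Dict[Tuple[str, str], Set[str]] = {}


def on_birth(topic: str) -> None:
    """Register edge nodes and devices as their birth messages arrive."""
    t = parse_topic(topic)
    key = (t.group_id, t.edge_node_id)
    if t.message_type == "NBIRTH":
        discovered.setdefault(key, set())
    elif t.message_type == "DBIRTH" and t.device_id is not None:
        discovered.setdefault(key, set()).add(t.device_id)


on_birth("spBv1.0/WindFarm1/NBIRTH/Gateway1")
on_birth("spBv1.0/WindFarm1/DBIRTH/Gateway1/WindTurbine1")
on_birth("spBv1.0/WindFarm1/DBIRTH/Gateway1/WindTurbine2")
```

A consumer subscribed to `spBv1.0/#` needs no prior configuration: the topic alone tells it which group and edge node each new twin belongs to.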

Now the question is: how do you feed the Digital Twins with a stream of up-to-date information from the downstream legacy devices that produce it? The answer is that Sparkplug allows a broad class of devices to participate in its network, primarily through what is termed an MQTT Edge of Network (EoN) node. An EoN node may be a physical gateway that enables you to poll legacy devices that expose register-based data using protocols like Modbus, or it may be an MQTT-enabled device or sensor directly attached to your Wind Turbine.
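The gateway's core job, turning raw register values into named properties, might look something like the sketch below. Everything here is an assumption made for illustration: the register addresses, the divisors and the `gearbox_temperature` metric are invented, and the actual Modbus read is replaced by a plain dictionary so the example runs without hardware.

```python
from typing import Dict, NamedTuple


class RegisterMapping(NamedTuple):
    metric_name: str
    divisor: float  # raw integer register value / divisor = engineering value


# Hypothetical register map for one Wind Turbine controller.
REGISTER_MAP: Dict[int, RegisterMapping] = {
    40001: RegisterMapping("turbine_speed", 10.0),        # tenths of an RPM
    40002: RegisterMapping("gearbox_temperature", 10.0),  # tenths of a degree
}


def registers_to_metrics(raw: Dict[int, int]) -> Dict[str, float]:
    """Map polled register values onto named, scaled model properties."""
    return {
        mapping.metric_name: raw[address] / mapping.divisor
        for address, mapping in REGISTER_MAP.items()
        if address in raw
    }


# Stand-in for the result of one poll cycle against the legacy device.
polled = {40001: 145, 40002: 612}
metrics = registers_to_metrics(polled)
```

The resulting dictionary is what an EoN node would then compare against its last published values and send on as Sparkplug metrics.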

An example of a Sparkplug EoN is a device such as a Groov EPIC controller which, by its very nature, can collect and publish data to a Sparkplug infrastructure. But more important is the fact that it embeds an Ignition platform that can collect data using its multitude of industrial device protocol drivers and map the discrete data onto its UDT elements to update the Digital Twins. Even better, there are industrial DataOps oriented tools like the HighByte Intelligence Hub that you can use to create digital models, connect to legacy and modern industrial devices or databases to further enrich your models, and then publish instances of your models to an MQTT broker using Sparkplug.

AWS Integration of MQTT Sparkplug Solution

Best of all, to derive the value promised by the Digital Twin concept through virtualisation, simulation and predictive analytics, Sparkplug is compatible with AWS IoT SiteWise, which serves as a gateway to a plethora of visualisation and analytics tools on the AWS Cloud platform. This can be achieved by using Cirrus Link’s IoT Bridge for SiteWise to publish Sparkplug digital models and digital twins to the AWS IoT SiteWise platform.

Figure: AWS Cloud Digital Twins

Without Sparkplug, to create digital models and corresponding digital twins on AWS IoT SiteWise, you’d first have to manually create a digital model and then, if you have, say, 1000 Wind Turbines, manually create 1000 instances of it. With Sparkplug, your digital models and twins automatically appear in SiteWise as they are connected at the edge. This significantly reduces integration time on the Cloud/IT end and lets IT staff concentrate on what they know best: managing your Digital Twins and bringing their value proposition to life.
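To give a feel for the manual effort being avoided, the sketch below builds the kind of request bodies you would otherwise have to prepare yourself. It only constructs dictionaries and makes no AWS calls. The field names follow my reading of the boto3 `iotsitewise` client's `create_asset_model` and `create_asset` parameters and should be checked against the AWS documentation before use; the model name, property names and the `example-model-id` placeholder are hypothetical.

```python
from typing import Any, Dict, List


def asset_model_request(model_name: str) -> Dict[str, Any]:
    """Keyword arguments shaped for a SiteWise create_asset_model call (assumed shape)."""
    return {
        "assetModelName": model_name,
        "assetModelProperties": [
            {
                # Static property, modelled as an attribute.
                "name": "firmware_version",
                "dataType": "STRING",
                "type": {"attribute": {}},
            },
            {
                # Dynamic property, modelled as a measurement fed by telemetry.
                "name": "turbine_speed",
                "dataType": "DOUBLE",
                "unit": "RPM",
                "type": {"measurement": {}},
            },
        ],
    }


def asset_requests(model_id: str, count: int) -> List[Dict[str, str]]:
    """One create_asset request per physical turbine (assumed shape)."""
    return [
        {"assetName": f"Wind Turbine {i}", "assetModelId": model_id}
        for i in range(1, count + 1)
    ]


model_request = asset_model_request("Wind Turbine")
# "example-model-id" stands in for the id that creating the model would return.
turbine_requests = asset_requests("example-model-id", 1000)
```

Each of those 1000 requests would also need its measurements wired to the right data streams, which is the repetitive work the bridge takes over.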


Conclusion

MQTT Sparkplug simplifies the creation of industrial digital twins by solving the core challenges involved, which are Data Modelling, Connectivity and Semantic Interoperability. Moreover, it helps introduce factors that are crucial to consider for a Digital Twin ecosystem such as Efficiency, Ease of Integration, Auto-Discovery and Scalability. For a detailed tutorial on how MQTT Sparkplug works, you can check out the MQTT Sparkplug Essentials article series.


Kudzai Manditereza

Kudzai is a tech influencer and electronic engineer based in Germany. As a Developer Advocate at HiveMQ, he helps developers and architects adopt MQTT and HiveMQ for their IIoT projects. Kudzai runs a popular YouTube channel focused on IIoT and Smart Manufacturing technologies and he has been recognized as one of the Top 100 global influencers talking about Industry 4.0 online.
