Identifying, Acquiring and Integrating Plant-Floor Data for Smart Manufacturing

by Kudzai Manditereza

The success of modern manufacturing enterprises relies heavily on the ability to collect, analyze, and act upon data. It is, therefore, crucial to pinpoint potential data sources and determine the most efficient methods for acquiring data from predominantly legacy systems to achieve desired outcomes. Ensuring a cost-effective, scalable, and replicable solution is essential. Additionally, it is vital to aggregate the collected data to a level where it can be seamlessly integrated with external enterprise systems.

In manufacturing environments, a diverse range of data is generated from various sources on the plant floor, each carrying unique importance and objectives. To effectively manage this data, it is crucial to initially identify the available information and determine the appropriate methods for accessing it.

This article is the second installment in a six-part series titled A Comprehensive Guide To Industrial Data Management for Smart Manufacturing, which discusses a practical approach to help you begin implementing data management for smart manufacturing. In Part 1 of this series, The Power of Data Management in Driving Smart Manufacturing Success, we explored how to establish a well-thought-out strategy for harnessing the power of data in smart manufacturing.

Identifying Potential Data Sources for Smart Manufacturing

Building upon this understanding, we can start identifying potential data sources for smart manufacturing implementation by examining the Computer Integrated Manufacturing (CIM) pyramid.

This reference model, developed in the 1990s, provides a framework for implementing industrial automation. It focuses on collecting, coordinating, sharing, and transmitting data and information between various systems and sub-systems through software applications and communication networks.

[Figure: Reference Model for Implementing Industrial Automation]

The following systems are possible data sources for smart manufacturing implementation, for the reasons explained below:

  • Programmable Logic Controllers (PLCs) at Level 2

  • Supervisory Control and Data Acquisition Systems (SCADA) at Level 3

  • Historians at Level 3

Firstly, sensors, actuators, RTUs, CNCs, and other field equipment are not suitable as data sources because of their numerous connections to isolated networks that run outdated protocols. Furthermore, since PLCs already collect the data from these components and make it available, there is no need to connect to them directly.

PLCs, on the other hand, efficiently handle all the data from sensors and devices at lower levels, processing the information at their scan-time resolution, which can be as low as ten milliseconds. This information is typically time-series data but can also include calculations and alarm data. PLCs in turn make this information available to higher-level applications through communication protocols such as Modbus, OPC DA, and OPC UA, which makes them great data sources.

SCADA and Historian systems do not gather all the data provided by the connected PLCs. Rather, they focus on collecting the most critical information and data with lower frequency. Initially designed as consumers of industrial data through an OPC client interface, SCADA and Historian systems have also evolved into producers of industrial data. They now fulfill both roles simultaneously by implementing an OPC UA server interface.
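To make the difference in resolution concrete, the sketch below reduces a signal scanned every 10 ms to a one-second series by keeping the last sample in each interval. The function name `downsample_last`, the sample layout, and the keep-last policy are assumptions for illustration only; real SCADA and Historian products apply their own sampling and compression rules.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def downsample_last(samples: List[Sample], interval_ms: int) -> List[Sample]:
    """Keep only the last sample that falls in each interval_ms-wide window.

    Assumes samples are sorted by timestamp (hypothetical layout).
    """
    kept: Dict[int, Sample] = {}
    for timestamp_ms, value in samples:
        window = timestamp_ms // interval_ms
        kept[window] = (timestamp_ms, value)  # later samples overwrite earlier ones
    return [kept[window] for window in sorted(kept)]


# Three seconds of a signal scanned every 10 ms, as a PLC might provide it.
plc_samples = [(t, t / 1000.0) for t in range(0, 3000, 10)]
scada_samples = downsample_last(plc_samples, interval_ms=1000)

print(len(plc_samples), "PLC samples ->", len(scada_samples), "SCADA samples")
```

The higher-level series is far smaller, which is the trade-off the text describes: less data to move, but less detail than the PLC holds.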

Now that we’ve identified our potential data integration sources, let’s examine each source closely, considering their abilities, pros and cons, and connectivity and data-gathering alternatives.

Integrating Data from Programmable Logic Controllers (PLCs)

Within a large industrial facility, numerous PLCs of varying sizes and capabilities can be found, typically arranged in a hierarchical manner. The highest-level PLCs function as data concentrators and are ideal points for data acquisition. Data Concentration PLCs may sometimes relay their information to a standalone OPC UA server, which would then be used as the access point.

In scenarios where you do not have this kind of hierarchical arrangement of PLCs, you’d need to connect to the primary PLC of each working cell, production line, or plant area to collect data.

[Figure: Integrating Data from PLCs]

Pros of Integrating Data from PLCs

PLCs have swift scanning capabilities that ensure a consistent stream of updated data originating from sensors and other production machinery, with a resolution on the order of tens of milliseconds. Most significantly, PLCs stand out for their reliability, stability, and deterministic nature. They typically run with minimal downtime, an essential feature for uninterrupted data collection, and so avoid compromised signal quality or disruptions.

Cons of Integrating Data from PLCs

PLCs are situated at the lower levels of the automation pyramid, meaning data collection occurs close to the hardware with limited abstraction. Such a low level of abstraction introduces complexity in managing what could be thousands of process signals, calculations, and alarms. Moreover, tags may have inconsistent naming conventions across various plant areas, depending on the PLC vendor or the control logic developer. Part 3 of this series discusses contextualizing and normalizing this kind of data before integrating it with enterprise systems.
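As a small illustration of the naming problem, the sketch below takes three hypothetical tag names that different vendors or developers might use for the same kind of signal and splits them into comparable parts. The tag names and the `split_tag` helper are invented for this example and are not taken from any particular PLC.

```python
import re
from typing import Tuple

# Hypothetical raw tag names for the same kind of signal on three lines.
raw_tags = ["Line1_Filler_Temp", "LINE2.FILLER.TEMP", "line3-filler-temp"]


def split_tag(raw: str) -> Tuple[str, ...]:
    """Split a raw tag on common separators and lower-case the parts."""
    return tuple(part.lower() for part in re.split(r"[._\-]", raw) if part)


for raw in raw_tags:
    print(raw, "->", "/".join(split_tag(raw)))
```

Even this trivial case needs a rule per separator and letter case; real plants add abbreviations and vendor prefixes on top, which is why a dedicated normalization step is needed.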

Integrating Data from Supervisory Control and Data Acquisition (SCADA)

SCADA systems can communicate with factory floor field devices using legacy communication protocols, OPC, or Fieldbus protocols. Typically, SCADA applications acquire data at intervals ranging from half a second to one minute. SCADA systems process thousands of signals and describe each one with a standard set of parameters: the tag name, a description, the sampling time, a minimum and a maximum value, and engineering units. They often expose this information through an OPC UA server interface. SCADA systems, therefore, make good access points for acquiring data with some semblance of a data model.
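The tag parameters listed above can be pictured as a small record. The sketch below is one hypothetical way to model them; the class name `TagDefinition`, its field names, and the example values are assumptions and do not correspond to any specific SCADA product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagDefinition:
    """Parameters the text lists for a SCADA tag (field names are illustrative)."""

    name: str
    description: str
    sampling_time_s: float
    min_value: float
    max_value: float
    engineering_units: str

    def in_range(self, value: float) -> bool:
        """Return True if value lies within the configured min/max limits."""
        return self.min_value <= value <= self.max_value


# Hypothetical tag for an oven temperature signal.
oven_temp = TagDefinition(
    name="Oven1.Temperature",
    description="Oven 1 chamber temperature",
    sampling_time_s=1.0,
    min_value=0.0,
    max_value=300.0,
    engineering_units="degC",
)

print(oven_temp.in_range(185.5))  # value inside the configured limits
print(oven_temp.in_range(350.0))  # value outside the configured limits
```

Because every tag carries the same fields, a consumer can interpret a value's units and plausibility without knowing which PLC produced it.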

[Figure: Integrating Data from SCADA]

Pros of Integrating Data from SCADA

A SCADA system primarily functions as a data acquisition system, effectively serving as a robust data concentrator. It collects data from various field devices and from PLCs that control different production lines or functional areas. Because SCADA systems adopt a standardized approach to handling industrial data, often based on a common data model, they make it simpler to identify the data streams required for smart manufacturing implementation.

Furthermore, given their frequent need to interface with MES and ERP systems, they are typically already integrated into the plant or corporate network, which eases the process of transferring data to the enterprise level, thereby streamlining overall data management.

Cons of Integrating Data from SCADA

Compared to PLCs, SCADA systems tend to be less reliable due to several factors. Regular updates to SCADA systems often necessitate application restarts and reboots of the Windows operating system for security patches or installations. Additionally, the modular architecture of SCADA systems may lead to overloads that could disrupt communication tasks essential for data integration with enterprise applications.

Further, a direct connection to a SCADA system often requires communication via its API or SDK, and maintaining and updating connectors for various SCADA systems while ensuring compatibility may demand substantial effort. As with PLCs, an alternative is to connect to SCADA systems through an OPC UA server.

Integrating Data from Historians

The Historian’s ability to interface with various common industrial protocols and Fieldbuses enables it to collect data from various plant-floor devices and systems and log it as time-series data. This data is crucial for tracking and analyzing the performance of machines and systems, and detecting anomalies, which makes it a potential source for data integration.

[Figure: Integrating Data from Historians]

Pros of Integrating Data from Historians

Historians arrange time-series data using a hierarchical model associated with the asset. This model structures data like branches on a tree, creating an organized, logical system for pinpointing and accessing specific data points.
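The tree analogy can be sketched with nested dictionaries, where branches are assets and leaves are time-series tag names. The hierarchy, the tag names, and the `resolve` helper below are hypothetical and only illustrate the idea of locating a data point by its asset path.

```python
from typing import Dict, Union

# Hypothetical asset hierarchy: nested dicts are branches, strings are the
# time-series tag names stored at the leaves.
AssetTree = Dict[str, Union["AssetTree", str]]

asset_model: AssetTree = {
    "Plant1": {
        "Packaging": {
            "Filler1": {
                "Temperature": "PLANT1.PKG.FILLER1.TEMP",
                "Speed": "PLANT1.PKG.FILLER1.SPEED",
            }
        }
    }
}


def resolve(tree: AssetTree, path: str) -> str:
    """Walk a '/'-separated asset path and return the tag name at the leaf."""
    node: Union[AssetTree, str] = tree
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"No such asset path: {path}")
        node = node[part]
    if isinstance(node, dict):
        raise KeyError(f"Path is a branch, not a data point: {path}")
    return node


print(resolve(asset_model, "Plant1/Packaging/Filler1/Speed"))
```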

Much like SCADA systems, Historians are often already integrated into the plant or corporate network, which simplifies transferring data to the cloud and enhances overall data management efficiency. However, unlike SCADA systems, Historians come with superior reliability features, such as built-in data buffering capabilities and store-and-forward mechanisms. This built-in resilience to network disruptions or unforeseen downtime ensures the data’s reliability, provided the source device is available.
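A store-and-forward mechanism can be sketched as a local queue that is drained only while sending succeeds. The `StoreAndForward` and `FakeLink` classes below are hypothetical, and dropping the oldest samples when the buffer is full is an assumed policy; commercial Historians implement buffering in their own ways.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Sample = Tuple[float, str, float]  # (timestamp in seconds, tag, value)


class StoreAndForward:
    """Buffer samples locally and forward them in order once sending succeeds."""

    def __init__(self, send: Callable[[Sample], bool], max_buffered: int = 10_000):
        self._send = send
        # Oldest samples are dropped once the buffer is full (an assumed policy).
        self._buffer: Deque[Sample] = deque(maxlen=max_buffered)

    def publish(self, sample: Sample) -> None:
        self._buffer.append(sample)
        self.flush()

    def flush(self) -> None:
        # Send from the front so ordering is preserved; stop at the first failure.
        while self._buffer and self._send(self._buffer[0]):
            self._buffer.popleft()

    @property
    def pending(self) -> int:
        return len(self._buffer)


class FakeLink:
    """Simulated uplink that can be switched off to mimic a network outage."""

    def __init__(self) -> None:
        self.up = False
        self.delivered: List[Sample] = []

    def send(self, sample: Sample) -> bool:
        if self.up:
            self.delivered.append(sample)
        return self.up


link = FakeLink()
forwarder = StoreAndForward(link.send)

# Two samples arrive while the link is down and are held in the buffer.
forwarder.publish((1.0, "Filler1.Temp", 71.2))
forwarder.publish((2.0, "Filler1.Temp", 71.4))
pending_during_outage = forwarder.pending

# The link recovers; the next publish also flushes the backlog in order.
link.up = True
forwarder.publish((3.0, "Filler1.Temp", 71.5))

print("pending during outage:", pending_during_outage)
print("pending after recovery:", forwarder.pending)
print("delivered:", len(link.delivered))
```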

Cons of Integrating Data from Historians

Historians are designed to optimize data sampling and storage, but they might only gather a portion of the data, limiting the overall data availability. Additionally, data obtained from Historians is in its raw form and does not deliver a time-specific snapshot of an asset, which is an essential aspect of cloud-based analytic applications.
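To show what a time-specific snapshot means in practice, the sketch below derives one from raw per-tag history by taking each tag's last known value at a chosen time. The `snapshot` function, the data layout, and the last-known-value rule are assumptions for illustration, not a description of any Historian's API.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Raw history per tag: (timestamp in seconds, value), sorted by timestamp.
History = Dict[str, List[Tuple[float, float]]]


def snapshot(history: History, at: float) -> Dict[str, Optional[float]]:
    """Return the last known value of every tag at time `at`.

    Tags with no sample at or before `at` map to None.
    """
    result: Dict[str, Optional[float]] = {}
    for tag, samples in history.items():
        timestamps = [ts for ts, _ in samples]
        index = bisect_right(timestamps, at)  # number of samples with ts <= at
        result[tag] = samples[index - 1][1] if index > 0 else None
    return result


# Hypothetical raw history: each tag was logged at different moments.
raw_history: History = {
    "Filler1.Temp": [(0.0, 70.0), (5.0, 71.0), (10.0, 72.5)],
    "Filler1.Speed": [(2.0, 120.0), (9.0, 118.0)],
    "Filler1.Pressure": [(8.0, 2.1)],
}

print(snapshot(raw_history, at=7.0))
```

Because the tags are logged at different times, a consumer has to align them itself before it can reason about the state of the asset at one moment.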

Like PLCs and SCADA, directly connecting to a Historian using its SDK or API poses a maintenance challenge. Again, the alternative could be to connect to it through an OPC UA server.

Integrating Plant-Floor Data to The Enterprise

When gathering data from established industrial systems, you’re bound to accumulate information from diverse sources based on the advantages and disadvantages I’ve previously mentioned. Regardless of the device acting as a data source, a software layer must be in place that communicates with this source via its unique protocol. This layer should be able to request Tag and Time-Series data, making it accessible to higher levels and external systems through a single standardized interface, typically OPC UA.
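The role of this software layer can be sketched as a set of per-source adapters behind one common read interface. Every class below is hypothetical, and the adapters return fixed values instead of speaking a real protocol; in practice this layer would be an OPC UA server or a connectivity platform, not hand-written code.

```python
from abc import ABC, abstractmethod
from typing import Dict


class SourceAdapter(ABC):
    """One adapter per source protocol; all expose the same read method."""

    @abstractmethod
    def read_tags(self) -> Dict[str, float]:
        ...


class FakePlcAdapter(SourceAdapter):
    # Stand-in for a driver that would speak the PLC's native protocol.
    def read_tags(self) -> Dict[str, float]:
        return {"Line1.Filler.Temp": 71.2}


class FakeHistorianAdapter(SourceAdapter):
    # Stand-in for a client that would call a Historian's SDK or API.
    def read_tags(self) -> Dict[str, float]:
        return {"Line1.Filler.EnergyKWh": 1532.0}


class UnifiedTagServer:
    """Single access point that merges tags from every registered adapter."""

    def __init__(self, adapters: Dict[str, SourceAdapter]):
        self._adapters = adapters

    def read_all(self) -> Dict[str, float]:
        merged: Dict[str, float] = {}
        for source_name, adapter in self._adapters.items():
            for tag, value in adapter.read_tags().items():
                merged[f"{source_name}/{tag}"] = value  # prefix avoids name clashes
        return merged


server = UnifiedTagServer({"plc": FakePlcAdapter(), "historian": FakeHistorianAdapter()})
print(server.read_all())
```

Consumers depend only on `read_all`, so adding a new source means adding one adapter instead of touching every downstream system.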

[Figure: Integrating Plant-Floor Data to The Enterprise]

Direct connections to PLCs, CNCs, SCADA, Historians, etc., mean your Edge IT infrastructure has to set up and manage multiple protocol endpoints. This approach is neither scalable, secure, nor reliable. Instead, it’s preferable to utilize a connectivity platform to simplify the management of diverse interfaces and protocols, thereby enhancing data integration efficiency.

KEPServerEX, for instance, could be an appropriate choice for efficient data management. It can interact with various devices and machines, irrespective of the manufacturer, and supports numerous communication protocols. It exposes your plant-floor data for enterprise integration through a single interface, OPC UA, which significantly eases the process of integrating data into enterprise applications.

Data in a standardized communication interface like OPC UA can be integrated into the enterprise network using a communication protocol like MQTT. MQTT is ideal for this stage of integration for several reasons:

Scalability: MQTT’s publish/subscribe model is scalable, especially when dealing with many devices, making it a good fit for extensive manufacturing setups or those expected to grow significantly.

Cloud Integration: MQTT is a popular choice for cloud integration due to its native support on many IoT platforms. If manufacturing data needs to be integrated with a cloud platform, converting it to MQTT can simplify this task.

Real-time Data Processing: MQTT is suitable for real-time data processing thanks to its lightweight, low-overhead messaging. This is advantageous in situations where immediate insights and swift decision-making are crucial.
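As a rough picture of what converting a tag reading to MQTT involves, the sketch below builds a topic and a JSON payload for one value. The topic hierarchy, the payload fields, and the `to_mqtt_message` helper are illustrative assumptions; an MQTT client library would then publish the resulting topic and payload to a broker.

```python
import json
from typing import Tuple


def to_mqtt_message(site: str, area: str, line: str, tag: str,
                    value: float, timestamp_ms: int) -> Tuple[str, bytes]:
    """Build an MQTT topic and JSON payload for one tag reading.

    The topic hierarchy and payload fields are illustrative assumptions.
    """
    topic = "/".join([site, area, line, tag])
    payload = json.dumps({"value": value, "timestamp": timestamp_ms}).encode("utf-8")
    return topic, payload


# Hypothetical reading taken from the standardized plant-floor interface.
topic, payload = to_mqtt_message(
    "plant1", "packaging", "line1", "filler_temp", 71.2, 1700000000000
)
print(topic)
print(payload.decode("utf-8"))
```

A hierarchical topic lets subscribers select data by site, area, or line without the publisher knowing who consumes it, which is where the publish/subscribe model gets its scalability.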

To enhance the quality and speed of real-time insights, it’s imperative that we first contextualize, normalize, and model the acquired and aggregated data before integrating it. Furthermore, to establish the comprehensive context necessary for implementing smart manufacturing, it’s crucial to integrate your plant floor data with various platforms. These include Manufacturing Execution Systems (MES), Enterprise Resource Planning systems (ERP), and Laboratory Information Management Systems (LIMS), among others. As such, the necessity of introducing DataOps to operationalize data management at this layer becomes clear. This topic will be our main focus in Part 3 of this series.


In this article, we have navigated through identifying data sources and integration opportunities as a first step to implementing a data management strategy for smart manufacturing. We addressed identifying sources of plant-floor data and the procedures for data acquisition. We also highlighted the methods to integrate this data, making it accessible via a standardized interface for enterprise integration.

In Part 3 of this six-part series titled A Comprehensive Guide To Industrial Data Management for Smart Manufacturing, we delve into the methods of transforming, standardizing, normalizing, and modelling the collected data. This process is crucial to ensure that the data can be correctly understood and interpreted, thus enhancing its quality and usefulness.

Watch the Part 2 video of our Data Management for Smart Manufacturing series.

Kudzai Manditereza

Kudzai is a tech influencer and electronic engineer based in Germany. As a Developer Advocate at HiveMQ, he helps developers and architects adopt MQTT and HiveMQ for their IIoT projects. Kudzai runs a popular YouTube channel focused on IIoT and Smart Manufacturing technologies and he has been recognized as one of the Top 100 global influencers talking about Industry 4.0 online.
