How to Stream Data Between HiveMQ Cloud and Apache Kafka for Free
Written by Margaretha Eber, Shashank Sharma
Category: MQTT IoT HiveMQ Cloud
Published: February 9, 2023
According to a study published by Statista, IoT devices will produce 79 zettabytes of data in 2025, a 483% increase from 2019. To put this number into perspective, storing that much data on smartphones with 128 GB of storage each would take roughly 617 billion smartphones. Yet, without further processing, this data is worth almost nothing. Only by transforming and analyzing this data do you unlock the immense added value promised by the Internet of Things (IoT).
A common question is how to actually process the data collected from IoT devices. There are several ways to do this, but one of the most compelling is to collect the data over the MQTT protocol and forward it to Apache Kafka for further processing in a system of your choice.
MQTT and Apache Kafka are often used together to enhance the functionality of IoT and machine-to-machine communications. You commonly see them combined in the following use cases:
- Data collection: MQTT is used to collect data from IoT devices and publish it to a Kafka broker, where it is processed, analyzed, and stored for future use.
- Real-time processing: Using MQTT and Kafka, organizations build real-time data processing pipelines that handle large amounts of incoming data from IoT devices.
The easiest way to move data from your IoT devices to your Kafka service is our newly introduced Kafka integration with HiveMQ Cloud.
In this blog post, we show you the key capabilities of the Kafka-HiveMQ Cloud integration, explain how you can use it to stream your data, and walk you through the setup.
The HiveMQ Cloud Kafka Integration
Before we jump into the step-by-step instructions, let’s look at the benefits of the Kafka-HiveMQ Cloud integration. This simple configuration enables you to stream your data efficiently between your HiveMQ Cloud broker and your Kafka cluster for bidirectional message exchange without ongoing operational burden.
There are five easy steps to ingest data from your IoT devices into the Apache Kafka service of your choice. The settings you provide along the way fall into two broad groups:
- Connection Configuration parameters
- Topic mapping parameters
The connection configuration parameters help establish a secure connection between HiveMQ Cloud and your Apache Kafka cluster. The topic mappings let you set up the bidirectional data flow between your MQTT cluster and Apache Kafka.
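The two parameter groups can be pictured as a small data model. The following is a minimal sketch, not the integration's actual configuration format: every class name, field name, host name, and the `validate` helper are hypothetical and chosen only for illustration, and the SASL mechanism is left as a placeholder.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectionConfig:
    """Connection parameters (hypothetical field names)."""
    bootstrap_servers: list[str]  # "host:port" entries used to fetch cluster metadata
    sasl_mechanism: str           # placeholder: one of the mechanisms the integration offers
    username: str
    password: str


@dataclass(frozen=True)
class TopicMapping:
    """One direction of data flow: a source topic and a destination topic."""
    source_topic: str
    destination_topic: str


@dataclass
class KafkaIntegration:
    connection: ConnectionConfig
    mqtt_to_kafka: list[TopicMapping] = field(default_factory=list)
    kafka_to_mqtt: list[TopicMapping] = field(default_factory=list)
    enabled: bool = False


def validate(config: KafkaIntegration) -> list[str]:
    """Return human-readable problems; an empty list means the sketch is consistent."""
    problems = []
    if not config.connection.bootstrap_servers:
        problems.append("at least one bootstrap server is required")
    for server in config.connection.bootstrap_servers:
        host, sep, port = server.rpartition(":")
        if not sep or not host or not port.isdigit():
            problems.append(f"bootstrap server {server!r} is not in host:port form")
    if not config.connection.username or not config.connection.password:
        problems.append("credentials are required")
    if not (config.mqtt_to_kafka or config.kafka_to_mqtt):
        problems.append("at least one topic mapping is required")
    return problems


# Example values are made up for illustration.
config = KafkaIntegration(
    connection=ConnectionConfig(
        bootstrap_servers=["broker-1.example.com:9092"],
        sasl_mechanism="<mechanism>",
        username="kafka-user",
        password="kafka-password",
    ),
    mqtt_to_kafka=[TopicMapping("factory/line1/temperature", "iot-temperature")],
)
print(validate(config))
```

Keeping the connection settings separate from the topic mappings mirrors the split described above: the first group only needs to be right once, while mappings can be added per direction as your use case grows.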
But first you must find the Kafka extension in the “Integrations” tab inside your HiveMQ Cloud cluster. This integration is available with HiveMQ Cloud.
Note: if you are using the free version of HiveMQ Cloud for the first time to follow these instructions, you can start without adding any payment information.
Now you are ready to dive into the five steps:
- Connect HiveMQ Cloud with the Kafka service of your choice: To connect, you need a list of bootstrap servers for your Kafka cluster so the integration can fetch the initial metadata about your Kafka cluster.
- Secure the connection: Next, add your Kafka credentials so that the connection between HiveMQ Cloud and Kafka is authenticated and secure. We offer two different SASL mechanisms for connection security.
- Send data from HiveMQ to Kafka: Once you set up and secure the connection, you can choose what data to forward from your IoT devices. This requires mapping topics from HiveMQ Cloud to your Kafka cluster. The source topic is the MQTT topic whose messages you want to forward from your HiveMQ cluster. The destination topic is the Kafka topic that receives those messages.
- Establish bidirectional communication: For bidirectional communication between Kafka and HiveMQ, you can define a topic mapping from your Kafka cluster to HiveMQ Cloud in the same way you defined the mapping from HiveMQ Cloud to your Kafka cluster. In this case, the source topic is the Kafka topic from which the integration reads messages. These messages are then published to the defined destination topic on your HiveMQ Cloud MQTT broker cluster.
- Enable the configuration: Start the data flow between your HiveMQ Cloud cluster and your Kafka cluster by selecting the “enable” button.
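To make the two mapping directions from the steps above more concrete, here is a small pure-Python simulation of how topic mappings could route messages. It is a sketch under stated assumptions and not the integration's implementation: the topic names are invented, the sketch assumes that an MQTT source topic may use the standard MQTT wildcards `+` and `#`, and it assumes that a message matching several mappings is forwarded to every matching destination.

```python
def mqtt_filter_matches(topic_filter: str, topic: str) -> bool:
    """Match an MQTT topic against a filter with '+' and '#' wildcards.

    Simplified: special handling of topics starting with '$' is not modeled.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level; matches the rest.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical mappings as (source topic, destination topic) pairs.
MQTT_TO_KAFKA = [
    ("factory/+/temperature", "iot-temperature"),
    ("factory/#", "iot-raw"),
]
KAFKA_TO_MQTT = [
    ("device-commands", "factory/commands"),
]


def kafka_destinations(mqtt_topic: str) -> list[str]:
    """Kafka topics that would receive a message published on mqtt_topic."""
    return [dest for source, dest in MQTT_TO_KAFKA
            if mqtt_filter_matches(source, mqtt_topic)]


def mqtt_destinations(kafka_topic: str) -> list[str]:
    """MQTT topics on which a record read from kafka_topic would be published."""
    return [dest for source, dest in KAFKA_TO_MQTT if source == kafka_topic]


print(kafka_destinations("factory/line1/temperature"))
print(mqtt_destinations("device-commands"))
```

In the first direction the source side is matched as an MQTT topic filter, while in the reverse direction the source is a plain Kafka topic name, which is why the second lookup is a simple equality check.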
If you’ve followed these five steps, you should now be able to use Apache Kafka with HiveMQ Cloud to exchange data with your IoT devices in both directions.
To access the Kafka-HiveMQ Cloud functionality for free, all you need to do is sign up.
The integration is a lightweight version of our HiveMQ Enterprise Extension for Kafka and covers the most frequently requested use cases. If you are still missing functionality, don’t hesitate to reach out to us. We are always keen on direct user feedback.