
Building an elastic high availability MQTT broker cluster on AWS

by Florian Raschbichler

HiveMQ is a cloud-first MQTT broker with elastic clustering capabilities and a resilient software design, which makes it a perfect fit for common cloud infrastructures. A previous blog post discussed the benefits an MQTT broker cluster offers. Today’s post is more practical and shows how to set up a HiveMQ cluster on one of the most popular cloud computing platforms: Amazon Web Services (AWS).

This post has been updated for HiveMQ 4 and the use of AWS Network Load Balancer has been added.

Running HiveMQ on cloud infrastructure

Running a HiveMQ cluster on cloud infrastructure like AWS not only offers elastic scalability of the infrastructure, it also ensures that state-of-the-art security standards are in place on the infrastructure side. These platforms are typically highly available, and new virtual machines can be spawned quickly if necessary. HiveMQ’s ability to add (and remove) cluster nodes at runtime without any manual reconfiguration of the cluster allows scaling linearly on IaaS providers. New cluster nodes can be started (manually or automatically) and the cluster size adapts automatically.

As Amazon Web Services is among the best-known and most widely used cloud platforms, this post illustrates the setup of a HiveMQ cluster on AWS. Note that the concepts shown in this step-by-step guide to running an elastic HiveMQ cluster on AWS also apply to other cloud platforms such as Microsoft Azure or Google Cloud Platform.

Setup and Configuration

Amazon Web Services prohibits the use of UDP multicast, which is the default HiveMQ cluster discovery mode. Using Amazon Simple Storage Service (S3) buckets for auto-discovery is a good alternative when the individual HiveMQ broker nodes are running on AWS EC2 instances. HiveMQ provides a free pre-built extension for AWS S3 cluster discovery.

The following is a step-by-step guide to setting up the brokers on AWS EC2 with automatic cluster member discovery via S3.

Setup Security Group

The first step is to create a security group that allows inbound traffic to the listeners we are going to configure for MQTT communication. It is also vital to have SSH access to the instances. After you have created the security group, edit it and add a rule that allows internal communication between the cluster nodes on all TCP ports (meaning the source of the rule is the security group itself).

To create and edit security groups, go to the EC2 console - NETWORK & SECURITY - Security Groups, and define the inbound and outbound traffic rules there.
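The three inbound rules described above can be written down as data. The following is a minimal sketch, not an official HiveMQ or AWS script: the function name, the example CIDR ranges and the group ID are hypothetical, and the dictionaries follow the `IpPermissions` shape that boto3's `authorize_security_group_ingress` accepts. The sketch itself only builds the rules and does not call AWS.

```python
def hivemq_ingress_rules(group_id, mqtt_port=1883,
                         client_cidr="0.0.0.0/0", ssh_cidr="203.0.113.0/24"):
    """Build the inbound rules for the HiveMQ security group.

    group_id    -- ID of the security group itself (source of the cluster rule)
    client_cidr -- assumed range MQTT clients connect from (hypothetical default)
    ssh_cidr    -- assumed admin range for SSH (hypothetical default)
    """
    return [
        # MQTT listener, reachable by clients
        {"IpProtocol": "tcp", "FromPort": mqtt_port, "ToPort": mqtt_port,
         "IpRanges": [{"CidrIp": client_cidr}]},
        # SSH access for configuring the instances
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": ssh_cidr}]},
        # Cluster-internal traffic: all TCP ports, source is the group itself
        {"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535,
         "UserIdGroupPairs": [{"GroupId": group_id}]},
    ]


if __name__ == "__main__":
    for rule in hivemq_ingress_rules("sg-0123456789abcdef0"):
        print(rule)
```

With boto3 installed and credentials configured, the returned list could be passed as `IpPermissions` together with `GroupId` to `authorize_security_group_ingress`. That call is not exercised here.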

The next step is to create an S3 bucket in the S3 console. Make sure to choose a region close to the region you want to run your HiveMQ instances in.

Create IAM role

We recommend creating an IAM role that grants your EC2 instances access to the S3 bucket and assigning that role to the instances.
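This guide assigns an S3 Full Access role to the instances. If you prefer a role scoped to the single discovery bucket, a policy document along the following lines may be a starting point. This is a sketch under assumptions: the function name and bucket name are hypothetical, and the set of S3 actions the discovery extension actually needs is assumed here, so verify it against the extension's documentation before relying on it.

```python
import json


def s3_discovery_policy(bucket_name):
    """Return an IAM policy document limited to one bucket.

    The listed actions are an assumption about what cluster discovery
    needs (list, read, write and delete node entries).
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # listing applies to the bucket itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {   # object operations apply to the keys inside the bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_discovery_policy("my-hivemq-bucket"), indent=2))
```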

HiveMQ on AWS

To install two HiveMQ broker nodes on two EC2 instances on AWS, we use the HiveMQ AMI:

  1. Launch the AMI in your region of choice.

  2. Select an instance type. We recommend using c5.xlarge for testing purposes.

  3. Configure the instance details

  4. Create 2 instances.

  5. Assign the newly created S3 Full Access role to the instances.

  6. Go to “Configure Security Group”.

  7. Select the Security Group that we just created.

  8. Launch the instances.

This action will automatically spawn two separate EC2 instances that run HiveMQ as a service.

Install and configure HiveMQ S3 Cluster Discovery Extension

Next, we want to enable the cluster mode on both of our HiveMQ instances and provide a way for the instances to discover each other. For this purpose, install the HiveMQ S3 Cluster Discovery Extension.

  • Create an S3 Bucket the HiveMQ instances may use.
    Make sure to remember the bucket name. You can use the default configuration.

The following steps need to be done on each individual HiveMQ instance:

  • Connect to the instance via SSH

ssh -i <your-deployment-key> ec2-user@<instance-ip-address>
  • Switch to the root user

sudo su

  • Download the HiveMQ S3 Cluster Discovery Extension

wget https://releases.hivemq.com/extensions/hivemq-s3-cluster-discovery-extension-4.0.1.zip

  • Unzip the distribution

unzip hivemq-s3-cluster-discovery-extension-4.0.1.zip
  • This will create a folder hivemq-s3-cluster-discovery-extension

  • Open the HiveMQ S3 Cluster Discovery Extension configuration file (you may use a different text editor of course)

vi hivemq-s3-cluster-discovery-extension/s3discovery.properties
  • Configure the S3 Bucket region and name

############################################################
# S3 Bucket                                                #
############################################################

#
# Region for the S3 bucket used by hivemq
# see http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region for a list of regions for S3
# example: us-west-2
#
s3-bucket-region:<your-region>

#
# Name of the bucket used by HiveMQ
#
s3-bucket-name:<your-bucket-name>
  • Change ownership of the extension folder to the hivemq user

chown -R hivemq:hivemq hivemq-s3-cluster-discovery-extension
  • Move the folder into the HiveMQ extensions folder

mv hivemq-s3-cluster-discovery-extension/ /opt/hivemq/extensions/
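When more than a couple of nodes need the same bucket settings, editing s3discovery.properties by hand gets repetitive. The following is a small sketch of how the two values could be filled in by a script. The helper name and the example region and bucket are hypothetical; it only assumes the `key:value` line format shown in the properties excerpt above.

```python
def set_properties(text, updates):
    """Replace the values of existing `key:value` lines.

    Comment lines (starting with '#') and unrelated lines are kept as-is;
    keys that are not present yet are appended at the end.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and ":" in stripped:
            key = stripped.split(":", 1)[0].strip()
            if key in remaining:
                line = f"{key}:{remaining.pop(key)}"
        out.append(line)
    for key, value in remaining.items():
        out.append(f"{key}:{value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "# Region for the S3 bucket used by hivemq\n"
        "s3-bucket-region:<your-region>\n"
        "\n"
        "# Name of the bucket used by HiveMQ\n"
        "s3-bucket-name:<your-bucket-name>\n"
    )
    print(set_properties(sample, {
        "s3-bucket-region": "eu-central-1",      # example value
        "s3-bucket-name": "my-hivemq-bucket",    # example value
    }))
```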

Now that we have the HiveMQ S3 Cluster Discovery Extension successfully installed, let’s adjust the HiveMQ config. Change the /opt/hivemq/conf/config.xml file to look like the following:

<?xml version="1.0"?>
<hivemq>

    <listeners>
        <tcp-listener>
            <port>1883</port>
            <bind-address>0.0.0.0</bind-address>
        </tcp-listener>
    </listeners>

    <cluster>
        <enabled>true</enabled>
        <transport>
            <tcp>               
                <bind-address>IP_ADDRESS</bind-address>
                <bind-port>7800</bind-port>
            </tcp>
        </transport>

        <discovery>
            <extension/>
        </discovery>
    </cluster>

    <anonymous-usage-statistics>
        <enabled>true</enabled>
    </anonymous-usage-statistics>

    <control-center>
        <listeners>
            <http>
                <port>8080</port>
                <bind-address>0.0.0.0</bind-address>
            </http>
        </listeners>
    </control-center>
</hivemq>

Line 15 (the <bind-address> element inside <cluster>, <transport>, <tcp>): Replace IP_ADDRESS with your EC2 instance’s internal IP address.
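Because the internal IP address differs on every instance, this substitution is a natural candidate for scripting. The sketch below uses Python's standard XML parser to set the cluster transport bind address. The function name is hypothetical and the address is passed in as a parameter; on EC2 it can typically be looked up through the instance metadata service, which is not done here.

```python
import ipaddress
import xml.etree.ElementTree as ET


def set_cluster_bind_address(config_xml, ip_address):
    """Return config_xml with the cluster TCP transport bound to ip_address.

    Only <cluster><transport><tcp><bind-address> is changed; the MQTT
    listener's bind address is left untouched. Note that the serialized
    result does not include the original XML declaration.
    """
    ipaddress.ip_address(ip_address)  # raises ValueError for an invalid address
    root = ET.fromstring(config_xml)
    node = root.find("./cluster/transport/tcp/bind-address")
    if node is None:
        raise ValueError("no cluster TCP transport bind-address element found")
    node.text = ip_address
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<hivemq><cluster><enabled>true</enabled><transport><tcp>"
        "<bind-address>IP_ADDRESS</bind-address><bind-port>7800</bind-port>"
        "</tcp></transport></cluster></hivemq>"
    )
    print(set_cluster_bind_address(sample, "10.0.1.23"))  # example address
```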

All that is left to do is to restart the HiveMQ Service on both EC2 instances.

/etc/init.d/hivemq restart

The following log statement in the /opt/hivemq/log/hivemq.log file shows successful cluster establishment:

INFO - Cluster size = 2, members : [8Jojp, WlF1S].
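To check the cluster size from a script instead of reading the log by eye, the log statement can be parsed. This is a minimal sketch that assumes the log line has exactly the format of the sample above; the function name is hypothetical and other HiveMQ versions may format the line differently.

```python
import re

# Matches e.g. "INFO - Cluster size = 2, members : [8Jojp, WlF1S]."
CLUSTER_LINE = re.compile(r"Cluster size = (\d+), members : \[([^\]]*)\]")


def parse_cluster_line(line):
    """Return (size, member_ids) for a cluster log line, or None if it does not match."""
    match = CLUSTER_LINE.search(line)
    if match is None:
        return None
    size = int(match.group(1))
    members = [m.strip() for m in match.group(2).split(",") if m.strip()]
    return size, members


if __name__ == "__main__":
    print(parse_cluster_line("INFO - Cluster size = 2, members : [8Jojp, WlF1S]."))
```

Applied to the last matching line of /opt/hivemq/log/hivemq.log, the returned size can be compared against the number of instances you expect in the cluster.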

Hint: This process can be applied to an arbitrary number of HiveMQ cluster nodes to create clusters of a bigger size than 2 if necessary.

Launch and configure AWS NLB

We are now able to take advantage of rapid elasticity: the HiveMQ cluster can be scaled up or down by adding or removing EC2 instances without administrative intervention. One last step on our way to true high availability is adding a load balancer to our setup. This way our HiveMQ broker cluster acts as a single logical broker node to MQTT clients. An MQTT client only needs to know the load balancer’s URL to connect, publish, and subscribe. The actual number of HiveMQ broker nodes active in the cluster is irrelevant to the MQTT client.

  1. Go to Target Groups of your EC2 account and click “Create target group”.

  2. Name your target group

  3. Choose “Instance” as type

  4. Select “TCP” as protocol

  5. Choose port “1883”

  6. Select the VPC your HiveMQ broker nodes are running in

  7. Select TCP as health check protocol

  8. Click “Create”

  9. Select your newly created target group, go to “Targets”, and click “Edit”

  10. Select your HiveMQ instances

  11. Click “Add to registered”

  12. Save

  13. Go to Load Balancers and click “Create Load Balancer”

  14. Create a Network Load Balancer

  15. Name your Load Balancer and make it internet-facing

  16. Choose “TCP” and Port “1883”

  17. Configure your VPC and availability zones according to your needs. Hint: It is best practice to choose all availability zones.

  18. Go to “Configure Security Settings”

  19. Go to “Configure Routing”. Hint: We recommend using plain TCP on your load balancer and configuring TLS on the HiveMQ broker nodes themselves, as none of AWS’ Load Balancer options are capable of mutual TLS handshakes.

  20. Select our newly created target group and go to “Register Targets”

  21. Go to “Review” and “Create” the Load Balancer

That’s it! Once the Load Balancer has finished provisioning, we can connect to our HiveMQ broker cluster using the Load Balancer’s DNS name.
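A quick way to confirm that a broker answers behind the Load Balancer is to send an MQTT CONNECT packet over plain TCP and check for a successful CONNACK. The following is a rough sketch using only Python's standard library and MQTT 3.1.1 framing. The function names and client ID are hypothetical, and it assumes the brokers accept anonymous connections on port 1883, as in the config shown earlier. For real clients, use a proper MQTT client library.

```python
import socket


def build_connect_packet(client_id, keep_alive=60):
    """Build a minimal MQTT 3.1.1 CONNECT packet with the clean session flag set."""
    client_bytes = client_id.encode("utf-8")
    variable_header = (
        b"\x00\x04MQTT"   # protocol name, length-prefixed
        b"\x04"           # protocol level 4 = MQTT 3.1.1
        b"\x02"           # connect flags: clean session only
        + keep_alive.to_bytes(2, "big")
    )
    payload = len(client_bytes).to_bytes(2, "big") + client_bytes
    remaining = len(variable_header) + len(payload)
    if remaining > 127:
        # This sketch only encodes a single-byte remaining length.
        raise ValueError("client id too long for this sketch")
    return b"\x10" + bytes([remaining]) + variable_header + payload


def broker_accepts_connection(host, port=1883, timeout=5.0):
    """Return True if the endpoint replies with CONNACK return code 0."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_connect_packet("nlb-probe"))
        reply = b""
        while len(reply) < 4:          # CONNACK is exactly 4 bytes
            chunk = sock.recv(4 - len(reply))
            if not chunk:
                break
            reply += chunk
    return len(reply) == 4 and reply[0] == 0x20 and reply[3] == 0x00
```

Calling `broker_accepts_connection` with the Load Balancer’s DNS name as `host` should return True once at least one registered target is healthy. The network call is not exercised in this sketch.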

For production environments it’s recommended to use automatic provisioning of the EC2 instances (e.g. by using Chef, Puppet, Ansible or similar tools) so you don’t need to configure each EC2 instance manually. Of course HiveMQ can also be used with Docker, which can also ease the provisioning of HiveMQ nodes.

Who we are

We love writing about MQTT, IoT protocols and architecture in general. Our experts are here to help, so reach out to us if we can help!

Florian Raschbichler

Florian serves as the head of the HiveMQ support team with years of first hand experience overcoming challenges in achieving reliable, scalable, and secure IoT messaging for enterprise customers.
