MQTT Essentials Part 10: Keep Alive and Client Take-Over


Welcome to the tenth part of the MQTT Essentials, a blog series about the core features and concepts in the MQTT protocol. In this post we will cover the Keep Alive feature of MQTT and why it is especially important for mobile networks.

Problem of half-open TCP connections

As we already know, MQTT is based on TCP, which guarantees that packets are transferred over the internet in a “reliable, ordered and error-checked” way. Nevertheless, one of the communicating parties can get out of sync with the other, often because one side crashed or because of transmission errors. This state is called a half-open connection. The important point is that the end that is still functioning is not notified about the failure of the other side and keeps trying to send messages and waiting for acknowledgements.

The problems with half-open connections increase in mobile networks, as the following quote from Andy Stanford-Clark, co-inventor of the MQTT protocol, explains:

Although TCP/IP in theory notifies you when a socket breaks, in practice, particularly on things like mobile and satellite links, which often “fake” TCP over the air and put headers back on at each end, it’s quite possible for a TCP session to “black hole”, i.e. it appears to be open still, but in fact is just dumping anything you write to it onto the floor.

Andy Stanford-Clark on the topic “Why is the keep-alive needed?”

MQTT Keep Alive

To work around the issue of half-open connections, or at least make it possible to assess whether the connection is still open, MQTT provides the keep alive functionality.

The keep alive functionality ensures that the connection is still open and that broker and client are still connected to one another. To this end, the client specifies a time interval in seconds and communicates it to the broker when the connection is established. The interval is the longest period of time that broker and client can endure without a message being exchanged.
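
As a rough illustration of how the interval reaches the broker, the sketch below builds a minimal MQTT 3.1.1 CONNECT packet by hand and writes the keep alive value into the variable header as a 2-byte big-endian integer. The helper names `encode_remaining_length` and `build_connect` and the client identifier `sensor-1` are made up for this example; in practice a client library takes care of this.

```python
import struct


def encode_remaining_length(length: int) -> bytes:
    """Encode the MQTT 'remaining length' field (7 bits per byte, top bit = more bytes follow)."""
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80
        out.append(digit)
        if length == 0:
            return bytes(out)


def build_connect(client_id: str, keep_alive: int) -> bytes:
    """Build a minimal CONNECT packet: clean session, no will, no credentials."""
    if not 0 <= keep_alive <= 0xFFFF:
        raise ValueError("keep alive must fit into 2 bytes (0..65535 seconds)")
    cid = client_id.encode("utf-8")
    variable_header = (
        struct.pack("!H", 4) + b"MQTT"   # protocol name, length-prefixed
        + bytes([0x04])                   # protocol level 4 = MQTT 3.1.1
        + bytes([0x02])                   # connect flags: clean session only
        + struct.pack("!H", keep_alive)   # keep alive interval in seconds
    )
    payload = struct.pack("!H", len(cid)) + cid
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


packet = build_connect("sensor-1", keep_alive=60)
print(packet.hex())
```

With this layout, the two keep alive bytes sit at offsets 10 and 11 of the packet, directly after the connect flags.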

The MQTT specification says the following:

It is the responsibility of the Client to ensure that the interval between Control Packets being sent does not exceed the Keep Alive value. In the absence of sending any other Control Packets, the Client MUST send a PINGREQ Packet.

That means as long as messages are exchanged frequently and the keep alive interval is not exceeded, there is no need to send an extra message to ensure that the connection is still open.

But if the client doesn’t send any messages during the keep alive period, it must send a PINGREQ packet to the broker to confirm its availability and to make sure the broker is still available.

The broker must disconnect a client that doesn’t send a PINGREQ or any other message within one and a half times the keep alive interval. Likewise, the client should close the connection if the response from the broker isn’t received within a reasonable amount of time.
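
The two timing rules can be sketched as a pair of small checks. This is only an illustration under simple assumptions: the function names are hypothetical, timestamps are plain seconds, and a keep alive of 0 is treated as "mechanism switched off".

```python
def client_must_ping(keep_alive: int, last_sent: float, now: float) -> bool:
    """True once the client has been silent for a full keep alive interval."""
    if keep_alive == 0:          # 0 switches the mechanism off
        return False
    return now - last_sent >= keep_alive


def broker_must_disconnect(keep_alive: int, last_received: float, now: float) -> bool:
    """True once nothing arrived from the client for 1.5x the keep alive interval."""
    if keep_alive == 0:
        return False
    return now - last_received > 1.5 * keep_alive


# Timestamps are plain seconds here; real code would use time.monotonic().
print(client_must_ping(60, last_sent=0, now=30))            # False
print(client_must_ping(60, last_sent=0, now=60))            # True
print(broker_must_disconnect(60, last_received=0, now=80))  # False
print(broker_must_disconnect(60, last_received=0, now=91))  # True
```

A real client would typically send its PINGREQ somewhat before the interval runs out, so that network latency does not push it over the limit.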

Keep Alive Flow

Let’s have a look at the keep alive messages in detail. There are two messages involved in the keep alive functionality.

PINGREQ

The PINGREQ is sent by the client and indicates to the broker that the client is still alive, even if it hasn’t sent any other packets (PUBLISH, SUBSCRIBE, etc.). The client can send a PINGREQ at any time to make sure the network connection is still alive. The PINGREQ packet doesn’t have any payload.

PINGRESP

When receiving a PINGREQ, the broker must reply with a PINGRESP packet to indicate its availability to the client. Like the PINGREQ, this packet doesn’t contain any payload.
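
Because neither packet carries a payload, each one consists only of the 2-byte fixed header: the packet type in the upper four bits of the first byte (12 for PINGREQ, 13 for PINGRESP) followed by a remaining length of 0. A minimal sketch of the exchange, with the hypothetical helpers `packet_type` and `broker_reply`:

```python
PINGREQ = bytes([0xC0, 0x00])   # packet type 12, remaining length 0
PINGRESP = bytes([0xD0, 0x00])  # packet type 13, remaining length 0


def packet_type(packet: bytes) -> int:
    """Extract the control packet type from the first byte of the fixed header."""
    return packet[0] >> 4


def broker_reply(packet: bytes):
    """Return the broker's answer to a PINGREQ, or None for any other packet."""
    if packet == PINGREQ:
        return PINGRESP
    return None


print(packet_type(PINGREQ), packet_type(PINGRESP))  # 12 13
print(broker_reply(PINGREQ) == PINGRESP)             # True
```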

Good to Know

  • If the broker doesn’t receive a PINGREQ or any other packet from a particular client within one and a half times the keep alive interval, it will close the connection and send out the last will and testament message (if the client had specified one).
  • The MQTT client is responsible for setting the right keep alive value. For example, it can adapt the interval to its current signal strength.
  • The maximum keep alive is 18h 12min 15sec.
  • If the keep alive interval is set to 0, the keep alive mechanism is deactivated.
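
The maximum follows from the size of the keep alive field: it is 2 bytes wide, so the largest value it can hold is 65,535 seconds. The arithmetic can be checked in a few lines (the constant name is just for this example):

```python
MAX_KEEP_ALIVE = 2**16 - 1  # largest value of a 2-byte unsigned field, in seconds

hours, rest = divmod(MAX_KEEP_ALIVE, 3600)
minutes, seconds = divmod(rest, 60)
print(f"{MAX_KEEP_ALIVE} s = {hours}h {minutes}min {seconds}sec")
# prints: 65535 s = 18h 12min 15sec
```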

Client Take-Over

A disconnected client will most likely try to connect again. It could be the case that the broker still has a half-open connection for the same client. In this scenario the broker will perform a so-called client take-over. It closes the previous connection to the same client (determined by the same client identifier) and establishes the connection with the newly connected client. This behavior makes sure that a half-open connection won’t stand in the way of the same client establishing a new connection.
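
Conceptually, a take-over boils down to a lookup by client identifier. The following is a simplified sketch, not how any particular broker implements it; the `Broker` and `FakeConnection` classes are invented for this illustration and there is no real networking involved.

```python
class FakeConnection:
    """Stand-in for a TCP connection; only tracks whether it is still open."""

    def __init__(self, name: str):
        self.name = name
        self.open = True

    def close(self) -> None:
        self.open = False


class Broker:
    def __init__(self):
        self.connections = {}  # client identifier -> current connection

    def on_connect(self, client_id: str, connection: FakeConnection) -> None:
        old = self.connections.get(client_id)
        if old is not None and old is not connection:
            old.close()  # take-over: drop the (possibly half-open) old connection
        self.connections[client_id] = connection


broker = Broker()
stale = FakeConnection("half-open")
fresh = FakeConnection("reconnect")

broker.on_connect("client-42", stale)
broker.on_connect("client-42", fresh)   # same client identifier connects again

print(stale.open, fresh.open)  # False True
```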


So that’s the end of part ten in our MQTT Essentials series. We hope you enjoyed the whole series. This was the last official post, but we have planned an MQTT Essentials Special for next week, which will be about MQTT over WebSockets. We already have a lot of great ideas for topics to cover in the future, so stay tuned for more helpful content about MQTT and HiveMQ.

Have a great week, and we hope to see you on the next MQTT Monday!


18 comments

  1. Pitouli says:

    This was super instructive !
    I loved the whole series: I already had the basics, so parts 1 to 5 were more “refreshing” than “necessary” in my case (but are surely an excellent start for a complete beginner), but I learned a lot of things in parts 6 to 10.
    I especially loved the “Best Practices”, such as the “online/offline” based on LWT and Retain message…
    Thank you!

  2. Hi
    I was just wondering: in MQTT-SN too, the client has to form a connection before publishing/subscribing... isn’t this additional overhead?

    1. Hi Uwe,

      thanks, we fixed that link!

      Best,
      Dominik from the HiveMQ Team

  3. Christopher Donovan says:

    Great series. I am a beginner in the realm of IoT and messaging; it was very informative, well written, short but to the point, and the concepts involved were easy to understand.

    Thanks.

  4. Ervin says:

    “It can happen that one of the communicating parties gets out of sync with the other, often due to a crash of one side or because of transmission errors”. Does this mean it was an ungraceful disconnect?

    “The important point is that the still functioning end is not notified about the failure of the other side and is still trying to send messages and wait for acknowledgements”. What about the Last Will and Testament?

    Might I have missed something important?

    1. Hi Ervin,

      the key point here is that a half-open connection looks like an open connection and thus the parties (or one side) thinks the connection is still OK although it’s essentially blackholing. LWT is only sent after a broken TCP connection was detected (which needs the keep-alive in order to circumvent half-open sockets).

      Hope this helps,
      Dominik from the HiveMQ Team

  5. Nitin Ratnakaran says:

    Thanks a lot for this series. I’ve been searching for material on MQTT and this blog really explained all the concepts in simple language. Much appreciated.

  6. Oliver E says:

    Thanks for the series! Good intro.

  7. Thanks a lot for this series. I am new to MQTT, but after reading this series I understand MQTT better.

    Thanks HiveMQ for this series.

  8. Kyle H says:

    Very informative series.

    Appreciate the work :)

  9. Max Morlock says:

    Is there a reason why the maximum keep alive period is exactly 18h 12min 15sec? Why not 12h or 24h?

    Thanks for the informative series!

    1. Hi Max,

      yes, there is a reason. For the keep alive value the MQTT protocol has allocated a size of 2 bytes. (see http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/errata01/os/mqtt-v3.1.1-errata01-os-complete.html#_Toc385349238)
      This makes it possible to store a number between 0 and 65,535.

      When you convert 65535 seconds into hours, minutes and seconds it is 18h 12min 15sec.

      Best,
      Christian from the HiveMQ Team

  10. Sudheer says:

    Thanks for the detailed series.

    Is there any association between TCP connection and client identifier? I assume broker can store client identifier and TCP connection information in the persistent session, if exists. What if client opted to not having a persistent session?

    1. Hi Sudheer,
      If a client is connected then its client identifier is indirectly associated with its TCP connection.
      If a client does not have a persistent session, the information is removed when the client disconnects or the TCP connection is gone.
      Hope that helps,
      The HiveMQ Team

  11. Sudheer says:

    Thanks for the response HiveMQ Team.

    I’m thinking of possibility to multiplex different client connections to broker using a load balancer. Since client identifier is sent only in the CONNECT packet, I believe it is not possible.
    Please let me know your thoughts on this.

    Thanks,
    Sudheer

    1. Hi Sudheer,

      you are right, this is currently not possible.

      Regards,
      The HiveMQ Team.
