
Critical Update: JVM Memory Bug in Linux Kernel 6.12+ Can Break HiveMQ Startup

by HiveMQ Team

Linux kernel 6.12+ introduces a change that can break how the OpenJDK JVM detects container memory limits. As a result, HiveMQ brokers may fail silently during startup when percentage-based heap sizing (-XX:MaxRAMPercentage or -XX:MaxRAMFraction) is used.

If you deploy HiveMQ inside containers and you plan to upgrade to kernel 6.12 or later, read this article to understand the impact and prepare your environment.

What Changed in Linux Kernel 6.12+

Starting with Linux kernel 6.12, the kernel no longer exposes cgroup controller information through /proc/cgroups in the format the JVM's container-detection logic expects. The JVM relies on this file to determine whether it is running inside a container. However, /proc/cgroups reflects kernel compile-time configuration, not actual runtime cgroup capabilities (JDK-8349988).

When newer kernels disable certain cgroup v1 controllers, the file contents change in ways the JVM does not handle, and container detection fails silently. This behavior causes an inconsistency in how the JVM calculates memory:

  • Heap sizing falls back to host RAM
    Because container detection fails, the JVM computes -XX:MaxRAMPercentage against host memory instead of the container memory limit.
  • Other JVM APIs still read container limits correctly
    OperatingSystemMXBean.getTotalMemorySize() continues to report the cgroup memory limit accurately.

The mismatch creates a conflict: the JVM allocates a heap larger than the container's total memory, which violates internal assumptions in application startup logic.
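The mismatch can be observed from inside a running container with a small diagnostic. The following is a minimal sketch, not HiveMQ code: the class name HeapVsContainerMemory and the helper headroomBytes are hypothetical, and the sketch assumes a HotSpot-based JDK 14 or later, where com.sun.management.OperatingSystemMXBean provides getTotalMemorySize().

```java
import java.lang.management.ManagementFactory;

// Hypothetical diagnostic (not part of HiveMQ): compares the JVM's maximum heap
// with the total memory reported by the OperatingSystemMXBean.
class HeapVsContainerMemory {

    // Memory left once the maximum heap is subtracted from total memory.
    // A result <= 0 means the heap is sized larger than the memory available.
    static long headroomBytes(long totalMemoryBytes, long maxHeapBytes) {
        return totalMemoryBytes - maxHeapBytes;
    }

    public static void main(String[] args) {
        // JDK-specific extension of the standard OperatingSystemMXBean.
        com.sun.management.OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(
                        com.sun.management.OperatingSystemMXBean.class);

        long totalMemory = os.getTotalMemorySize();      // per the article: the cgroup limit
        long maxHeap = Runtime.getRuntime().maxMemory(); // heap ceiling chosen by the JVM
        long headroom = headroomBytes(totalMemory, maxHeap);

        System.out.printf("Total memory (MXBean): %d bytes%n", totalMemory);
        System.out.printf("Max heap (Runtime):    %d bytes%n", maxHeap);
        System.out.printf("Total - heap:          %d bytes%n", headroom);
        if (headroom <= 0) {
            System.out.println("Heap exceeds reported memory: heap was likely sized from host RAM.");
        }
    }
}
```

Run it with the same heap flags as the broker (for example -XX:MaxRAMPercentage=50) inside the same container. A zero or negative difference suggests the heap was sized from host RAM rather than from the container limit.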

Important: This change originates in the Linux kernel, not HiveMQ, but directly affects HiveMQ deployments that run in containers on the updated kernel.

How This Impacts HiveMQ Deployments

When the underlying Linux kernel is upgraded to 6.12 or later, HiveMQ brokers may terminate during startup without clear error output. During bootstrap, HiveMQ dynamically allocates memory across its internal subsystems to balance performance and resource efficiency. Kernel 6.12 breaks the memory detection logic that this allocation depends on. When detection fails, the allocation produces an invalid result that causes the broker to exit before startup completes.

Note: Non-containerized deployments are not affected. This failure mode only impacts containerized deployments that rely on JVM percentage-based heap sizing. Bare-metal and VM-based deployments are not affected.

Determine if You Are Affected

Follow these steps before updating your Linux kernel or container base images:

  1. Check Your Linux Kernel Version
    Kernel 6.12 or later triggers this issue. Run uname -r on the host or check the release notes of your base image.

  2. Verify Your JVM Heap Configuration
    You are affected if you use -XX:MaxRAMPercentage or -XX:MaxRAMFraction for heap sizing.
    If you use explicit -Xmx / -Xms values instead, you are not affected by this issue.

  3. Check Your JDK Version
    The fix for JDK-8349988 is included in OpenJDK 21 starting with JDK 21.0.10.
    Current HiveMQ Docker images ship with JDK 21.0.9, which does not include this fix (unless you have customized your image to use a newer JDK version).

  4. Watch for Silent Startup Failures
    After any kernel upgrade, monitor your HiveMQ broker startup logs carefully.
    The IllegalArgumentException that causes this failure is not caught by HiveMQ's default exception handler. It propagates through Guice as a ProvisionException, injector creation fails, and the broker process exits with no visible error in the HiveMQ logs.
    The primary symptom is a broker that terminates during startup with no error output.

    To confirm the root cause, temporarily add -Xlog:exceptions=info to your JVM options to surface the exception. This flag produces verbose output and is not recommended for long-term production use.
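The first three checks above can be roughly automated. The following is a sketch under stated assumptions, not an official HiveMQ tool: the class AffectedCheck and its method names are hypothetical, the kernel release is read from the os.version system property (which on Linux normally matches uname -r), and the JDK check only covers the JDK 21 line because that is the only fixed version the article names.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HiveMQ): a rough self-check for steps 1 to 3.
class AffectedCheck {

    private static final Pattern KERNEL = Pattern.compile("^(\\d+)\\.(\\d+)");

    // Step 1: true if a release string such as "6.12.4-arch1-1" is 6.12 or later.
    static boolean kernelIs612OrLater(String release) {
        Matcher m = KERNEL.matcher(release);
        if (!m.find()) {
            return false; // unrecognized format: treated as "not detected"
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        return major > 6 || (major == 6 && minor >= 12);
    }

    // Step 2: true if any JVM argument selects percentage- or fraction-based heap sizing.
    static boolean usesPercentageHeapSizing(List<String> jvmArgs) {
        return jvmArgs.stream().anyMatch(a ->
                a.startsWith("-XX:MaxRAMPercentage") || a.startsWith("-XX:MaxRAMFraction"));
    }

    // Step 3: true if the runtime is on the JDK 21 line and older than 21.0.10.
    static boolean isJdk21WithoutFix(Runtime.Version v) {
        return v.feature() == 21
                && v.compareToIgnoreOptional(Runtime.Version.parse("21.0.10")) < 0;
    }

    public static void main(String[] args) {
        String kernel = System.getProperty("os.version");
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        Runtime.Version jdk = Runtime.version();

        System.out.println("Kernel 6.12 or later:          " + kernelIs612OrLater(kernel) + " (" + kernel + ")");
        System.out.println("Percentage-based heap sizing:  " + usesPercentageHeapSizing(jvmArgs));
        System.out.println("JDK 21 without the fix:        " + isJdk21WithoutFix(jdk) + " (" + jdk + ")");
    }
}
```

The sketch only sees the arguments of the JVM it runs in, so start it with the same JVM options and in the same container image as the broker. If all three lines print true, the deployment matches the criteria described above.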

Recommended Action

Immediate Workaround: Use Explicit Heap Sizing

HiveMQ Docker images currently bundle JDK 21.0.9, which does not include the cgroup detection fix. Since the JRE is part of the HiveMQ base image, upgrading the JDK in containerized deployments requires a new HiveMQ image release.

Until an updated image is available, replace percentage-based heap sizing with explicit memory values:

    JAVA_OPTS="-Xmx2048m -Xms1024m"
  

The example assumes a container with a 4096 MB memory limit: -Xmx2048m allocates 50% of the available memory (equivalent to -XX:MaxRAMPercentage=50). This configuration bypasses the broken cgroup detection path and prevents the negative-memory calculation during broker startup.

Adjust the values to match your container memory limit.
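To translate an existing percentage setting into an explicit value, the arithmetic is simply the container limit multiplied by the percentage. The helper below is an illustrative sketch: the class ExplicitHeapFlags and the method xmxFor are hypothetical names, and the container limit is assumed to be known in MiB.

```java
// Hypothetical helper (not part of HiveMQ): derives an explicit -Xmx flag
// from a container memory limit and the percentage previously configured.
class ExplicitHeapFlags {

    // Returns an -Xmx flag sized to the given percentage of the container limit,
    // rounded down to whole MiB.
    static String xmxFor(long containerLimitMiB, double percent) {
        if (containerLimitMiB <= 0 || percent <= 0 || percent >= 100) {
            throw new IllegalArgumentException("limit must be > 0 and percent in (0, 100)");
        }
        long heapMiB = (long) Math.floor(containerLimitMiB * percent / 100.0);
        return "-Xmx" + heapMiB + "m";
    }

    public static void main(String[] args) {
        // 50% of a 4096 MiB container, matching the example above.
        System.out.println(xmxFor(4096, 50)); // prints -Xmx2048m
    }
}
```

Rounding down keeps the heap at or below the intended share of the container limit.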

Alternative Workaround: Delay the Kernel Upgrade

If explicit heap sizing is not practical for your environment, remain on Linux kernel 6.11 or earlier until updated HiveMQ images are available. Be aware that staying on older kernel versions may result in missing security and stability improvements.

Recommended Fix: Use Updated HiveMQ Images

The root cause fix requires a JDK build that includes the patch for JDK-8349988. For the JDK 21 line, that is JDK 21.0.10. Updated HiveMQ images bundling JDK 21.0.10 will be available in an upcoming release.

This post will be updated when the new HiveMQ images are available, and the fix will be noted in the relevant release notes.

Important: JDK 21.0.10 Also Changes TLS Cipher Suite Defaults
JDK 21.0.10 is the same update that disables TLS_RSA cipher suites by default, which can cause MQTT client connection failures. Before upgrading, verify that your MQTT clients do not depend on deprecated RSA cipher suites. See Critical Java Security Update: TLS_RSA Deprecation in JDK-21.0.10 for details and workarounds.

If you are currently delaying the JDK upgrade as a workaround for the TLS_RSA issue, staying on JDK 21.0.9 leaves you exposed to this cgroup detection bug on kernel 6.12+. To resolve both issues simultaneously, upgrade to JDK 21.0.10 and apply the TLS_RSA re-enablement workaround.

Technical Details

The following exception is thrown when the negative memory calculation occurs:

    java.lang.IllegalArgumentException: Physical memory - heap memory must be > 0, calculated value -6336413696
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
        at com.hivemq.persistence.local.rocksdb.WriteBufferManagerHolder.getAvailable..

The exception propagates through Guice as a ProvisionException, injector creation fails, and the broker process exits (main() returns).

Additional Resources

Questions or concerns? Contact HiveMQ. Our team monitors these updates closely and can help you navigate the migration.

