1. Overview
In distributed architectures, applications usually need to exchange data among themselves. On the one hand, this can be done by communicating directly with each other. On the other hand, to reach high availability and partition tolerance, and to get loose coupling between applications, messaging is an appropriate solution.
For messaging, we can choose between multiple products. The Apache Software Foundation provides ActiveMQ and Kafka, which we’ll compare with each other in this article.
2. General Facts
2.1. Active MQ
Active MQ is one of the traditional message brokers, whose goal is to ensure data exchange between applications in a safe and reliable way. It deals with a small amount of data and is therefore specialized for well-defined message formats and transactional messaging.
We must note that there’s another edition besides this “classic” one: Active MQ Artemis. This next-generation broker is based on HornetQ, whose code was made available to the Apache Foundation by Red Hat in 2015. The Active MQ website states:
Once Artemis reaches a sufficient level of feature parity with the “Classic” code-base it will become the next major version of ActiveMQ.
So, for the comparison, we need to consider both editions. We’ll differentiate between them by using the terms “Active MQ” and “Artemis”.
2.2. Kafka
In contrast to Active MQ, Kafka is a distributed system meant for processing a huge amount of data. We can use it for traditional messaging as well as:
- website activity tracking
- metrics
- log aggregation
- stream processing
- event sourcing
- commit logs
These requirements have gained great importance with the emergence of typical cloud architectures built using microservices.
2.3. The Role of JMS and the Evolution of Messaging
The Java Message Service (JMS) is a common API for sending and receiving messages within Java EE applications. It is part of the early evolution of messaging systems, and it’s still a standard today. In Jakarta EE, it was adopted as Jakarta Messaging. So, it might be helpful to understand the core concepts:
- a Java-native, but vendor-independent API
- the need for a JCA Resource Adapter to implement the vendor-specific communication protocol
- message destination models:
  - Queues (P2P) to ensure message ordering and one-time message processing even in case of multiple consumers
  - Topics (PubSub) as an implementation of the Publish-Subscribe Pattern, which means that multiple consumers will receive messages for the duration of their subscription to the topic
- message formats:
  - Headers as standardized meta information that the broker deals with (like priority or expiration date)
  - Properties as non-standardized meta information that the consumer can use for message processing
  - the Body containing the payload – JMS declares five types of messages, but this is only relevant for using the API, not for this comparison
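To make the difference between the two destination models more tangible, here is a minimal in-memory sketch. It is not the JMS API: the class names `DestinationModelSketch`, `SimpleQueue`, and `SimpleTopic` are hypothetical and only illustrate the delivery semantics described above, without persistence, acknowledgements, or durable subscriptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy in-memory model of the two JMS destination types (not the JMS API itself).
class DestinationModelSketch {

    // P2P: each message is handed out exactly once, in FIFO order,
    // no matter how many consumers call receive().
    static class SimpleQueue {
        private final Queue<String> messages = new ArrayDeque<>();

        void send(String message) {
            messages.add(message);
        }

        // Returns the next message, or null if the queue is empty.
        String receive() {
            return messages.poll();
        }
    }

    // PubSub: every subscriber registered at publish time gets its own copy.
    static class SimpleTopic {
        private final List<List<String>> subscriberInboxes = new ArrayList<>();

        // Registers a subscriber and returns its (initially empty) inbox.
        List<String> subscribe() {
            List<String> inbox = new ArrayList<>();
            subscriberInboxes.add(inbox);
            return inbox;
        }

        void publish(String message) {
            for (List<String> inbox : subscriberInboxes) {
                inbox.add(message);
            }
        }
    }

    public static void main(String[] args) {
        SimpleQueue queue = new SimpleQueue();
        queue.send("order-1");
        queue.send("order-2");
        String forConsumerA = queue.receive(); // consumer A takes the first message
        String forConsumerB = queue.receive(); // consumer B takes the next one
        System.out.println("Queue: A got " + forConsumerA + ", B got " + forConsumerB);

        SimpleTopic topic = new SimpleTopic();
        List<String> earlySubscriber = topic.subscribe();
        topic.publish("price-update-1");
        List<String> lateSubscriber = topic.subscribe(); // subscribed too late for update 1
        topic.publish("price-update-2");
        System.out.println("Topic: early=" + earlySubscriber + ", late=" + lateSubscriber);
    }
}
```

In this sketch, the late subscriber only sees messages published during its subscription, which mirrors the non-durable topic behavior described in the list.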
However, the evolution went in an open and independent direction – independent from the platform of the consumer and producer, and independent from the vendors of messaging brokers. There are protocols defining their own destination models:
- AMQP – binary protocol for vendor-independent messaging – uses generic nodes
- MQTT – lightweight binary protocol for embedded systems and IoT – uses topics
- STOMP – a simple text-based protocol that allows messaging even from the browser – uses generic destinations
Another development came with the spread of cloud architectures: the reliable transmission of individual messages (“traditional messaging”) has been joined by the processing of large amounts of data according to the “Fire and Forget” principle. We can say that the comparison between Active MQ and Kafka is a comparison of exemplary representatives of these two approaches. For example, an alternative to Kafka could be NATS.
3. Comparison
In this section, we’ll compare the most interesting features of architecture and development between Active MQ and Kafka.
3.1. Message Destination Models, Protocols, and APIs
Active MQ fully implements the JMS message destination model of Queues and Topics and maps AMQP, MQTT and STOMP messages to them. For example, a STOMP message is mapped to a JMS BytesMessage within a Topic. Additionally, it supports OpenWire, which allows cross-language access to Active MQ.
Artemis defines its own message destination model independently from the standard APIs and protocols and also needs to map them to this model:
- Messages are sent to an Address, which is given a unique name, a Routing Type, and zero or more Queues.
- A Routing Type determines how messages are routed from an address to the queue(s) bound to that address. There are two types defined:
  - ANYCAST: messages are routed to a single queue on the address
  - MULTICAST: messages are routed to every queue on the address
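The address model can be sketched with a small in-memory simulation. This is not the Artemis API: `AddressRoutingSketch`, `Address`, and `bindQueue` are hypothetical names, and picking the ANYCAST target queue in turn is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of an address with a routing type and bound queues (not the Artemis API).
class AddressRoutingSketch {

    enum RoutingType { ANYCAST, MULTICAST }

    static class Address {
        private final RoutingType routingType;
        // Queue name -> messages routed to that queue, in binding order.
        private final Map<String, List<String>> queues = new LinkedHashMap<>();
        private int nextQueueIndex = 0; // assumption: ANYCAST picks queues in turn

        Address(RoutingType routingType) {
            this.routingType = routingType;
        }

        void bindQueue(String queueName) {
            queues.put(queueName, new ArrayList<>());
        }

        List<String> queue(String queueName) {
            return queues.get(queueName);
        }

        void send(String message) {
            if (queues.isEmpty()) {
                return; // zero queues bound: nothing to route to in this sketch
            }
            List<List<String>> bound = new ArrayList<>(queues.values());
            if (routingType == RoutingType.MULTICAST) {
                // MULTICAST: every bound queue receives its own copy.
                for (List<String> q : bound) {
                    q.add(message);
                }
            } else {
                // ANYCAST: exactly one bound queue receives the message.
                bound.get(nextQueueIndex % bound.size()).add(message);
                nextQueueIndex++;
            }
        }
    }

    public static void main(String[] args) {
        Address orders = new Address(RoutingType.ANYCAST);
        orders.bindQueue("q1");
        orders.bindQueue("q2");
        orders.send("m1");
        orders.send("m2");
        System.out.println("ANYCAST:   q1=" + orders.queue("q1") + " q2=" + orders.queue("q2"));

        Address prices = new Address(RoutingType.MULTICAST);
        prices.bindQueue("q1");
        prices.bindQueue("q2");
        prices.send("m1");
        prices.send("m2");
        System.out.println("MULTICAST: q1=" + prices.queue("q1") + " q2=" + prices.queue("q2"));
    }
}
```

Seen this way, an ANYCAST address with a single queue behaves like a JMS Queue, while a MULTICAST address with one queue per subscriber behaves like a JMS Topic.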
Kafka only defines Topics, which consist of multiple Partitions (at least 1) and Replicas that can be placed on different brokers. Finding the optimal strategy for partitioning a topic is a challenge. We must note that:
- Each message is written to exactly one partition.
- Ordering is only ensured for messages within one partition.
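- By default, messages without a key are spread across the topic’s partitions: round-robin in older clients, while clients since Kafka 2.4 fill a batch for one partition before moving on to another.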
- If we use message keys, then messages with the same key will land in the same partition.
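The effect of message keys can be illustrated with a few lines of plain Java. This is only a sketch of the idea "hash the key, then take it modulo the number of partitions". The class `KeyPartitioningSketch` and the method `partitionFor` are hypothetical, and `Arrays.hashCode` is a stand-in: the real Kafka client uses a different hash function (murmur2), so the partition numbers computed here will not match those of an actual cluster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrates key-based partition selection; not the Kafka client's partitioner.
class KeyPartitioningSketch {

    // Maps a key deterministically to a partition in the range [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("a topic needs at least one partition");
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes); // stand-in hash, see the note above
        return Math.floorMod(hash, numPartitions); // floorMod avoids negative results
    }

    public static void main(String[] args) {
        int partitions = 3;
        for (String key : new String[] {"customer-42", "customer-7", "customer-42"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

Because the mapping is deterministic, all messages for `customer-42` end up in the same partition and therefore keep their relative order. The result depends on the number of partitions, so adding partitions later changes which partition a key maps to.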
Kafka has its own APIs. Although there’s also a Resource Adapter for JMS, we should be aware that the concepts are not fully compatible. AMQP, MQTT, and STOMP are not supported officially, but there are Connectors for AMQP and MQTT.
3.2. Message Format and Processing
Active MQ supports a JMS standard message format consisting of headers, properties, and the body (as described above). The broker has to maintain the delivery state of every message, which results in lower throughput. Since JMS supports both styles, consumers can synchronously pull messages from the destination, or the broker can push messages to them asynchronously.
Kafka does not prescribe a format for the message payload; this is entirely the responsibility of the producer. There isn’t any delivery state per message, just an Offset per consumer group and partition. An Offset marks how far the consumer has read in the partition. Not only is this faster, but it also allows messages to be consumed again by resetting the offset, without having to ask the producer to re-send them.
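The offset idea can be sketched as an append-only list plus one integer per consumer. This is a simplified model, not the Kafka consumer API: `OffsetLogSketch`, `PartitionLog`, `poll`, and `seek` are hypothetical names chosen for this example. The sketch stores the offset as the position of the next record to read, which is one past the last record delivered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one partition with per-consumer offsets (not the Kafka API).
class OffsetLogSketch {

    static class PartitionLog {
        private final List<String> records = new ArrayList<>();
        // Consumer name -> position of the next record to read.
        private final Map<String, Integer> offsets = new HashMap<>();

        void append(String record) {
            records.add(record); // records are never removed when consumed
        }

        // Returns up to maxRecords records from the consumer's offset and advances it.
        List<String> poll(String consumer, int maxRecords) {
            int from = offsets.getOrDefault(consumer, 0);
            int to = Math.min(records.size(), from + maxRecords);
            List<String> batch = new ArrayList<>(records.subList(from, to));
            offsets.put(consumer, to);
            return batch;
        }

        // Moves the consumer's offset, e.g. back to 0 to replay everything.
        void seek(String consumer, int offset) {
            if (offset < 0 || offset > records.size()) {
                throw new IllegalArgumentException("offset out of range: " + offset);
            }
            offsets.put(consumer, offset);
        }
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        log.append("event-1");
        log.append("event-2");
        log.append("event-3");

        System.out.println("first poll:  " + log.poll("billing", 2));
        System.out.println("second poll: " + log.poll("billing", 2));

        log.seek("billing", 0); // reset the offset instead of asking the producer
        System.out.println("after reset: " + log.poll("billing", 10));
    }
}
```

The log keeps no per-message delivery state: the only bookkeeping is one number per consumer, which is what makes replaying cheap in this model.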
3.3. Spring and CDI Integration
JMS is a Java/Jakarta EE standard and, as such, is fully integrated into Java/Jakarta EE applications. So, connections to Active MQ and Artemis are easily managed by the application server. With Artemis, we can even use an embedded broker. For Kafka, managed connections are only available when using the Resource Adapter for JMS or Eclipse MicroProfile Reactive Messaging.
Spring has integration for JMS as well as AMQP, MQTT, and STOMP. Kafka is also supported. With Spring Boot, we can use embedded brokers for Active MQ, Artemis, and Kafka.
4. Use Cases of Active MQ/Artemis and Kafka
The following points indicate which product fits which situation best.
4.1. Use Cases of Active MQ/Artemis
- Process only a small number of messages per day
- High level of reliability and transactionality
- Data transformations on-the-fly, ETL jobs
4.2. Use Cases of Kafka
- Process a high load of data
- Real-time data processing
- Application activity tracking
- Logging and monitoring
- Message delivery without data transformation (transformation would be possible, but not easy)
- Message delivery without transport guarantees (guarantees would be possible, but not easy)
5. Conclusion
As we have seen, both Active MQ/Artemis and Kafka have their purposes and, therefore, their justifications. It’s important to know their differences in order to choose the right product for the right case. The following table summarizes these differences:
| Criteria | Active MQ Classic | Active MQ Artemis | Kafka |
|---|---|---|---|
| Use Cases | Traditional Messaging (reliable, transactional) | Traditional Messaging (reliable, transactional) | Distributed Event Streaming |
| P2P Messaging | Queues | Address with Routing Type ANYCAST | – |
| PubSub Messaging | Topics | Address with Routing Type MULTICAST | Topics |
| APIs / Protocols | JMS, AMQP, MQTT, STOMP, OpenWire | JMS, AMQP, MQTT, STOMP, OpenWire | Kafka Clients, Connectors for AMQP and MQTT, JMS Resource Adapter |
| Pull- vs. Push-based Messaging | Push-based | Push-based | Pull-based |
| Responsibility for Message Delivery | A producer has to ensure that the message is delivered | A producer has to ensure that the message is delivered | A consumer consumes the messages that it is supposed to consume |
| Transaction Support | JMS, XA | JMS, XA | |
| Scalability | | | highly scalable (partitions and replicas) |
| The more consumers… | … the slower the performance | … the slower the performance | … does not slow down |