Understanding the Kafka Async Producer

We’ve been talking recently about our use of Apache Kafka–a system that takes a different view of implementing messaging. Because we’re using Kafka in a high-volume area of our system, we’ve had to spend time understanding how to get messages through reliably at high rates. While we’ve spent time on both the producing and consuming side of the brokers, this post focuses on the producing side only.

At a high level Kafka offers two types of producers–synchronous and asynchronous. As a user of the producer API, you’re always interacting with the kafka.producer.Producer class, but it’s doing a fair bit for you under-the-covers depending on how you’ve configured it.

By default the Producer uses SyncProducers underneath. SyncProducer does what it sounds like–it sends the messages to the broker synchronously on the thread that calls it. In most high volume environments this obviously isn’t going to suffice. It’s too many network sends on too many small-ish messages.

By setting the producer.type flag to ‘async’, the Producer class uses AsyncProducers under the covers. AsyncProducer offers the capability not only to do the sends on a separate thread, but to batch multiple messages together into a single send. Both characteristics are generally desirable–isolating network I/O from the threads doing computation and reducing the number of network messages into a smaller number of larger sends. Introductory information on the producers can be found here.

An important note is that Kafka supports compressing messages transparently such that neither your producer or consumer code has to do anything special. When configured for compression (via the compression.codec producer config setting) Kafka automatically compresses the messages before sending them to the broker. Importantly, this compression happens on the same thread that the send occurs.

Because the compress and send are both relatively time consuming operations, it’s an obvious place you may want to parallelize. But how parallel is the Producer?

A producer configured for async mode creates a separate underlying AsyncProducer instance for each broker via its internal ProducerPool class. Each AsyncProducer has its own associated background thread (named ProducerSendThread) that does the work for the send. In other words, there’s one thread per broker so your resulting parallelism for compressing and sending messages is based on the number of brokers in the cluster.

Here’s a snippet of the threads in an application with a single broker:

AsyncProducer

And here’s what they look like under the same producer load with 3 brokers in the cluster:

AsyncProducer

In the first image you can see that the single thread is doing more work than when it’s spread out, as you would expect.

A common question is “Why am I seeing QueueFullExceptions from the producer”. This often means the ProducerSendThreads are saturated and unable to keep up with the rate of messages going into the Producer. An obvious approach to deal with this is adding brokers to the cluster and thus spreading the message sends across more threads, as well as lightening the load on each broker.

Adding brokers to the cluster takes more consideration than just how efficient the producers are. Clearly there are other approaches you could take such as building a different async model on top of the SyncProducer that’s not dependent on the number of brokers. But it’s useful to know what the options are, and in the end adding brokers may be useful for your architecture in other ways.

If the topology of your producer applications is such that there are many applications producing a relatively small number of messages then none of this may be of concern. But if you have a relatively small number of producer applications sending many large messages, understanding how to spread out that work can be of real importance.