Wednesday, June 13, 2012

Using ElasticSearch Bulk Mode with rsyslog

Rsyslog's omelasticsearch plugin now supports bulk mode. With bulk mode, message processing is much faster, especially if large loads are to be processed.

Bulk mode works with rsyslog's batching capabilities. So it probably is a good idea to refresh some of the batching concepts. The core idea is that while we would like to process many messages at once, we do NOT want to wait hold processing messages "just" because they are too few. So with batching, you set an upper limit on the batch size (number of messages inside a batch). Let's say the batch size is set to 32. When a new batch is to be processed, the queue worker tries to pull 32 messages off the queue. If there are 32 or more present, this is nice and all 32 are taken from the queue. But now let's assume there are only 10 messages at all present inside the queue. In that case, the queue worker does not try to guess when the next 22 messages will arrive and wait for that (if the time is short enough). Instead, it just pulls the 10 already-present messages off the queue and these form the batch. When new messages arrive, they will be part of the next batch.

Now let's look at the startup of a busy system. Lot's of messages come in. Let's assume they are submitted one-by-one (many sources submit multiple messages, but let's focus on those cases that do not). If so, the first message is submitted and the queue worker is activated. Assuming this happens immediately and before any other message is submitted (actually unlikely!), it will initially create a batch of exactly one message and process that. In the mean time, more messages arrive and the queue fills. So when the first batch is completed, there are ample messages inside the queue. As such, the queue worker will pull the next set of 32 messages off the queue and form a new batch out of them. This continues as long as there are sufficient messages. Note that in practice the first batch will usually be larger than one and often be the max batch size, thanks to various inner workings I would not like to elaborate on in detail in this article. Large batch sizes with more than 1024 messages are not bad at all and may even help improve performance. When looking at a system performing with such large batches, you will likely see that partial batches are being created, simply for the reason that the queue does not contain more messages. This is not an indicator for a problem but shows that everything works perfectly!


The max batch size can be configured via

$ActionQueueDequeueBatchSize
and
$MainMsgQueueDequeueBatchSize

Note that the default sizes are very conservative (read: low), so you probably want to adjust them to some higher value. The best value depends on your workload, but 256 is probably a good starting point. If the action queue runs asynchronously (e.g. linkedlist mode, everything non-direct), the action queue batch size specifies the upper limit for the elasticsearch bulk submission.

To activate bulk mode, use

*.*     action(type="omelasticsearch"
           ... other params ...
           bulkmode="on")

The default is the more conservative "off". Note that the action can of course be used with any type of filter, not just the catch-all "*.*". This is only used as a sample.

No comments:

Busy at the moment...

Some might have noticed that I am not as active as usual on the rsyslog project . As this seems to turn out to keep at least for the upcomi...