next round of performance enhancement in rsyslog

Today, I made a very important change to rsyslog: rulesets now can have their own “main” queue. This doesn’t sound too exciting, but can offer dramatic performance improvements.

When rsyslog was initially created, it followed the idea that messages must be processed in the order they were received. To facilitate that, all inputs submitted message to a single main message queue, off from which the processing took place. So messages stayed in reception order. … Well, actually they stayed in “enqueued order”, because it depended on the OS scheduler if input modules could really enqueue in the order they received. If, for example, input A received two messages, but was preempted by module B’s message reception, B’s data could hit the queue earlier than A’s. As rsyslog supported more and more concurrency, the order of messages did become ever less important. The real cure for ordered delivery is to look at high-precision timestamps and build the sort order based on them (in the external log analyzer/viewer).

So, in essence, reception order never has worked well and the requirement to try keep it has long been dropped. That also removed one important reason for the single main message queue. Still, it is convenient to have a single queue, as its parameters can be set once and for all.

But a single queue limits concurrency. In the parallel processing world, we try to partition the input data as much as possible so that the processing elements can independently work on the data partitions. All data received by a single input is a natural data partition. But the single main queue merged all these partitions again, and caused performance bottlenecks via lock contention. “Lock contention”, in simple words, means that threads needed to wait for exclusive access to the queue.

This has now been solved. Today, I created the ability to create ruleset-specific queues. In rsyslog, the user can decide which ruleset is bound to which inputs. For a highly parallel setup, each input should have its own ruleset and each ruleset should have defined its own “main” queue. In that setting, inputs do no longer block each other during queue access. On a busy system with many inputs, the results can be dramatic. And as more as a side-effect, each ruleset is now processed by its dedicated rule processing thread, totally independent from each other.

This design offers a lot of flexibility. But that is not enough. The next step I plan to do is to create the ability to submit a message to a different ruleset during processing. That way, hierarchies of rulesets can be created, and these rulesets can even be executed via separate thread pools, with different queue parameters and in full concurrency. And the best is that I currently think it will not be very hard to create the missing glue.

The only really bad thing is that the current configuration language is really not well-suited to handle that complexity (“really not” is not a type for “not really”…). But I have no alternative than to take this route again, until I finally find time to create a new config language. The only good thing is that I get better and better understanding of what this new language must be able to do, and it looks that my initial thoughts were not up to what now is required…