Friday, October 25, 2013

A Proposal for Rsyslog State Variables

As was discussed at great length on the rsyslog mailing list in October 2013, global variables as implemented in 7.5.4 and 7.5.5 were not a valid solution and have been removed from rsyslog. I do not want to summarize all of this in this post, so for the details please read the mailing list archive.

Bottom line: we need some new facility to handle global state, and this will be done via state variables. This posting contains a proposal which is meant as a basis for discussion. At the time of this writing, nothing has been finalized and the ultimate solution may be totally different. I keep it as a blog posting so that, if necessary and useful, I can update the spec towards the final solution.

Base Assumptions
There are a couple of things that we assume for this design:
  • writing state variables will be a very infrequent operation during config execution
  • the total number of state variables inside a ruleset is very low
  • reading occurs more often, but still is not a high number (except for read-only ones)
State variables look like the previous global variables (e.g. "$/var" or "$/p!var"), but have different semantics based on the special restrictions given below.

Restrictions provide the correctness predicate under which state vars are to be evaluated. Due to the way the rsyslog engine works (SIMD-like execution, multiple threads), we need to place such restrictions in order to gain good performance and to avoid a prohibitive amount of development work. Note that we could remove some of these restrictions, but that would require a lot of development effort and considerable sponsorship.

  1. state variables are read-only after they have been accessed
    This restriction applies on a source statement level. For example,
    set $/v = $/v + 1;
    is considered as one access, because both the read and the write are done in the same source statement (but it is not atomic, see restriction #3). However,
    if $/v == 10 then
        set $/v = 0;
    is not considered as one access, because there are two statements involved (the if-condition checker and the assignment).
    This rule requires some understanding of the underlying RainerScript grammar and as such may be a bit hard to interpret. As a general rule of thumb, one can say that using a state variable on the left-hand side of a set statement is the only case that can safely be considered "one access".
  2. state variables do not support subtree access
    This means you can only access individual variables, not full trees. For example, accessing $/p!var is possible, but accessing $/p will lead to a runtime error.
  3. state variables are racy between different threads unless special update functions are used
    State variable operations are not atomic by default. Do not mistake the one-time source-level access for atomicity:
    set $/v = $/v + 1;
    is one access, but it is NOT done atomically. For example, if two threads are running and $/v has the value 1 before this statement, the outcome after executing both threads can be either 2 or 3. If this is unacceptable, special functions which guarantee atomic updates are required (those are spec'ed under "new functions" below).
  4. read-only state variables can only be set once during rsyslog lifetime
    They exist as an optimization to support the common use case of setting some config values via state vars (like server or email addresses).
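To make restriction #3 more tangible, here is a minimal Python sketch that enumerates every interleaving of two threads executing `set $/v = $/v + 1;`. It assumes that each thread performs exactly one read step and one write step; the names `run` and `possible_outcomes` are made up for this illustration and are not part of rsyslog.

```python
from itertools import permutations


def run(schedule, initial=1):
    """Execute one interleaving of two threads doing `v = v + 1`.

    `schedule` is a sequence of thread ids. The first occurrence of an id
    is that thread's read step, the second occurrence its write step.
    """
    shared = initial
    local = {}
    for tid in schedule:
        if tid not in local:
            local[tid] = shared          # read the shared value
        else:
            shared = local[tid] + 1      # write back the incremented copy
    return shared


def possible_outcomes(initial=1):
    """Collect the final values over all possible interleavings."""
    schedules = set(permutations(["A", "A", "B", "B"]))
    return {run(s, initial) for s in schedules}


print(sorted(possible_outcomes()))
```

Starting from 1, the sketch yields both 2 and 3 as possible results: 3 when the threads happen to run one after the other, 2 when both read before either one writes.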

Basic Implementation Idea
State variables are implemented by both a global state as well as message-local shadow variables. The global state holds all state variables, whereas the shadow variables contain only those variables that were already accessed for the message in question. Shadow variables, if they exist, always take precedence in variable access and are automatically created on first access.

As a fine detail, note that we do not implement state vars via the "usual" JSON method, as this would require more complex code and might cause performance problems. This is the reason for the "no-subtree" restriction.

Data Structure
State vars will be held in a hash table. They need some decoration that is not required for the other vars. Roughly, a state var will be described by
  • name
  • value
  • attributes
    • read-only
    • modifiable
    • updated (temporary work flag, may be solved differently)
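As a rough illustration of the description above, the following Python sketch models a state var as a small record and the hash table as a plain dict. The class name `StateVar`, the helper `add_var` and the example values are assumptions made for this sketch only; the actual implementation may look quite different.

```python
from dataclasses import dataclass


@dataclass
class StateVar:
    name: str
    value: object
    read_only: bool = False    # attribute: may never be changed again
    modifiable: bool = False   # attribute: may be written in the current statement
    updated: bool = False      # temporary work flag


# The hash table, keyed by variable name (a dict in this sketch).
global_pool = {}


def add_var(pool, name, value, read_only=False):
    """Store a new state var in the pool and return it."""
    pool[name] = StateVar(name, value, read_only=read_only)
    return pool[name]


# Hypothetical example values.
add_var(global_pool, "$/server", "logs.example.net", read_only=True)
add_var(global_pool, "$/p!count", 0)
```

Note that the keys are full variable names such as "$/p!count". This matches the no-subtree restriction: there is no entry for "$/p" itself.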

This is pseudo-code for the rule engine. The engine executes statement by statement.

for each statement:
    for each state var read:
        if is read-only:
            return var from global pool
        if is in shadow:
            return var from shadow pool
        read from global pool
        add to shadow pool, flag as modifiable
        return value
    for each state var written:
        if is read-only or already modified:
            emit error message, done
        if is not in shadow:
            read from global pool
            add to shadow pool, flag as modifiable
        if not modifiable:
            emit error message, done
        modify shadow value, flag as modified

at end of each statement:
    scan shadow var pool for updated ones:
        propagate update to global var space
        reset modified flag, set to non-modifiable
    for all shadow vars:
        reset modifiable flag

Note: the "scanning" can probably be done much more easily if we assume that the var can only be updated once per statement. But this may not be the case due to new (atomic) functions, so I just kept the generic idea, which may be refined in the actual implementation (probably a single pointer access may be sufficient to do the "scan").

The lifetime of the shadow variable pool equals the message lifetime. It is destructed only when the message is destructed.
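The rule engine logic can be tried out with a small Python model. This is only a sketch under assumptions: the names `MessageContext`, `ShadowVar` and `StateVarError` are invented here, errors are modeled as exceptions, and a variable that is missing from the global pool is simply treated as having no value.

```python
class StateVarError(Exception):
    """Stands in for 'emit error message, done'."""


class ShadowVar:
    def __init__(self, value):
        self.value = value
        self.modifiable = True   # set on first access within a statement
        self.modified = False    # temporary work flag


class MessageContext:
    """Shadow pool for one message; it lives as long as the message."""

    def __init__(self, global_pool, read_only):
        self.global_pool = global_pool  # dict shared by all messages
        self.read_only = read_only      # set of read-only variable names
        self.shadow = {}

    def _shadow(self, name):
        # First access: copy from the global pool, flag as modifiable.
        if name not in self.shadow:
            self.shadow[name] = ShadowVar(self.global_pool.get(name))
        return self.shadow[name]

    def read(self, name):
        if name in self.read_only:
            return self.global_pool[name]
        return self._shadow(name).value

    def write(self, name, value):
        if name in self.read_only:
            raise StateVarError(name + " is read-only")
        var = self._shadow(name)
        if var.modified or not var.modifiable:
            raise StateVarError(name + " can no longer be modified")
        var.value = value
        var.modified = True

    def end_statement(self):
        for name, var in self.shadow.items():
            if var.modified:
                self.global_pool[name] = var.value  # propagate update
                var.modified = False
            var.modifiable = False


pool = {"$/v": 10}

# set $/v = $/v + 1;   (read and write within one statement)
msg1 = MessageContext(pool, read_only=set())
msg1.write("$/v", msg1.read("$/v") + 1)
msg1.end_statement()

# if $/v == 11 then set $/v = 0;   (two statements, so the write fails)
msg2 = MessageContext(pool, read_only=set())
condition = msg2.read("$/v") == 11
msg2.end_statement()
try:
    msg2.write("$/v", 0)
    rejected = False
except StateVarError:
    rejected = True
```

The first message updates the global value to 11. The second message reads the variable in the if-condition, which makes it read-only for the rest of that message, so the following set statement is rejected as described in restriction #1.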

New Statements
In order to provide some optimizations, new statements are useful. They are not strictly required and can be left out of an initial implementation.
  • setonce $/v = value;
    sets read-only variable; fails if var already exists.
  • eval expr;
    evaluates an expression without the need to assign the result value. This is useful to make atomic functions (see "New Functions") more performant (as they don't necessarily need to store their result).
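The intended behaviour of setonce can be sketched in a few lines of Python. The function name `setonce`, the exception `SetOnceError` and the example values are assumptions for this illustration; the eval statement is not modeled here.

```python
class SetOnceError(Exception):
    """Raised when the variable already exists."""


def setonce(pool, read_only, name, value):
    """Create `name` as a read-only variable; fail if it already exists."""
    if name in pool:
        raise SetOnceError(name + " already exists")
    pool[name] = value
    read_only.add(name)


pool = {}
read_only = set()

# setonce $/server = "logs.example.net";   (hypothetical value)
setonce(pool, read_only, "$/server", "logs.example.net")

# A second setonce on the same variable must fail.
try:
    setonce(pool, read_only, "$/server", "other.example.net")
    second_attempt_failed = False
except SetOnceError:
    second_attempt_failed = True
```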

New Functions
Again, these offer desirable new functionality, but can be left out without breaking the rest of the system. But they are necessary to avoid cross-thread races.
  • atomic_add(var, value)
    atomically adds "value" to "var". Works with state variables only (as it makes no sense for any others).
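One possible way to get the atomic behaviour is to guard the global pool with a lock, as in the Python sketch below. This is an assumption about a possible implementation, not a description of what rsyslog does; the class `GlobalState` and the choice to return the new value are made up for this example.

```python
import threading


class GlobalState:
    """Global state var pool guarded by a single lock."""

    def __init__(self):
        self._vars = {}
        self._lock = threading.Lock()

    def set(self, name, value):
        with self._lock:
            self._vars[name] = value

    def get(self, name):
        with self._lock:
            return self._vars[name]

    def atomic_add(self, name, value):
        # Read, add and write happen under one lock, so no update is lost.
        with self._lock:
            new_value = self._vars[name] + value
            self._vars[name] = new_value
            return new_value


def worker(state, name, count):
    for _ in range(count):
        state.atomic_add(name, 1)   # result is not needed, like `eval`


def run_demo(num_threads=4, count=1000):
    state = GlobalState()
    state.set("$/v", 0)
    threads = [
        threading.Thread(target=worker, args=(state, "$/v", count))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state.get("$/v")


total = run_demo()
```

With four threads adding 1 a thousand times each, the final value is always 4000, because each read-add-write sequence completes under the lock before the next one starts.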

This concludes my current state of thinking. Comments, either here or on the rsyslog mailing list, are appreciated. This posting will be updated as need arises and no history kept (except where vitally important).

Special thanks to Pavel Levshin and David Lang for their contributions to this work!

Friday, October 11, 2013

rsyslog's imudp now multithreaded

Rsyslog is heavily threaded to fully utilize modern multi-core processors. However, the imudp module has so far worked on a single thread. We always considered this appropriate and no problem, because the module basically pulls data off the OS receive buffers and injects them into rsyslog's internal queues. However, some folks expressed the desire to have multiple receiver threads and there were also some reports that imudp ran close to 100% cpu in some installations.

So starting with 7.5.5, imudp itself supports multiple receiver threads. The default is to use a single thread as usual, but via the "threads" module parameter, up to 32 receiver threads can be configured. We introduced this limit to prevent naive users from totally overrunning their system capability - spawning a myriad of threads usually is quite counter-productive (especially when they outnumber the available processor cores). For the same reason, I would strongly suggest that the number of threads is only increased if there is some evidence for this to be useful -- which usually means the imudp thread requires considerable CPU time. In order to aid the decision, I have also added new rsyslog statistics counters which permit monitoring of the worker thread activity.

We will now evaluate practical feedback from the new feature. One of the goals of this new enhancement is to limit the risk of UDP message loss due to buffer overrun, which we hope we have improved even without the need to select realtime priority.

Please note that 7.5.5 is at the time of this writing not yet released, so for the next couple of days the new feature is only available via building from the git master branch.

Friday, October 04, 2013

New Queue Defaults in rsyslog 7.5

As regular readers of my blog know, we are moving towards preferring enterprise needs over low-end system needs in rsyslog. This is part of the changes in the logging world induced by the systemd journal (the full story can be found here).

Many of the main queue and ruleset queue default parameters were a compromise, and much more in favor of low-end systems than enterprises. Most importantly, the queue sizes were very small, in an attempt to save virtual memory space. With the 7.5.4 release, this will change. While the default size was 10,000 msgs so far, it has been increased 10-fold to 100,000. The main reason is that the inputs nowadays batch together quite some messages, which gives us very good performance on busy systems. It is not uncommon that e.g. the tcp input submits 2,000 messages at once. With the previous defaults, that meant the main queue could hold 5 such submissions. Now we have much more head room.

Note that some other changes were made alongside. The dequeue batch size has been increased to 256 from the previous value of 32. The max number of worker threads has been increased to two, removing our previous conservative setting of one. At the same time, we now require at least 40,000 messages to be inside the queue before the second worker is activated. So this will only happen on very busy systems. Note that the previous value of 100 messages was really an artifact of long gone-away times and usually meant immediate activation of the maximum number of workers, which was quite contrary to the intention of that parameter.
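The effect of the new numbers can be checked with a bit of arithmetic. The Python sketch below uses a deliberately simplified model: it assumes that one additional worker becomes active for each full multiple of the minimum-messages threshold, capped at the configured maximum. This formula and the function names are assumptions for illustration and are not taken from the rsyslog source.

```python
def submissions_that_fit(queue_size, batch_size=2_000):
    """How many input submissions of `batch_size` messages fit into the queue."""
    return queue_size // batch_size


def active_workers(queued, max_workers, min_msgs_per_worker):
    """Simplified model of how many workers are active for a given queue fill."""
    wanted = queued // min_msgs_per_worker + 1
    return min(max_workers, wanted)


# Head room with the old and the new default queue size.
old_head_room = submissions_that_fit(10_000)     # 5 submissions
new_head_room = submissions_that_fit(100_000)    # 50 submissions

# New defaults: the second worker starts only at 40,000 queued messages.
below_threshold = active_workers(39_999, max_workers=2, min_msgs_per_worker=40_000)
at_threshold = active_workers(40_000, max_workers=2, min_msgs_per_worker=40_000)

# Old threshold of 100 with a hypothetical explicit setting of 4 workers:
# a few hundred queued messages already activate all of them.
old_style = active_workers(300, max_workers=4, min_msgs_per_worker=100)
```

Under this model the old threshold of 100 activates all four workers of the hypothetical configuration at just 300 queued messages, whereas the new threshold keeps a single worker until the queue holds 40,000 messages.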

Of course, these are just changed defaults. They can always be overridden by explicit settings. For those configs that already did this, nothing changes at all.