Wednesday, June 09, 2010

what are actions and action instance data?

On the rsyslog mailing list, the question of what actions are, and in which way they are kept single-threaded from the POV of the output module, came up again. I will try to summarize the most important points and terms here.

David Lang gave the following example configuration:

*.* file1
*.* file2
*.* @ip1
*.* @ip2
*.* @@ip3
*.* @@ip4

and asked how many different actions/entities that were. Here is my answer:

An *action* is a specific instance of some desired output. The actual processing carried out is NOT termed "action", even though one could easily do so. I have to admit I have not defined any term for that, so let's call it processing. That actual processing is carried out by the output module (and the really bad thing is that the entry point is named "doAction", which somewhat implies that the output module is the action, which is not the case).

Each action can use the service of exactly one output module. Each output module can provide services to many actions. So we have a N:1 relationship between actions and output modules.

In the above sample, 3 output modules are involved, and each output module is used by two actions. We have 6 actions, and so we have 6 action locks.

So the output module interface does not serialize access to the output module, but rather to the action instance. All action-specific data is kept in a separate, per-action data structure and passed into the output module at the time the doAction call is made. The output module can modify all of this instance data as if it were running on a single thread. HOWEVER, any global data items (in short: everything not inside the action instance data) are *not* synchronized by the rsyslog core. The output module must itself take care of synchronization if it wants concurrent access to such data items. All current output modules do NOT access global data other than for config parsing (which is serial and single-threaded by nature).

Note that the consistency of the action instance data is guarded by the rsyslog core by actually running the output module processing on a single thread *for that action*. But the output module code itself may be called concurrently if more than one action uses the same output module. That is a typical case. If so, each of the concurrently running instances receives its private instance data pointer but shares everything else.
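
To make the locking model more concrete, here is a minimal sketch in C. It is NOT the actual rsyslog code: all names (instance_data_t, action_t, action_submit, om_do_action, run_demo) are made up for illustration, and the real output module interface looks different. The sketch assumes POSIX threads. It shows two actions that share the code of one output module. The "core" takes the per-action lock before calling into the module, so the instance data can be modified without further locking, while a module-global counter has to be guarded by a mutex that the module owns itself.

```c
#include <pthread.h>
#include <stddef.h>

#define WORKERS          4
#define MSGS_PER_WORKER  10000

/* Hypothetical per-action instance data; not the real rsyslog structure. */
typedef struct {
    const char   *target;  /* e.g. "file1" or "file2" */
    unsigned long count;   /* messages processed by this action */
} instance_data_t;

/* Hypothetical module entry point, playing the role of doAction. */
typedef void (*do_action_fn)(instance_data_t *data, const char *msg);

/* One action: its own lock, a pointer to module code, private data. */
typedef struct {
    pthread_mutex_t lock;       /* one lock per action, not per module */
    do_action_fn    do_action;  /* module code, shared by many actions */
    instance_data_t data;       /* private to this action */
} action_t;

/* Module-global state: the core does NOT protect this, so the module
 * brings its own mutex. */
static pthread_mutex_t om_global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long   om_total;

static void om_do_action(instance_data_t *data, const char *msg)
{
    (void)msg;                  /* a real module would write msg out */
    data->count++;              /* safe: caller holds this action's lock */
    pthread_mutex_lock(&om_global_lock);
    om_total++;                 /* shared across actions: needs own lock */
    pthread_mutex_unlock(&om_global_lock);
}

/* What the core does in this sketch: serialize per action, then call. */
static void action_submit(action_t *a, const char *msg)
{
    pthread_mutex_lock(&a->lock);
    a->do_action(&a->data, msg);
    pthread_mutex_unlock(&a->lock);
}

static void *worker(void *arg)
{
    action_t *a = arg;
    for (int i = 0; i < MSGS_PER_WORKER; i++)
        action_submit(a, "test message");
    return NULL;
}

/* Runs WORKERS threads against two actions that share one module.
 * Returns 0 on success, -1 if a thread could not be created. */
static int run_demo(unsigned long counts[2], unsigned long *total)
{
    action_t  acts[2] = { { .do_action = om_do_action, .data = { "file1", 0 } },
                          { .do_action = om_do_action, .data = { "file2", 0 } } };
    pthread_t tid[WORKERS];
    int       started = 0, rc = 0;

    om_total = 0;
    for (int i = 0; i < 2; i++)
        pthread_mutex_init(&acts[i].lock, NULL);
    for (; started < WORKERS; started++) {
        if (pthread_create(&tid[started], NULL, worker, &acts[started % 2]) != 0) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < 2; i++) {
        counts[i] = acts[i].data.count;
        pthread_mutex_destroy(&acts[i].lock);
    }
    *total = om_total;
    return rc;
}
```

Calling run_demo() from a small main() (linked with -pthread) starts two workers per action. Workers on different actions may be inside om_do_action at the same time, which is exactly the case where the module's own mutex around om_total matters. The per-action counters need no extra lock because only one thread at a time gets past that action's lock.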
