Friday, January 30, 2009

Wanna play? No, says the DRM!

Do you like DRM? Isn't it a perfect way to make sure you are properly licensed for all your music, movies and, of course, software? Well, folks like the EFF have strongly opposed DRM right from the beginning. One of their arguments has always been that DRM, if thought through to the end, revokes the user's ability to do with his machine whatever he wants.

Now we see a perfect example. Grave Rose just posted a nice link on twitter: "Gears of War DRM screwup makes PC version unplayable". It's all about a DRM certificate that seems to have expired, with the end result that the game no longer works. Thankfully, we do not (yet) have the full trusted computing platform in place, so you can still change your PC. This enabled users to set back their system clocks, and the game worked again. rofl...

Granted, this is not a real DRM issue. Such an expiration date could always be encoded in software. With a good debugger, it is not too hard to remove it (of course, that's not legal, and with DRM it is considerably more work to do...). But if we are forced to use more and more DRM, if we are forced to use hardware platforms that deny true admin access to their owners, and if we have legislation that outlaws helping yourself - won't those issues become the norm?

For most of the time, you could rest assured that once you had installed something, and did not change it, it was likely to run for eternity (well... somewhat). This no longer seems to hold true. The only true solution is to use as much open source as possible and say no to any DRM-enabled products.

As an interesting side-note, I am not sure if the poor gamers who set back their system clocks are in legal trouble: didn't they try to circumvent a technical copy protection? Not sure about the DMCA, but in Germany you could argue that this is an illegal attack... Happy gaming!

rsyslog: optimizing exception handling

The recent analysis of rsyslog's race condition has fueled some related and some not-so-related discussions. Among them is an old-time favorite, that is performance enhancement. I have finally taken the time to write about rsyslog's "exception handling" and what I do not like about it.

I am reproducing a forum post here, in the hopes that it will be easier to find - and attract more attention - if it is available via the blog. Comments I would appreciate via the forum, so that I can keep track of them in a single location. With that said, here we go:

In rsyslog, a kind of exception handling is done via the "iRet" mechanism. In short, there exists an integer data type that conveys a universal return code. This code ranges from "all OK" through "all OK, but this and that information" and "we had a warning" to "something went wrong". States are encoded as integer numbers. By calling convention, almost all functions return such an iRet value (named after its variable name). More importantly, every caller checks the outcome and employs a kind of exception handling when something unexpected happens (like doing resource cleanup). As an aid to the developer, most of the inner workings are encapsulated in easy-to-use macros.

For example, the return code checking is done via the CHKiRet(f(x)) macro, which expands to something like
if((iRet = f(x)) != RS_RET_OK)
goto abort_finalize;
As such, the innocent-looking (and frequently found) sequence
CHKiRet(f(x));
CHKiRet(g(x));
CHKiRet(h(x));
results in lots of conditional branches. Such code places a big burden on a CPU's speculative execution resources. For example, it may need a lot of space in the branch pattern table, ejecting other, potentially useful entries from the cache. Given the fact that the quality of speculative execution affects execution speed considerably on modern CPUs, pressing the speculative system to its max is probably not a wise idea.
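To illustrate, here is a minimal sketch of how such an iRet/CHKiRet scheme can look in C. The names follow the post, but the real rsyslog macros differ in detail; this is just a compilable approximation of the pattern.

```c
/* Sketch only: rsyslog's actual return-code enum is much larger. */
typedef enum { RS_RET_OK = 0, RS_RET_ERR = -1 } rsRetVal;

/* On failure, jump to the function's common cleanup label. */
#define CHKiRet(code) \
    do { if((iRet = (code)) != RS_RET_OK) goto finalize_it; } while(0)

static rsRetVal f(int x) { return x >= 0 ? RS_RET_OK : RS_RET_ERR; }
static rsRetVal g(int x) { (void)x; return RS_RET_OK; }

rsRetVal doWork(int x)
{
    rsRetVal iRet = RS_RET_OK;
    CHKiRet(f(x));   /* each call adds a conditional branch */
    CHKiRet(g(x));
finalize_it:
    /* resource cleanup would go here */
    return iRet;
}
```

Each CHKiRet expands to exactly the if/goto pair shown above, which is where the chain of conditional branches comes from.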

One performance enhancement approach is to find ways that enable the code to be executed in larger linear blocks. The most important observation is that in almost all cases, the if() condition is never true, because typically the outcome of the function called is an OK state.
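One way to exploit that observation, assuming GCC is the compiler, is a branch-prediction hint such as __builtin_expect, which tells the compiler the error branch is almost never taken so it can lay out the common path as a linear fall-through block. A sketch:

```c
/* Assumption: GCC. __builtin_expect hints the likely branch outcome. */
typedef int rsRetVal;
#define RS_RET_OK 0
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Variant of CHKiRet with the error path marked as unlikely. */
#define CHKiRet(code) \
    do { if(unlikely((iRet = (code)) != RS_RET_OK)) goto finalize_it; } while(0)

static rsRetVal mightFail(int x) { return x; }  /* 0 means OK */

rsRetVal run(int x)
{
    rsRetVal iRet = RS_RET_OK;
    CHKiRet(mightFail(x));
finalize_it:
    return iRet;
}
```

This does not remove the branches, but it can reduce the pressure on the branch predictor for the hot path.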

I thought about using longjmp to provide the necessary functionality, but the setup effort for setjmp/longjmp, on a *quick* look, seems to be too high, especially given the number of small functions that are present in rsyslog (and inlining does not help with this issue). The answer is probably to look at how the C++ exception mechanism is implemented and build a solution similar to that (just like many of the object callbacks are inspired by the C++ method call tables).
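For illustration only, a setjmp/longjmp-based "exception" could be sketched as below. This is not rsyslog code; it just shows the mechanism and where the per-setjmp setup cost sits.

```c
#include <setjmp.h>

/* Hypothetical sketch: one setjmp at the top of a processing routine,
 * longjmp from deep inside on error. */
static jmp_buf errEnv;

static void deepFunction(int fail)
{
    if(fail)
        longjmp(errEnv, 42);   /* "throw": unwinds straight to setjmp */
}

int process(int fail)
{
    int err;
    if((err = setjmp(errEnv)) != 0)  /* setjmp is the setup cost paid on entry */
        return err;                  /* "catch": we arrive here from longjmp */
    deepFunction(fail);
    return 0;
}
```

Note the contrast to C++ "zero-cost" exceptions, which move the bookkeeping out of the non-throwing path via unwind tables - which is exactly the direction hinted at above.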

I have not yet begun to dig seriously into this optimization, as there are plenty of other things that can be improved and that promise to have much more effect (like the reduction of the overall number of system calls needed on a per message basis).

However, I would appreciate feedback on this issue. Please post to the forum thread, so that I have the information at hand when I finally can turn to optimizing that code area.

Thursday, January 29, 2009

Some more on the rsyslog data race...

After I had written my writeup on the rsyslog data race, I had another very useful discussion with Lorenzo Catucci. We touched on some points that are not yet included in the write-up. I also explained some rsyslog internals that you will probably find in no other source yet. In short: I think this discussion is useful for anyone interested in more details. I am not sure when (if) I will be able to include it in the original post. So rather than lose it, I asked Lorenzo if he agreed to making it public, and thankfully he did.

So here it is - very authentic and totally unedited, with all the typos and the occasional non sequiturs of a chat discussion, just as we all like it. Have fun! Comments are appreciated.

Rainer:
writeup finally finished: http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html ;)
catucci: very nice wrap-up.
Rainer: the really good thing is that it actually sound we found the bug :)
now lets hope for the best...
catucci: I think you should emphasise a little bit more the need for coherent handling of condition variables or the like: now I understand in what sense you talked about an existing memory barrier this morning; the problem with that "barrier" was that it wasn't enabled on both sides... something like a water valve...
Rainer: good point, will see that I edit it in
but with the barrier, I actually meant the real thing: http://en.wikipedia.org/wiki/Memory_barrier
it forces the cpu to in-order memory operations
I thought it is sufficient.. but obviously not
I didn't have the exact failure scenario on my mind this morning
that evolved during the analysis
and was clarified by the write-up
in fact I started a new lab in the midst of the writeup
which again proves the point that it pays to write things down
catucci: Yes, but what I was thinking was the constructor side missing the ordering.
Rainer: it's even more complicated
I didn't explain that all
rsyslog shall run great on uniprocessor
catucci: will wait for the next version of the write-up!!! .-)
Rainer: under some constraints, it is possible to run without any sync at all
catucci: :-)
Rainer: (I wont add this, it is too much in depth)
so the message mutex is enabled depending on whether it is needed or not
similarly, when the message is constructed, it is done in a single thread
the mutex is disabled while doing so
only after it is constructed, the object is finalized, which then (and only then) enables the mutex
so it is ok that part of the objhects lifetime it is not synchronized
maybe never, depending on the config
that gains quite some performance
even on MP, because of the construction phase
of course, it doesn't ease troubleshooting...
catucci: Yes, as usual, I've been too fast in both thinking and writing... what I was talking about was the enqueueing and not the real building.
Rainer: yeah, that's right
the message is first finaliuzed, then it is enqueued
from then on, the sequence is always the same (there is only a single call to do so)
well... is it.. good point
will check reality vs. theory tomorrow ;)
that's why I love discussion :)
catucci: good, see you.
Rainer: cu - have a great evening!
Sent at 6:49 PM on Wednesday
rgerhards: good morning - what's your lab doing? ;)
Sent at 9:47 AM on Thursday
catucci: still running well.
remember: with -d option active.
rgerhards: sounds good. mine was running well, too
yeah... verification will be a long road
at some point we should switch back to production code
catucci: Just another small note on your weblog entry:
rgerhards: be happy to hear it
catucci: you some to have missed a very important ingredient of my mix: a very high load on the system; sar -q gives me these figures:
rgerhards: oh yeah
I missed to write that down
should add it
catucci: 07:15:01 runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
09:45:01 0 216 0.70 0.59 0.49
rgerhards: all in all, now that I (think) I know the failure scenario
I know that it is only probable under high load
catucci: this is importanto, since with big plists, it's much more probable for the threads to be preempted by the scheduler, and to jump to other packages
rgerhards: and even then it is so unlikely to happen that you need an awful lot of iterations, what also requires high load
indeed
catucci: you know, I'm a bit old school, and I'm used to talk about packages for multi core...
rgerhards: so the ingredient is to have a fast machine, many cores, run them utilized, but in a way that as few as possible preemption of the rsyslog threads happen
do you concur to this POV?
Sent at 9:54 AM on Thursday
catucci: Almost completely: an high contect switch rate is needed even between the syslog threads IMHO: if the threads do just sleep undisturbed on waits/pools/selects, the probability of the race happening just go infinitesimal. to get to the race you (almost) need a forced context switch (I think)
rgerhards: now I agree mostly... I think the context switch itself is not necessary. But it is necessary that access to the reference counter happens concurrently.
catucci: I no loger own the elder two Pentium-III system that has been my workstation for the last ten years, but I'd bet you could just as well see the races there; maybe in a shorter time than on my mx-01
rgerhards: from my POV, this can happen either due to context switches OR due to two threads being executed on two different execution units right at the same time. Actually, I think the later easier leads to the failure scenario.
catucci: yes, that's very true; the high context switch rate does more to probabilistically spread the syslog run between different packages
rgerhards: I think the key is to understand the scenario. I completely agree that now that we have details, we can craft lab environments that will reproduce the issue on not-so-powerful hardware. One needs to know the ingredients.
catucci: remember that multi-core packages usually have a shared cache between the cores in the same package
rgerhards: For example, I did one lab where I increased the concurrency of rsyslog actions. That lead to more context switches, more mutex calles and lead to no repro at all.
This was overdone. If I reduced the concurrency, I was again able to reproduce. For every machine, there seems to be a mix that probably triggerst the failure scenario. I think this was one of the reasons why it was so hard to find. Not knowing the root cause, the bug appeared very random and was extremely hard to track down to a config/environment.
BTW: do you mind if I post a copy of this conversation on the blog? I will probably not be able to update the analysis today, but our conversation should be very helpful to everyone interested.
catucci: In that lab the queuer and the dequeuer did share the same mutex opinion? The key ingredient seems to be using a two different syncronization primitives on the two sides
rgerhards: yes and now - in action execution there are two mutex types involved, each with multiple instances
catucci: No problem at all for publishing
rgerhards: publishing: thx
there is one mutex that guards each individual action. No action (output module) may be called concurrently - that's part of the module spec (for good reason with many modules)
so there is one mutex for each action and it is aquired before an action is entered and released when it is done.
The other mutex class guards each individual message. that is the one described in the analysis.
if two different action invocations process a message, they may concurrently try to modify the message (that depends on the actual config, most importantly on the templates used and order of action execution)
So any such modifcation - AFTER the message has been finalized (think yesterday evening) is guarded by the message mutex.
catucci: right; I was thinking about the "unlocked" case in MsgAddRef's# ifdef xHAVE_ATOMIC_BUILTINS case
rgerhards: if you have n actions, a single message may therotically in use n times
(side-note: if n is 1, it can not be used multiple times, and this is - I think - one of the cases where the message mutex is "turned off")
I needed to set stage ;)
with that design, all actions use exactly the same sequence to increment and decrement the reference counter.
so this will always be used in a consistent manner. That part of the logic was absolutely right. BUT.... as I wrote, the msgAddRef used atomics, while msgDestruct used the message mutex.
catucci: Right: this is the problem (IMHO) the mutex and the atomics where not coherent.
rgerhards: so while the sequence of calls to msgAddRef and msgDestruct was right and consistent, the operations carried out by these calls were inconsistent
yes, exactly
catucci: there should be only one kind of sync primitive in use, be it atomics or mutexes
rgerhards: thus there are two fixes: either use the mutex exlusively (that's running in your lab) or use the atomics exlusively (I've run that in my lab since around noon yesterday)
exactly!
and this is what the final fix will look like: it'll check if atomics are avaiable and, if so, use them. If they are unavailable, we will use the mutex (which is much slower)
catucci: yes, this is what I think too. and I'm happy we have an experimental confirmation!
rgerhards: indeed
The problem is that we for now have just a indication, I wouldn't say its a confirmation yet.
catucci: what branch should I download to test the atimics only?
rgerhards: But it is a pretty good indication.
catucci: you know... scince is made of falsifications...
rgerhards: I have no branch yet to test with. What runs in my lab is a heavily hacked version, quite ugly
indeed
my point is we do not have enough proof yet
but... we have a sound theory
plus we know for sure (as we can show a failure case) that this was a bug
catucci: right.
rgerhards: we do not know for sure if it was the only one. But we now it was one
so code has improved
catucci: right too.
rgerhards: and both of our lab results show stability has increased, at worst only in some scenarios
in that sense, however, theory and experiment go together
and this indeed makes me feel good :)
I'll see that I get a new master branch together today that includes everything needed to address the failure scenario AND enabling atomics again.
I need to be careful as the atomics version is ugly and I need to verify I do everything needed in a clean version, too.
It would be excellent if you could then run that new master branch. And then we see what happens. I think a debug build would be good at first (as this was the repro environment) and once this works out well (a day or two without aborts should be sufficent), we can switch the production build.
In parall, I hope that others who experienced the problem will also try the new code.
so over the next couple of days, maybe weeks, we should get a good feeling if there still is an issue.
I have to admit it is quite complex code, so I wouldn't outrule there is something else...
catucci: Your are right especially on the other's systems. Maybe ew could try to get in touch with the other debian users experiencing the crash
rgerhards: definitely. There is also an entry in debian's bug tracking system. I already did a patch for the version used by debian lenny and michael Biebl, the package maintainer, did an updated package this night.
so I think we should get some feedback from there, too. But the problem is that the problem occured very sporadic in that case. I'd say anything less than two weeks of undisturbed operation is not even an indication
I will also ask the folks on the rsyslog mailing list to apply patches .. as soon as new versions are out.
catucci: good. see you.

Wednesday, January 28, 2009

rsyslog data race analysis

The good news first: it looks like I have found at least one cause of the race condition that plagued rsyslog for a while. On some fast machines, we saw aborts in certain configurations. Usually, this happened to occur on machines with 4+ cores only, and the best reproduction could be achieved on a "two four-core" system (that's an 8-way system, but you'll later see why I emphasize the fact that it is not 8 cores on a single chip but two chips with 4 cores each). Special thanks go to Lorenzo Catucci of the University of Rome "Tor Vergata" - he not only contributed an excellent test environment, but also offered good ideas in our discussions.

I cannot say if what I found was the only problem, but I can say for sure it was one problem. Testing also seems to show that the bug has disappeared. The good thing is that what happened matches exactly what can theoretically happen with this kind of bug.

Of course, it all boils down to thread synchronization and the way memory is accessed in a modern processor. In order to explain, I need to set the stage first.

In rsyslog, syslog messages are kept inside so-called message "objects". As rsyslog is coded in C (not C++), we do not have real objects, but the hand-written runtime brings much of the benefits of real objects. So let's stick with the term "object" for the rest of this post. Rsyslog is designed to work well on tomorrow's massively-parallel machines. As such, it not only runs on multiple threads, but has abilities to shuffle messages to various queues for execution. Most importantly, actions (like writing to a file or forwarding to a central server) can be enqueued in asynchronously executed queues. When doing so, a single message object is potentially "stored" in multiple queues at the same time. In order to gain performance, rsyslog does not actually copy message objects (as long as they are not written to), but rather keeps references to the very same object. However, we also need to know when we can cleanup messages, so we need to know when the last owner has ceased control of it (no, no Java-like garbage collection...). So what we do is reference-count message objects. Whenever a new "owner" receives a reference to the object, a reference counter is incremented (done via MsgAddRef()). Whenever the owner no longer needs the object, it is decremented (done via MsgDestruct()). If MsgDestruct() detects that the reference counter dropped to zero, nobody is using the message object any longer. MsgDestruct then immediately destructs the object (hence the name). This is done as part of the MsgDestruct call and invisible to the caller.

So far, so good. No issue at all without threading. With multiple threads, we need to make sure that there is no race for the variable that holds the reference counter. The variable itself is a C "int", thus usually a 32 bit value.

To simplify things, let's for now assume we execute on Intel processors. This is a good assumption in many cases. Also, it is the environment in which we found the bug, so I have details for it. Note that in-depth documentation on Intel's processors is available via the Intel web site[1].

According to Intel's System Programming Guide[3], section 7.1.1, memory access to words aligned on a 16-bit boundary will always be atomic. I had this in mind when I wrote the initial reference counting code - and made a few (now educational) mistakes.

First of all, I had the somewhat old-fashioned mental model of a single CPU without caches. That reads: the increment and decrement operations work on properly aligned words, so they are atomic. Consequently, I do not need to guard them against races. I should have paid more attention to the advances in processor design. Modern CPUs have caches. Depending on how data is accessed, the cache may be used - or not. Even more important, with multiple cache levels and multiple execution units (multi-core CPUs), cache coherency becomes an issue. One must ensure that if one thread, executing on unit A, updates a value, and another one, executing concurrently on unit B, updates the exact same value, nothing will be overwritten. This becomes even more problematic if the two execution units do not share the same cache subsystem (think of the system with two quad-core chips). Consequently, Intel specifies in 7.1.1:

Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic...
So here we go: my basic assumption may be right on some classes of machines, but it is definitely wrong on others (newer ones). So aligned word access does not guarantee (or even enhance the chance of) atomicity. [It is an interesting question whether I made invalid assumptions about the Intel microarchitecture or whether Intel slightly changed the instruction set architecture at some point - I wouldn't rule out the latter, but having a wrong assumption is probably the real issue, especially as I knew I should not rely on anything in regard to synchronized memory access that is not explicitly specified.]

Let me add that true atomicity (probably at a high performance penalty) can be achieved via specific instruction prefixes, the "bus lock" (LOCK) prefix. This is detailed in section 7.1.2 of Intel's guide. Note that atomicity is only guaranteed if the design of the memory subsystem pays proper attention to the LOCK# signal, which we hopefully can assume on decent hardware. The GNU C compiler supports generating the necessary prefixes for those instructions that support them (few) as part of its atomic operations support.
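As a small illustration, GCC's (pre-C11) __sync builtins are one way to obtain those LOCK-prefixed read-modify-write instructions on x86. A reference counter based on them might look like this sketch:

```c
/* Sketch: each call is one atomic read-modify-write on x86 (LOCK-prefixed),
 * so concurrent increments/decrements cannot lose updates. */
static int counter = 0;

int incRef(void) { return __sync_add_and_fetch(&counter, 1); }
int decRef(void) { return __sync_sub_and_fetch(&counter, 1); }
```

The returned value is the post-operation counter, which is exactly what a destructor needs to decide whether it just released the last reference.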

So error #1 was to assume that access to properly-aligned word data is atomic. It is not. I knew it, you know it; abide by it or you'll run into the same troubles.

As I said, I knew it. The bad thing is that I partly fixed the situation in a later version, after I had introduced support for GCC's atomic operations. However, another problem prevented that from working. But more on this later. I am not yet done with the initial problem.

Another, really dumb, error was that I assumed the following sequence to be a single memory operation:
currRefCount = --pThis->iRefCount;
This is really a beginner's error - I should have been over that for a couple of years now... What I do is decrement a word (OK, ignoring error #1, I thought this was "sufficiently atomic") and then store the result in another variable. My mistake was probably inspired by an atomic instruction, supported by Intel processors, which actually does this in an atomic way. But that is far different from the code I wrote.

Let's have a look at which code GCC generates for this instruction (sequence):
3894: 8b 43 48        mov    0x48(%rbx),%eax
3897: 83 e8 01        sub    $0x1,%eax
389a: 89 43 48        mov    %eax,0x48(%rbx)
389d: 89 44 24 08     mov    %eax,0x8(%rsp)
The instruction at address 3894 loads the value of pThis->iRefCount into register EAX. Then, the code decrements the register (at 3897) and does two stores: one to save it back into the original variable (at 389a) and one to save it to currRefCount (at 389d). As this is a 4-instruction code sequence, there is no way it can be an atomic operation.

If, for example, a second CPU modifies pThis->iRefCount after the mov instruction at 3894 has been carried out, the results will be unpredictable at best. The following may now happen (and I think it has happened):

  1. thread A enters the above code in MsgDestruct() and executes mov at 3894
  2. thread B (on another CPU) increments pThis->iRefCount at the same time that thread A executes instruction 3897
  3. thread A continues execution with instruction 389a. Note that the increment done by thread B is simply overwritten - the reference counter will be one less than it should be
At the time this instruction sequence is executed, no problem appears. But it sets the stage for a problem that may follow much later. At some later time, MsgDestruct() is called and the reference counter drops to zero, so the message object is destructed. However, as the counter was one off, the message object is still being used. Still, no harm is done unless either rsyslogd reduces its process space (very unlikely) or the memory is immediately overwritten (unlikely). If none of this happens (likely), the message will be processed and not be freed a second time, by virtue of 0 - 1 not being equal to 0 - a double free only happens when the counter drops to exactly zero! So in many cases, not even a double free happens. It looks like everything works perfectly.

This is probably one reason why the bug was so hard to reproduce: a couple of things need to come together to make it visible. Thankfully, I was smart enough to put a large number of runtime checks into the code, and thankfully the bug appeared on some systems even with these runtime checks enabled. So they could detect at least some types of errors, and that finally led us onto the right route.

The other segfaults most probably happened in one of the unlikely or very unlikely cases. But if you process millions and millions of messages, the probability of such an incident increases. No wonder we never saw it on a low-volume system.

In any case, error #2 was to assume too much about code generation without checking (you do not need to check every piece of generated code, but if you intend to do "dirty tricks", you had better check at least the primary targets, knowing that this doesn't cover you in all cases).

And now to my code changes that should handle these situations. Please note that at this time I thought I was addressing a hypothetical issue which would never occur in practice (but I was at least brave enough to try to do "the right thing").

I used GCC's atomic operations to replace both the increment and the above decrement construct. Note that with these changes, the code is supposed to work. To understand this, we need to look a bit at the sequence of code flow. Reference counting can only work if the reference count is never allowed to drop to zero as long as someone still needs the object. A typical programming error, when the reference count is at one, is to code things like this:
msgDestruct(pMsg);

AddToQueue(msgAddRef(pMsg));
That code would first drop the reference counter to zero and thus destruct the object. However, the msgAddRef would then increment it, but work on the now-unallocated memory (which of course still holds all previous values in almost all cases). The proper sequence to avoid this kind of bug is
AddToQueue(msgAddRef(pMsg));

msgDestruct(pMsg);
which first increments the reference counter and then frees the object as far as the current activity is concerned. Thankfully, this coding error I did not commit.

Instead, I made another really dumb one: I crafted perfectly good code, and I even included conditional compilation to use locking in case atomic operations are not available. I used the preprocessor symbol DO_HAVE_ATOMICS to differentiate between the two cases. If only I had used the same symbol inside my ./configure check... But there I used the name HAVE_ATOMIC_BUILTINS! Generally, that was a good idea, because it follows the naming convention all the other such switches have. If only I had changed the .c files... And finally, I did not remove the interim code, which undid some of the atomics because they were not portable. The end result was what looked like perfectly valid code, but it was never actually compiled. In other words: I totally screwed up.
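A reconstruction of the kind of mismatch described (the two symbol names are taken from the post; the surrounding function is hypothetical):

```c
/* ./configure defines HAVE_ATOMIC_BUILTINS, but the code below tests
 * DO_HAVE_ATOMICS - so the atomic branch is dead code and the
 * unsynchronized fallback is always what gets compiled. */
#define HAVE_ATOMIC_BUILTINS 1     /* what the configure check really set */

int addRef(int *pRefCount)
{
#ifdef DO_HAVE_ATOMICS             /* never defined: wrong symbol name! */
    return __sync_add_and_fetch(pRefCount, 1);
#else
    return ++*pRefCount;           /* non-atomic path, silently used */
#endif
}
```

The compiler raises no warning for an #ifdef on an undefined symbol, which is why this class of bug is so easy to miss without a test that exercises both branches.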

The result was no improvement after all. I should note that testing did not reveal any of these issues, as I thought I was addressing an issue that did not really exist. Most importantly, I had no test case to verify the code. I now know that I should at least have made sure that the conditional compilation worked properly.

So error #3 probably was not to put enough care into debugging conditional compilation.

Then, a few isolated reports came in. Far too few to see any trend. At some point, I fixed the switch name error, but not consistently. For some reason (I no longer remember why), I thought that there might be a problem with the atomics, and I also saw that there was probably an issue with msgDestruct(). I then commented out the atomic calls and used mutex calls instead. Around the same time, I fixed the handling of atomics in msgAddRef() - at least it looks that way [I have not tried to extract the full history]. Had I done just one of these two things, everything would have started to work. But these two actions were contradictory, and I didn't notice that.

What now happened was that msgAddRef was indeed using atomics. However, msgDestruct used the mutex. As a result, msgDestruct used the non-atomic instruction sequence shown at the beginning of this post to handle the reference count. If msgAddRef now incremented the counter after the variable was loaded into the register but before the register was written back, the reference count would be one too low - exactly as described in the failure scenario above. And msgAddRef could jump right into the middle of the sequence in msgDestruct, because msgAddRef did not try to lock the mutex. So there was nothing that prevented it from executing.
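A hypothetical sketch of this inconsistent pairing (not the actual rsyslog code) makes the hole visible: the mutex in the destructor cannot exclude the addref path, because that path never takes the mutex.

```c
#include <pthread.h>

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static int refCount = 1;

int msgAddRef(void)                /* atomic path - ignores the mutex */
{
    return __sync_add_and_fetch(&refCount, 1);
}

int msgDestruct(void)              /* mutex path - plain decrement inside */
{
    pthread_mutex_lock(&mut);
    int curr = --refCount;         /* the 4-instruction sequence from above;
                                    * msgAddRef can interleave with it */
    pthread_mutex_unlock(&mut);
    return curr;
}
```

Mutual exclusion only works when every party touching the shared variable uses the same primitive; here the lock protects msgDestruct against nothing but other msgDestruct calls.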

When I first began to realize this, I thought that there should be no issue because the mutex call, after all, is a full memory barrier, and as such any update done by msgAddRef should be visible after the mutex lock. That's right, but it doesn't cover the failure scenario I described and now know of.

What led to this final problem, I think, was inconsistent analysis. The bug appeared every now and then and was not reproducible. So I was unable to do a good analysis of it. I tried this and that, all going in a good direction, but without any way to verify it. Somehow, also, I did not do a good enough analysis of the algorithms used - I overlooked those bombs that had been in the code for quite a while.

Once I was able to reproduce the issue at least somewhat consistently, I could begin to see in more detail where things went wrong. I still didn't see the obvious. I even did an in-depth analysis of the code paths in question - and have to admit that the obvious things still didn't occur to me.

What I saw, with the help of valgrind's DRD[2] tool, was an issue with an incorrectly initialized message mutex. That was quite a subtle error, where a debug system statement caused a too-early acquire of the message mutex, which at that time was not yet initialized. That could lead to all sorts of interesting effects (including valgrind aborts), and I had high hopes this was the problem source. At first it looked promising, but then I noticed that the problem had been introduced in recent builds and could not be the root cause - builds without that issue also crashed. So while it was definitely something we should fix, it was not the sole solution.

We were back in nowhere land... Finally, we began to come closer to the bug after we were able to generate some error reports from valgrind's DRD tool on Lorenzo's machine (kudos to the valgrind team!).

I should note that I use valgrind on a regular basis and tried a couple of its tools (among others, helgrind and DRD) during my debugging effort. None of them ever reported any issue other than the one mentioned above. I think none of the machines was fast enough to experience that kind of situation, so valgrind could not generate an error report (as a side-note, this is a good indication that an error-free valgrind log doesn't mean the program is error-free).

After I got Lorenzo's DRD reports, I had hard evidence that the reference counter was incorrectly handled, something I had suspected most of the time, but had ruled out numerous times because I did not find any issue with it. Now I knew for sure and could begin to think very hard about why this could happen. I even thought about the most extreme cases - cases I would never have considered had there not been hard evidence that the issue was real. This brought me back to the atomic operations. Interestingly, I still didn't see the real issue. But I was smart enough to replace the atomics in msgAddRef with mutex calls. That, of course, cleared the issue, because now the failure scenario (the one described above with the assembly code sample) could no longer happen. Quick tests proved that no rapid abort happened in the lab environments. This is where we are today. Of course, this does not mean that the bug is actually fixed, but the situation has improved. Neither Lorenzo's lab nor mine could endure more than 30 to 60 minutes without an abort with the previous code.
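
The change can be sketched roughly like this - a minimal illustration only, with simplified type and field names that are not the actual rsyslog definitions:

#include <pthread.h>

/* Simplified message object -- field names are illustrative,
 * not rsyslog's real structures. */
typedef struct msg {
    pthread_mutex_t mut;
    int iRefCount;
} msg_t;

/* Instead of an atomic increment, take the object's mutex.
 * This serializes all reference-count updates: slower, but it
 * rules out the partial-update race described above. */
msg_t *msgAddRef(msg_t *pThis) {
    pthread_mutex_lock(&pThis->mut);
    ++pThis->iRefCount;
    pthread_mutex_unlock(&pThis->mut);
    return pThis;
}

The mutex guarantees that the load, modify and store of the counter happen as one unit with respect to all other threads using the same mutex, which is exactly what the plain increment did not.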

After I got this confirmation, I began to dig into the atomic operations. Lorenzo suggested looking at the assembly code, and that somehow led me to question things like cache coherency and all that. It still took me a while to detect the real cause, which turned out to be a rather simple coding error.

So error #4 (this time maybe better called learning #4) was: I was so blindfolded by my knowledge of the project that I was unable to see the hidden bombs in the source code. My perception was filtered by my expectations. Even though I analyzed things, I did not do a proper analysis, as I was unable to correctly reproduce what was actually written in the code. Such situations happen [at least to me ;)]. What is very useful is hard evidence, or a persistent person, that makes you overcome your invalid perception. Unfortunately, I do not know how to overcome this problem in a general way.

After I came to this conclusion, I did further analysis and finally ended up with the failure scenario described at the top of this post (including everything else I wrote). As soon as this was clear, I set up another lab which replaced the mutex-call version with one that relies on atomic operations exclusively. And, as it looks, this seems to perform equally well (but we have not yet changed Lorenzo's lab environment; he still runs the "mutex version"). So this somewhat backs my theoretical findings on the bug sources.

Again, it is far too early to claim victory over the recent race problem. It was far too hard to reproduce, so a few hours in a few labs without problems are not a definitive answer - but a trend. Also, as I wrote, what we have found definitely is a bug. It may not be the only one, but it is one for sure. So in any case, the stability of the code is better with these fixes than without them.

I will monitor how the fixes perform in practice. As a next step, I'll also update all branches of rsyslog. I have already done this for the branch used by Debian Lenny (we maintain a separate branch due to their release freeze). That adds another environment to the list of labs where we had frequent reproductions (though it is less useful, as this branch has some other potential threading issues, which were already sorted out in the past). I hope to have clean fixes for the other branches by the end of the week.

As a sum-up, it is important to keep instruction sequences, cache coherency, execution order and all the fine details of a highly parallel system in mind when designing multithreaded code. A very special challenge is getting the right test environment, as race conditions may occur only under rare circumstances. A good understanding of the machine's internals is definitely useful; otherwise you may not even be able to think about failure scenarios. Thoroughly reading code and specs should be self-evident, but obviously it pays to remind oneself of it. Getting help from someone who is not familiar with the code/spec is definitely useful.

I have written this analysis in the hope that it is useful as a record for me - and also for others. I think it outlines rather well which programming errors can be made in general, and with multi-threaded code in particular. I also hope that it aids understanding of the strange things one needs to look at with highly concurrent systems. I think it also proves the point that any (system-level) programmer should know exactly what the machine does, and what it does not.

That's basically what I wanted to communicate. But... there is a bit more. Read on if you'd like to get down to some details that may also affect your lab environment and may puzzle you...

Let's have a quick look at optimized vs. non-optimized code. I am using GCC in this example:

Some sample code:
void tester() {
    msg_t *pThis;
    int currRefCount;
    currRefCount = --pThis->iRefCount;
}
with default gcc settings (without -O) generates

311: 55 push %rbp
312: 48 89 e5 mov %rsp,%rbp
315: 48 8b 45 f0 mov -0x10(%rbp),%rax
319: 8b 40 48 mov 0x48(%rax),%eax
31c: 8d 50 ff lea -0x1(%rax),%edx
31f: 48 8b 45 f0 mov -0x10(%rbp),%rax
323: 89 50 48 mov %edx,0x48(%rax)
326: 48 8b 45 f0 mov -0x10(%rbp),%rax
32a: 8b 40 48 mov 0x48(%rax),%eax
32d: 89 45 fc mov %eax,-0x4(%rbp)
330: c9 leaveq
331: c3 retq
and with gcc -O3 it leads to
   10: 83 2c 25 48 00 00 00  subl   $0x1,0x48
17: 01
18: c3 retq
19: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
And, just to get the whole picture, this is a simple assignment (without optimization):

source:
void tester() {
    msg_t *pThis;
    int currRefCount;
    currRefCount = pThis->iRefCount;
}
The unoptimized version is what I described in the failure scenario above. With the optimized version, we may be puzzled again. Here, the optimized version works with a single subl, and the operand is word-aligned. So, from a naive view, this should be an atomic operation (and, in fact, it is on many systems, most importantly on single-CPU, single-core ones). However, carefully reading Intel's architecture manual[3] reveals that it is not an atomic operation on newer, highly parallel hardware. In any case, even on that hardware the probability of a collision is far lower than in a 4-instruction sequence.
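
To make the decrement safe regardless of optimization level, the CPU needs to execute it as a single locked read-modify-write. A minimal sketch of the difference, using GCC's (legacy) __sync builtins - struct and function names here are made up for illustration, not taken from rsyslog:

/* Simplified message object for demonstration purposes. */
typedef struct msg {
    int iRefCount;
} msg_t;

/* NOT atomic: compiles to a load/modify/store sequence (unoptimized)
 * or a non-locked subl (optimized) -- either can race on SMP. */
int refDecPlain(msg_t *pThis) {
    return --pThis->iRefCount;
}

/* Atomic: the __sync builtin emits a lock-prefixed instruction,
 * so the decrement is indivisible even across multiple cores. */
int refDecAtomic(msg_t *pThis) {
    return __sync_sub_and_fetch(&pThis->iRefCount, 1);
}

Both functions compute the same value; the difference only shows up when several threads hammer the same counter concurrently.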

In theory, this indicates that the bug is more likely to appear in debug mode than in production mode (where the compiler is set to full optimization). In practice, it was just the opposite: in debug mode, it was almost never seen, while in production it occurred every now and then. The explanation, I think, is that you need a certain machine speed to generate the error condition. Rsyslog, when compiled for debug, runs considerably slower than when compiled for production. In other words, you need a considerably faster machine to experience the same probability of the race condition manifesting in an error. I think most of the systems were simply too slow to hit this barrier, and so could produce the problem only in release builds. I agree that I have no data backing this claim, but it would at least be a good explanation.

I hope this post is useful. Any comments are appreciated.
Rainer Gerhards

Update (2009-01-29): If you are interested in more details, be sure to read the discussion I had with Lorenzo after finishing this paper.

[1] Intel documentation home - http://www.intel.com/products/processor/index.htm?iid=subhdr+prod_proc

[2] DRD- a thread error detector, http://valgrind.org/docs/manual/drd-manual.html

[3] Intel® 64 and IA-32 Architectures Software Developer's Manual
Volume 3A: System Programming Guide, http://download.intel.com/design/processor/manuals/253668.pdf

How often is rsyslog installed?

Of course, how often a piece of software is actually installed is an interesting number for any project. So it is for rsyslog. And, of course, we do not have any data. While you meet some users on the forum and mailing list, they are only a very tiny subset of the user base. With infrastructure projects like rsyslog, people often do not even know that they run it (what a shame...). Anyhow, it is motivating (and useful for promotion) to know how often it is installed.

So I started to search for some metrics. A good starting point is the observation that beginning with version 8, rsyslog is the default syslogd for Fedora. So, basically, each instance of Fedora 8+ means an instance of rsyslog. Thankfully, I was able to find some metrics directly from the Fedora project. If I sum up the metrics for F8 to F10, I get around 8 million systems. I guess this includes upgrades and now-dead systems, so we are probably down to 5 million (or is this too optimistic?). As a side-note, I agree that some folks may replace rsyslog with some other logging system. But those will probably be special cases, so I don't think it is useful to try too hard to find a decent count of them (aka "I ignore that" ;)).

The next major source of installations is probably Debian Lenny. As of a few months ago, rsyslog has become the default syslogd there, too. I have not yet found any metrics for Lenny (do you know of any? - if so, please mail me). I think the number will be way lower than current Fedora's (given that Lenny is not yet flagged as stable). So it will probably not add a big number of systems - maybe half a million?

Another source may be several smaller distributions (like CentOS) where rsyslog is the default. This adds another pool of installations.

Finally, we have the cases where folks intentionally install rsyslog. Sadly, these are the fewest cases, but as I said: this is what you expect from an infrastructure project. And logging is definitely a niche; few folks have a big interest in it. So, if looking just at the numbers, these cases are almost irrelevant (of course, from every other aspect they are the most important ones for the project - they really drive it!).

Not having any real Debian metrics, I think a reasonable conclusion is that we have around 5 million systems running rsyslog by today (January 2009). I'll probably refer to that number if someone asks (and some folks begin to ask). If you have a different opinion, metrics, ideas - please comment to this post or email me.

Tuesday, January 27, 2009

Carnival of Logging!

This is a proposal to start a "Carnival of Logging". Huh... what? I guess by now everybody has participated in (or at least written for) one or more blog carnivals. But a Carnival of Logging - am I mad? Who the heck is interested in such a thing?

Yes, I know... Logging is not sexy. Indeed, most folks don't even know there is something like logging (except, of course, if they burn wood ;)). However, some folks blog about (computer) logging. Really. Every now and then I find a few interesting posts about logging and things related to it - like what role does logging play in compliance? What about forensics? What about security, ... So there is some potential interest.

I think there is a place for a "Carnival of Logging", and I would like to organize it :) I guess we could argue for weeks over whether it makes sense to actually do it - or we can simply try. I prefer the latter (... and may be proven wrong "the hard way").

This is what I propose:

  • If you are blogging about logging or areas that are closely related to logging (so that you have something log-related at least every now and then), please email me. Let me know which blog post you would like to see highlighted in the first carnival. Please also let me know if you would be interested in hosting future carnivals.
  • I will compile the first carnival of logging out of the messages I receive. It will be hosted on this site. I have no definite schedule for it yet, because I do not know the volume of posts coming in. My goal, however, is to have this done by February 11th.
  • After the initial post has been done, I'll email all those that submitted carnival entries and ask for new posts ;) I'll pick a host from the list of those that have opted to host and forward entries I have received to that host. The host will then compile a new carnival and post it on his blog.
  • Once a carnival is out, all participants should link to it from their blog.
  • I'll keep an archive of all carnival posts here on this site.
  • The carnival of logging should be on a fixed schedule. That probably depends a bit on the volume, but I suggest that the carnival is written at least bi-weekly.

Well - and now it is up to you! Please send in your carnival entries! I think a Carnival of Logging would be a useful addition to the logging world and hope that others agree. Also, please spread the word, so that we get more exposure and more participants! Also, feel free to post any questions and comments you have.

Monday, January 26, 2009

SyslogAppliance 0.0.6 out

Finally, I made a new version of the syslog appliance. It is not a really big release. On Friday, I intended to do just a refresh, but then I ended up integrating a capability to discard messages older than 60 days (obviously an optional feature). Still, it looked like a quick action, but phpLogCon gave me a somewhat hard time. In the end, I even discovered a bug and was able to fix it.

Probably the next milestone for the appliance is SMP support. I've done some preliminary work, but on the other hand there is so much more to do. Let's see...

Thursday, January 22, 2009

phpLogCon now in FreeBSD ports tree

Good news, folks: phpLogCon is now available in the FreeBSD ports tree. I've just read confirmation in a forum thread. This is obviously very good news; thanks everyone for making that happen. Given that there is also a port for rsyslog, the two components can now work together and benefit from each other.

Of course, phpLogCon does not require rsyslog. It can work perfectly with any other syslogd, as long as it is pointed to the right files or databases. But having rsyslog's ability at hand in addition to phpLogCon is quite handy - and vice versa.

Good news to start a day ;)

Wednesday, January 21, 2009

NASA list server compromised?

As a space geek, I am subscribed to NASA's HSFNEWS mailing list. When I looked at my mailbox this morning, a spam message that claimed to have been posted via the NASA list server caught my attention. Obviously, it is quite easy to forge email, and so I thought that this might be a fake, too. However, closer examination reveals headers that make me think this could be the real thing.

Of course, HSFNEWS is just one of the many mailing lists NASA offers and, of course, it is run on an auxiliary system. Still, invalid messages slipping through can have quite bad effects. Of course, a message with the subject

"[HSFNEWS] She'll always want to give head now"

will hopefully be immediately classified as spam by anyone (or do you think the message is about alien encounters? ;)). But what if the message were much more carefully crafted to carry out something evil? After all, the message could look much like it comes from an official NASA source. Just think about the various Obama hoaxes and scams that we have seen lately.

I am still not 100% convinced that the mail actually originated from the NASA list server (I have tried to contact someone in charge over there and hope to get some results). To help you get an idea yourself, here is the complete message source, except for a few details of my local delivery record as well as valid mail addresses that do not need to be posted here.

If someone has an opinion on whether the mail was run over NASA's server, please post a comment or drop me a mail.


MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C97B42.7A7F7080"
Received: from jsc-listserv-01.jsc.nasa.gov (jsc-listserv-01.jsc.nasa.gov
[128.157.5.25]) by mailin.adiscon.com (Postfix) with ESMTP id 06205241C002
for ; Tue, 20 Jan 2009 21:52:51 +0100 (CET)
X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: from jsc-listserv-01.jsc.nasa.gov (jsc-listserv-01
[128.157.5.25]) by jsc-listserv-01.jsc.nasa.gov (8.13.1/8.13.1) with ESMTP
id n0K7cgeV024815; Tue, 20 Jan 2009 15:01:22 -0600
Received: by JSC-LISTSERV-01.JSC.NASA.GOV (LISTSERV-TCP/IP release 15.0)
with spool id 553828 for HSFNEWS@JSC-LISTSERV-01.JSC.NASA.GOV;
Tue, 20 Jan 2009 15:01:20 -0600
Received: from 200-127-202-12.cab.prima.net.ar
(200-127-202-12.cab.prima.net.ar [200.127.202.12]) by
jsc-listserv-01.jsc.nasa.gov (8.13.1/8.13.1) with ESMTP id
n0KKPY2D029413 for ; Tue, 20 Jan
2009 14:25:35 -0600
Return-Path:
X-OriginalArrivalTime: 20 Jan 2009 21:03:01.0983 (UTC)
FILETIME=[7B156EF0:01C97B42]
List-Owner:
Approved-By: {removed}@NASA.GOV
Content-class: urn:content-classes:message
Subject: [HSFNEWS] She'll always want to give head now
Date: Tue, 20 Jan 2009 21:25:34 +0100
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [HSFNEWS] She'll always want to give head now
Thread-Index: Acl7Qns+aZRN9mnKS56dl4osL2myOw==
List-Help: ,

List-Subscribe:

List-Unsubscribe:

From: "joynt"
To:
Reply-To: "hsfnews"


------_=_NextPart_001_01C97B42.7A7F7080
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Can't see images?
To view this email as a web page, go here =
{actual spam removed}

RFC 3195 back in the game?

RFC 3195 was thought to be the solution for reliable syslog. It is based on TCP and the BEEP protocol. It was written in November 2001 but never gained much attention. The main reason everyone cites is the complexity of BEEP (and the lack of toolkits). A few years ago, I wrote my own logging-related RFC 3195 toolkit, liblogging. It, too, did not gain much momentum.

Anyhow, I used a modified version of liblogging to offer RFC 3195 support under Windows as part of the MonitorWare product line. Again, we never heard much about this feature. In rsyslog, I created an input plugin for RFC 3195. At that time, however, I already had the feeling 3195 was a failure, so I was hesitant to implement an output plugin, too. And, as expected: nobody ever cared, except for some folks building packages. And those not for practical needs, but for the sake of having packages for everything...

So up until now, I would have concluded that 3195 is indeed a failure. However, there seems to be some increasing interest. At least, I have gotten a couple of questions about RFC 3195 in the past weeks, and Adiscon, my company, just got a not-so-small order for its EventReporter product which explicitly puts RFC 3195 into the requirements. Is this a sign of increasing interest? Or is it just somebody ticking checklist items? That remains to be seen.

So, there seems to be a slight chance that RFC 3195 is being revived. Maybe it just took some years for the idea to ripen. In any case, I am prepared for RFC 3195 demand. Maybe all that work finally begins to pay off...

Monday, January 19, 2009

Theoretical vs. Practical Performance

I found an interesting post "Algorithm Performance in Academic Papers and Practice" by fellow security blogger Steve Burnett on the SecurityBloggersNetwork.

Steve questions whether theoretical performance gains, often given based on big O notation, can really be realized in practice. He claims they often can not.

I tend to agree with his point, at least in broad terms. One must first remind oneself that big O notation is quite coarse; it tells you about the potential extremes. In practice, however, these extreme cases may routinely never be hit. Furthermore, I think it depends very much on what the algorithm in question is actually doing. If it is "just" doing calculations on its own, theoretical performance benefits can be predicted much better than if there is any external dependency.

The obvious case is code that needs to do some library or operating system calls inside the algorithm. A good example is the performance optimization David Lang and I did on rsyslog in fall 2008. Here, and I have to admit partly to my surprise, it turned out that optimizing rsyslog's algorithms had almost no effect in boosting the daemon's performance. Why? Simply because a handful of system calls, many time-related, used up the majority of the execution time. So rather than optimizing the algorithms used, we optimized away OS calls, and that had a very big effect (and even after that initial optimization, there is still much room for improvement just by looking at the APIs). Of course, this is an extreme example, because usual syslog server processing is not at all computational, and the frequent context switches themselves are performance-intensive. The picture for a graphics application is probably totally different.
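
The kind of change this amounts to can be sketched as follows - a hypothetical illustration, not rsyslog's actual code (names and the invalidation strategy are made up). Instead of asking the OS for the time on every message, the timestamp is cached and refreshed only at defined points:

#include <time.h>

/* Cached timestamp: one time() syscall can serve many messages. */
static time_t cachedTime = 0;
static int cacheValid = 0;

/* Return the cached time, fetching a fresh value from the OS
 * only when the cache has been invalidated. */
time_t getApproxTime(void) {
    if (!cacheValid) {
        cachedTime = time(NULL);
        cacheValid = 1;
    }
    return cachedTime;
}

/* Called at defined points (e.g. once per batch of messages)
 * to force a fresh timestamp on the next lookup. */
void invalidateTimeCache(void) {
    cacheValid = 0;
}

The trade-off of such a scheme is that timestamps may be slightly stale between invalidations - acceptable for many logging purposes, and a big win when the syscall itself dominates per-message cost.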

However, many less obvious cases exist, and I guess lots of them have to do with the fact that multiple processes and/or threads are executed, so resources are shared. On such a system, for example, theoretical performance gains may be lost due to task switches which purge vital data off the CPU cache. Even worse, a theoretically optimized algorithm may require additional main memory, which may, in practice, force cache purges because the cache size is now insufficient. Funny, eh?

Wrap-up: big O notation is useful, but for practical needs it must be taken with a grain of salt. For real-world deployments, actual performance testing is quite important. And as a side-note, test results on a dedicated system may be quite different from practical performance on a system where other things are also running...

WinSyslog German Site

We have been selling a Windows syslog daemon (WinSyslog) for many, many years now (since 1995, if I remember correctly). The "language issue" is interesting. Back in the late 90s, we had English and German pages for that product. Some time later, we dropped the German pages because almost nobody ever accessed them (funny, ain't it?).

Now we are giving it another shot. While talking with some peers, they claimed there is more demand for German-language content in IT security today than there was 10 years ago. Really? If so, I have to admit I am surprised. I thought that the IT world speaks English, and the IT security/auditing world even more so. Anyhow, I always like to experiment. So we at Adiscon agreed to translate some important content of the WinSyslog pages into German and see what happens.

As a side-note, the discussion with my peers started another experiment, one which did not require discussions inside the company. Rsyslog got a German-language support forum in October 2008. Guess what? There is only a single user post in it, and that post tells us that the poster thinks it is unnecessary to have a German-language forum. So far, it looks like I was right - but let's see what a product site brings ;) (It sounds somewhat logical that an open source support forum has different metrics than a commercial software product site, so there really can be different results).

Thursday, January 15, 2009

Use of application-level acks in RELP

I received a very well crafted question about RELP reliability via the rsyslog mailing list this morning. I think it makes perfect sense to highlight this question here in the blog instead of letting it die unread and hard to find in the mailing list archives. Before reading this post, it would be useful to read my rant on "On the unreliability of plain tcp syslog" if you have not already done so. It will greatly help understand the fine details of what the message talks about.

Here we go, original poster's text in italics, my replies in between:

In my research of rsyslog to determine its suitability for a
particular situation I have some questions left unanswered. I need
relatively-guaranteed delivery. I will continue to review the
available info including source code to see if I can answer the
questions, but I hope it may be productive to ask questions here.

In the documentation, you describe the situation where syslog silently
loses tcp messages, not because the tcp protocol permits it but
because the send function returns after delivering the message to a
local buffer before it is actually delivered.

But there is a more-fundamental reason an application-level ack is
required. An application can fail (someone trips over the power cord)
between when the application receives the data and when it records it.

1. Does rsyslog send the ack in the RELP protocol occur after the
message has been safely recorded in whatever queue has been configured
or forwarded on so its delivery status is as safe as it will get (of
course how safe depends upon options chosen), or was it only intended
to solve the case of TCP buffering-based unreliability?


RELP is designed to provide end-to-end reliability. The TCP buffering issue is just highlighted because it is so subtle that most people tend to overlook it. An application abort seems to be more obvious and RELP handles that.

HOWEVER, that does not mean messages are necessarily recorded when the ACK is sent. It depends on the configuration. In RELP, the acknowledgment is sent after the reception callback has been called. This can be seen in the relevant RELP module. For rsyslog's imrelp, this means the callback returns after the message has been enqueued in the main message queue.

It now depends on how that queue is configured. By default, messages are buffered in main memory. So when rsyslog aborts for some reason (or is terminated by user request) before this message is being processed, it is lost - while the sender still got a positive ACK. This is how things are done by default, and it is useful for many scenarios. Of course, it does not provide the audit-grade reliability that RELP aims for. But the default config needs to take care of the usual use case, and that is not audit-grade reliability (just think of the numerous home systems that run rsyslog and should do so in the least intrusive way).

If you are serious about your logs, you need to configure the engine to be fully reliable. The most important thing is a good understanding of the queue engine. You need to read and understand the rsyslog queue docs, as they form the basis on which reliability can be built.
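
As a rough illustration of what "configuring the engine to be fully reliable" means in practice, a forwarding action can be given a disk-assisted queue so that unprocessed messages survive a shutdown. This sketch uses the legacy $-directive config style; directive names and the hostname are examples only - check the queue documentation of your rsyslog version for the exact semantics:

```
# load the RELP output module
$ModLoad omrelp

# location for queue spool files
$WorkDirectory /var/spool/rsyslog

# disk-assisted action queue: spill to disk under memory pressure
# and persist unprocessed messages across shutdowns
$ActionQueueType LinkedList
$ActionQueueFileName fwdq
$ActionQueueSaveOnShutdown on
$ActionResumeRetryCount -1    # retry forever instead of discarding

# forward everything via RELP (central.example.net is a placeholder)
*.* :omrelp:central.example.net:2514
```

Note that even this is not a complete audit-grade setup; it merely closes the "ACKed but lost in an in-memory queue" gap discussed above.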

The other thing you need to know is your exact requirements. Asking for reliability is easy; implementing it is not. The closer you get to 100% reliability (which you will never reach, for one reason or another), the more complex the scenarios get. I am sure the original poster knows quite well what he wants, but I am often approached by people who just want to have it "totally reliable" ... but don't want to spend the fortune it requires (really - ever thought about the redundant data centers, power plants, satellite and sea links et al. you need for that?). So it is absolutely vital to have good requirements, which also includes specifying when loss is acceptable, and at what cost this comes.

Once you have these requirements, a rsyslog configuration that matches them can be designed.

At this point, I'd like to note that it may also be useful to consider rsyslog professional services as it provides valuable aid during design and probably deployment of a solution (I can't go into the full depth of enterprise requirements here).

To go back to the original question: RELP has almost everything that is needed, but configuring the whole system in an audit-grade way requires (ample) work.
2. Presumably there is a client API that speaks RELP. Can it be
configured to return an error to the client if there is no ACK (i.e.
if the log it sent did not make it into the configured safe location
which could be on a disk-based queue), or does it only retry? Where is
this API?


The API is in librelp. But actually this is not what you are looking for. In rsyslog, an output module (here: omrelp) provides the status back to the caller. Then, configuration decides what happens: messages may be discarded, sent to a different destination, or retried.

With omrelp, I think we have some hardcoded ways to preserve the message, but I have not had time yet to look this up in detail. In any case, RELP will not lose messages, but it may duplicate a few of them (within the current unacked window) if the remote peer simply dies. Again, this requires proper configuration of the rsyslog components.

Even with that, you may lose messages if the local rsyslogd dies (not terminates, but dies for some unexpected reason, e.g. a segfault, kill -9 or whatever) while it still has messages in a not-yet-persisted queue. Again, this can be mitigated by proper configuration, but that must be designed in. Also, it is very costly in terms of performance. A good read on the subtleties can be found in the rsyslog mailing list archive. I suggest having a look at it.
Certainly the TCP caching case you mention in your pages is one a user
is more likely to be able to reproduce, but that is all the more
reason for me to be concerned that the less-reproducible situations
that could cause a message to occasionally become lost are handled
correctly.


I don't think app-abort is less reproducible - kill -9 `cat /var/run/rsyslog.pid` will do nicely. Actually, from the feedback I have received, many users seem to understand the implications of a program/system abort. But far fewer understand the issues inherent in TCP; thus I am focusing so much on the latter. But of course, everything needs to be considered. Read the thread about the reliable queue (really!). It goes to great lengths, but still does not offer a full solution. Getting things reliable (or secure) is very, very challenging and requires in-depth knowledge.

So I am glad you asked and provided an opportunity for this to be written :)

Rainer

Strong passwords? Forbidden!

American Express, as a bank and card issuer, should be a fairly security-sensitive company. Right? Well, it looks like they have not yet learned their lesson. Occasionally, I log in to my AmEx account to gain access to membership rewards (those nice gimmicks that shall trick you into charging as much as possible to AmEx). I tend not to have my credentials at hand when doing so, but thankfully AmEx has a quite secure system to recover them.

What really bugs me is their password requirement: a password can have a maximum of 8 characters and may consist only of letters and numbers! Ouch... what about strong passwords? They are simply forbidden by AmEx. The funny thing is that the web site doesn't even complain when you enter a too-strong (aka longer) password. It simply ignores the extra characters. Some time last year this drove me crazy, as I could not log in after changing my password. Guess what: I had used a too-strong one, and of course it didn't match what the system had actually stored. I called customer service and also complained about being forced to use insecure passwords. That was several months ago.

New year, new try - old problem... Nothing learned: still 8 characters max, and only letters and numbers. Frankly, AmEx, who is advising you on security? I really wonder if under US law AmEx would be held responsible if someone broke into my account. I think they should be...

Wednesday, January 14, 2009

new syslog RFC will not advance...

I thought that after 5 years of IETF syslog WG work, I'd already be through all potential sources of frustration. Wrong! Last Friday, I was notified that the otherwise ready-to-be-published RFC could not be published because the IETF requires a copyright assignment to the trust, in which the author must grant all rights. Of course, he must also do so on behalf of every contributor.

Quite frankly, this is more or less impossible for this draft. Among the many reasons is that it would be extremely time-consuming to verify each and every (small) contribution from the past 5 years. Also, I copied text from other RFCs (which was perfectly legal back then), whose authors in turn do not know who contributed. There are more reasons, but I do not want to elaborate on all of them again.

The bottom line is that the syslog RFC series is stalled again. The IETF has at least identified that there is a problem, and a work-around is being discussed. A decision will be made on February 15th. Let's hope for the best; otherwise all the work will probably be lost.

This, IMHO, is also a good example of what a too-far-stretched system of intellectual property rights can do. This, and also software patents, are a threat to our freedom. They stop progress and unnecessarily limit our capabilities. Hopefully more and more people will realize that and tell their governments to stop this abusive system.

Thailand is going syslog...

I found an interesting read in "The Nation", one of Thailand's largest business dailies. They talk about the economic crisis and the ways Thailand plans to reduce its negative effects. There is a five-point initiative in place. Of interest for us is the fifth and final point:

Finally, the association will focus on security, which promises to be this year's main technology trend. It will urge software companies to become more familiar with Syslog, which is a standard for forwarding log messages in an IP network, but is also typically used for computer system management and security auditing.

So, as it looks, Thailand is betting on security. This is obviously a good move. Interestingly, they seem to have identified logging, and syslog in particular, as a major building block in this endeavor. That's a bit surprising, given the typical weaknesses of syslog. But they've probably identified the broad potential this protocol has. Maybe I should look a bit more towards Asia with rsyslog and phpLogCon as well as the Windows product line.
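For readers new to syslog: part of its appeal is how simple the wire format is. Each message starts with a PRI value encoding facility and severity (facility × 8 + severity), followed by the message text, traditionally sent as a single UDP datagram to port 514. A minimal sketch in Python (the tag, message text, and server name are made up for the example):

```python
import socket

# well-known values from the syslog RFCs
FACILITY_LOCAL0 = 16
SEVERITY_INFO = 6

def make_pri(facility: int, severity: int) -> int:
    # PRI = facility * 8 + severity
    return facility * 8 + severity

def format_msg(facility: int, severity: int, tag: str, text: str) -> str:
    return f"<{make_pri(facility, severity)}>{tag}: {text}"

msg = format_msg(FACILITY_LOCAL0, SEVERITY_INFO, "myapp", "hello syslog")
# msg is now "<134>myapp: hello syslog"

# sending it is a single UDP datagram, e.g.:
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.sendto(msg.encode(), ("logserver.example.com", 514))
```

That simplicity is exactly why adoption is cheap for software vendors - and also the source of the weaknesses (no delivery guarantee, no authentication) I mentioned above.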

Joined the Security Bloggers Network

Yesterday I joined the Security Bloggers Network. I discovered this interesting network by accident and thought it might be worth contributing to. While I write mostly about logging and log analysis, that definitely has a lot to do with security. So I asked Alan, the person in charge, if he'd consider adding my blog - man, was he quick. I got a positive response soon after my request and have already been added to the site. Great. So welcome, all new readers.

The good thing is that this motivates me to write a bit more about the security aspects, which I think is a good thing for existing readers, too.

Drop me a comment if you have any opinion on me appearing in the network - especially if you have an idea of what you would like to read about logging.

Tuesday, January 13, 2009

SyslogAppliance Keyboard

This morning, Google alerts brought up a nice blog post on how to reconfigure SyslogAppliance's keyboard settings.

I think some users have already asked about keyboard reconfiguration. However, the blog post suggests that it is much more painful to do than I thought (I agree with the author that "dpkg-reconfigure console-setup" should have done the job). I'll see to it that I finally include this functionality in the automated setup. It should not be too hard. After all, I already have a "run once" script in place.
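For the curious, such a run-once step could look roughly like this. This is only a sketch under assumptions: I'm assuming the appliance's Debian base ships the usual console keymap tools, and the default layout shown is just an example.

```shell
#!/bin/sh
# hypothetical "run once" hook for SyslogAppliance (Debian-based);
# availability of these tools on the appliance image is an assumption
KEYMAP="${1:-de-latin1}"   # desired console layout, German as example default

loadkeys "$KEYMAP"         # switch the console keymap for the running session
install-keymap "$KEYMAP"   # persist the choice across reboots
```

The point is that a single argument to the run-once script would spare users the interactive dpkg-reconfigure dance the blog post describes.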

Monday, January 12, 2009

Back from the break ;)

Hi folks,

I am back from my extended xmas break. Actually, I left right after xmas and have been away - even mostly without email - for two weeks.

I was delighted to see that the rsyslog community was quite active during this time (which usually sees low activity due to the holidays). On the sad side, that also means I have a couple of bug reports outstanding. One I managed to fix today; the others will need a little more time. Obviously, there are quite a lot of other things going on, and these need to be taken care of, too. I need to make a few changes to our internal infrastructure, and then I'll look into rsyslog, phpLogCon and the appliances.

One of the major undertakings I hope to finish in the not-so-distant future is fixing a stability bug that seems to occur only on machines with four or more cores. I got another bug report over the holidays and hope that I get enough momentum to finally track it down - we'll see...

That's it for now, I just wanted to keep you updated.

Oh - and did I mention happy new year ;) [it's not too late right now...]

Rainer
