Monday, December 31, 2007

rsyslog work log and future directions

Hi folks, this is probably the last rsyslog work log post for 2007. Thanks for sticking around - and hopefully I'll see you again in 2008. It will be a very exciting year, with a lot of new features. I am eager to implement what is on my mind right now, and I'll most probably start with modifying the message queue, an endeavor that will ultimately lead to store-and-forward capability just like in syslog-ng's premium edition. And the good news is that I hope to finish that in January 2008 ;) -- which also means that I have settled my priorities. It was not an easy job, and I hope I got it right. So store-and-forward with enhanced output threading comes first and the other things will follow later. To me, the hardest decision was to put off expressions, another feature that I, at least, would like to see sooner rather than later.

But now back to the work log:
2007-12-27
- added $UDPServerAddress config directive
- added capability to have multiple UDP listeners running concurrently
- applied cross-platform patch from darix to facilitate GSS-API compile
on more platforms
- some cleanup
- internal restructuring in omfwd.c - stage work for further modularization
I think I also fixed a bug as a side-effect - but have not looked too much at it
2007-12-28
- took TCPSend() apart and made it generic via function pointers
- moved TCPSend() and frame building code to tcpsyslog.c
- omgssapi created
- removed gss-api code from omfwd.c
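
As a rough illustration of what "generic via function pointers" can mean, here is a minimal sketch. It is not the actual TCPSend() code: generic_send(), the callback types and the mock transport are hypothetical names made up for this example. The idea is only that the retry logic lives in one place, while plain TCP and GSS-API each supply their own init and send callbacks.

```c
#include <stddef.h>

/* Hypothetical callback types: each transport (plain TCP, GSS-API, ...)
 * would provide its own implementation of these. */
typedef int (*init_fn)(void *ctx);
typedef int (*send_fn)(void *ctx, const char *buf, size_t len);

/* Generic send with retry. Returns 0 on success, -1 if all tries failed.
 * The function knows nothing about the transport behind ctx. */
int generic_send(void *ctx, const char *msg, size_t len,
                 init_fn init, send_fn send, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (init(ctx) != 0)
            continue;               /* could not (re)connect, try again */
        if (send(ctx, msg, len) == 0)
            return 0;               /* message delivered */
    }
    return -1;
}

/* Mock transport for demonstration only: fails the first
 * `fail_first` send attempts, then succeeds. */
typedef struct {
    int fail_first;
    int sends;
    int inits;
} mock_transport_t;

int mock_init(void *ctx)
{
    ((mock_transport_t *)ctx)->inits++;
    return 0;
}

int mock_send(void *ctx, const char *buf, size_t len)
{
    mock_transport_t *t = ctx;
    (void)buf;
    (void)len;
    t->sends++;
    return (t->sends <= t->fail_first) ? -1 : 0;
}
```

A caller would pass the transport's own state as ctx, for example generic_send(&t, "hi", 2, mock_init, mock_send, 3) for the mock above.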

Friday, December 28, 2007

gss-api and rsyslog v2

I initially sent this message only to the mailing list. But now I think it makes sense to reproduce it here. So there we go:

I am working on the modular structure of rsyslog v3. I am currently revisiting gss-api support. I notice that with the current omfwd, it will be extremely hard to separate gss-api support into its own module. Doing so will break backward compatibility to the configuration file.

GSS-API has been out only for a few days, and mostly over the holiday period. So it is much less of a concern if we now introduce some changes that will cause rsyslog.conf format modifications. That is much less trouble than doing so after we release v2, a release expected to be in wide use for at least half a year, if not much longer. A v2 released with the current syntax would require me to do some tricks in v3 to keep compatibility. Quite complex.

So I decided to create an omgssapi for v3 and extract the gss-api code from omfwd. It looks like this can be done without too much code duplication. There will be some duplicate code, but it will shrink as v3 continues to be developed. Once I have a good working version, which I expect very soon, I will backport that to the v1/2 source tree. I'll then do a new v1 release with a slightly incompatible gss-api config file syntax. After this has been out for a few days, I hope I can then finally push out that version as v2.

I hope this is a good decision. I think it will save us major future trouble at the expense of a relatively slight disturbance in the late v1 timeline. I guess most users won't even notice there is a change.

As always, feedback is appreciated.

Thursday, December 27, 2007

Things to do in rsyslog...

I have made good progress with rsyslog's input modules. As it looks, the basic things are done and the input module interface has been proven to be both quite stable as well as very simple. It doesn't yet support different instances, but I begin to think that I do not even need them - also not in the long term.

Of course, most of the current input modules are not clean modules. They have a lot of dependencies on other parts of the code, which cannot yet be dynamically loaded. But at least there is a foundation on which additional modules could be built. Getting the current input modules to be real clean modules will require further stage work. Many things need to be done.

So what to do next? It now comes down to a matter of both priorities and dependencies. I am writing this note here mostly for myself. It helps me clear up my thoughts and will also probably serve as a reference for quite a while. My thoughts may be hard to understand - sorry for that. But I thought I would make them public when I write them down - even if they are not really targeted toward others. I still hope they may help you get some more background info.

So what's to do:

- find a way to handle global settings
- multi-threaded output modules
a prerequisite for
- create queued outputs (write to queue if action fails and
restart when it resumed)
- re-write way config file is read
probably a prerequisite for:
- create expression support
- in templates
- in selector filters
- create interface for (loadable) user function modules
- create a system to allow loading "library" loadable modules
(e.g. network library for imudp, imtcp, ...)
- separate GSSAPI from plain TCP (requires libs and lib extension system)

xmas rsyslog work log

I've been a bit busy with rsyslog over the xmas break. Here are the changes:

2007-12-25
- moved some more net functionality out of syslogd.c - stage work
- fixed duplicate license text in syslogd.c - made it ambiguous
- moved udp net code - again, stage work
- moved some of the udp input code to its right place
2007-12-26
- moved cross-platform define for AI_NUMERICSERV to net.h
- made udp code somewhat less dependable on global variables -- stage work
- removed omfwd code dependency on "finet"
- removed imudp code dependency on "finet"
- removed active INET code from syslogd.c - still some auxiliary things
remain
- fixed socket leak in omfwd.c
- removed global variable LogPort
- removed global variable AcceptRemote and external def of bFinished

Monday, December 24, 2007

Seasons Greetings to Everyone

My best wishes to everyone! Let me share this lovely impression:

fog and cool temperatures create a christmas wonderland - seen 2007 in Germany

I thought this image conveys much of the beauty of our planet earth and the hopefully peaceful holiday season. I wish all of you great holidays, nice gifts and time with your beloved ones.

In 2007, we've come a long way, both from an Adiscon perspective, with lots of new product releases and great features, and from the rsyslog point of view. And, of course, there were private highlights as well, for example my unforgettable trip to view space shuttle Discovery's STS-120 launch. Thanks everyone for your support and all the kind words I received!

Once again, a great holiday season to all of you!

PS: if you enjoyed the image above, you may want to have a look at my xmas 2007 impressions gallery.

Friday, December 21, 2007

rsyslog work log for 2007-12-21

It has been a good day today! Finally, the alarm() call has been deleted! :) That was something long on my agenda, but I couldn't do it without the redesign of the inputs. The alarm() was not really a big issue, but it became an annoyance to me because it was so hard to remove.

I would also like to mention that I will do only occasional work during the holiday period. So I do not expect more serious changes until early January. Some releases, however, are due next week (maybe 2.0.0).

Here is the detailed rsyslog worklog for today:


- removed no longer needed mutex from omfwd
- released a preview of 3.0.0 "as is" to mailing list - just to get the idea
- begun work on imtcp
- created first version of imtcp (still very much depending on syslogd.c for
configuration and a lot of other things)
- cleaned up code (resulting in some shuffling from syslogd.c to the
"right" module)
- prepared for imudp
- created an initial version of imudp.c. The majority of UDP reception code
is now in that module and it is dynamically loadable. HOWEVER, that doesn't
mean it is a proper module. There are still many, many dependencies on
global variables, cross-module calls and such. However, having the code base
separated allows me to carry out some other cleanup before I return to
create a really clean implementation of these modules. So it is kind of a
stage work. Just don't mistake it with "the real thing"...
- removed code no longer needed
- finally, alarm() has gone away :) -- this is now done by the main thread
- some cleanup

rsyslog work log for 2007-12-20

Yesterday was a very busy day for rsyslog. I am on a good path to input modularization, but the hardest part still needs to be done ;)

Here is the log:

- bugfix: fixing memory leak when message queue is full and during
parsing. Thanks to varmojfekoj for the patch.
- working on a potential race condition on the new input module
interface. See newsgroup posting for details on the issue:
http://groups.google.com/group/comp.programming.threads/msg/330b9675f17a1ad6
I tried some mutex operations but came to the conclusion that this
does not really help. So I have now switched to plain thread
cancellation, which so far seems to be OK. Need more practical
experience with other input modules to make a final decision. Thus
I leave all code in and have just disabled the problematic
code.
- implemented $klogUseSyscallInterface config directive
- implemented $klogSymbolLookup config directive
- moved unix socket code to its own module (imuxsock)
- implemented $OmitLocalLogging config directive
- bugfix: memory leak in cfsysline.c/doGetWord() fixed
- implemented $SystemLogSocketName config directive
- implemented $AddUnixListenSocket config directive
- MILESTONE reached: imuxsock initial version done
- removed single-threading support for sending TCP messages; caused
simplification of output module interface as well as core syslog
processing.
- moved udp send code to its own function

Thursday, December 20, 2007

rsyslog work log...

Here is my recent rsyslog work log:

2007-12-18
- removed files from cvs that do not belong there (thanks to Michael Biebl for
pointing that out)
- restructured #include's somewhat thanks to Michael Biebl
- code cleanups thanks to Michael Biebl
- applied Michael Biebl's patch to enhance $includeconfig to support
wildcard filenames
2007-12-19
- applied some more cleanup provided by Michael Biebl
- applied enhanced gss-api functionality provided by varmojfekoj
- GSS-API support for syslog/TCP connections was added. Thanks to
varmojfekoj for providing the patch with this functionality
- release 1.21.0
- added the -c option
- enhanced -c option support (some basics)
- bugfix: llDestroy() left the list with invalid root/last pointers

Wednesday, December 19, 2007

modules, core functionality and rsyslog v3...

As I have written, I have begun to work on rsyslog v3 (and so far I am pleased to say that I have made quite good progress on it). One of the things with rsyslog v3 is that it will have an even more modular architecture, utilizing loadable modules for almost everything. I asked on the mailing list about backward compatibility and I received this very good response from Michael Biebl:

One thing I was wondering:

If you intend to shift all (even core) functionality into loadable modules, how do you handle things like --help or available command line options like -m?

Do you want to hardcode it or will you provide an interface, where rsyslog will query the module about its help message and available options.

I'm also still a bit uncertain if moving everything resp. core functionality to modules is a good idea (for problems you already mentioned). Imho having all core functionality in a single binary is simply much more robust and foolproof. For things like the SQL db output plugin, the module interface is great, because it avoids pulling in large library and package dependencies and allows installing them on an as-needed basis. For other functionality I still need to recognize the benefits.

Rainer, could you roughly sketch, how you envision to break rsyslog into loadable modules in v3. Which kind of functionality would be loadable as module, which functionality do you plan to keep in the rsyslogd binary. A listing of all (planned) modules + the provided functionality and requirements would really help.

Another thing: Say you move the regexp support into a separate module. If a regexp is then used in rsyslog.conf, will you bail out with an error, simply print a warning (which could go unnoticed, and the poor administrator doesn't know why his regexp doesn't work) or load modules on demand?

For the latter you'd need some kind of interface to query the *.so files for their supported functionality. I.e. the modules would export a list of config directives it supports and rsyslog could upon startup query each available module and create a map.

So, e.g. the ommysql module would export its support for the :ommysql: config directive. Whenever rsyslog finds such a config directive it could/would load the modules on demand.

Same could be done for the command line parameters. The imklog module would export, that it supports the -m command line parameter. Whenever that commandline parameter is used, rsyslog would know which module to load.

These are only rough ideas and there is certainly still much to consider. But what do you think about the basic idea?
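
To make the suggestion above a bit more concrete, here is a minimal sketch of such a directive-to-module map. It only illustrates the idea from the mail: module_export_t, find_module_for() and the file names ommysql.so and imklog.so are assumptions for this example and not an existing rsyslog interface. The two example entries (":ommysql:" and "-m") are the ones mentioned in the mail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of what one loadable module exports. */
typedef struct {
    const char *module;             /* file name of the module (assumed) */
    const char *const *directives;  /* NULL-terminated list of directives */
} module_export_t;

/* Example data, taken from the two cases named in the mail. */
static const char *const ommysql_dirs[] = { ":ommysql:", NULL };
static const char *const imklog_dirs[]  = { "-m", NULL };

static const module_export_t example_modules[] = {
    { "ommysql.so", ommysql_dirs },
    { "imklog.so",  imklog_dirs  },
};

/* Return the module that claims `directive`, or NULL if no module does.
 * A real implementation would build this table at startup by querying
 * each available module. */
const char *find_module_for(const module_export_t *mods, size_t n,
                            const char *directive)
{
    for (size_t i = 0; i < n; i++) {
        for (const char *const *d = mods[i].directives; *d != NULL; d++) {
            if (strcmp(*d, directive) == 0)
                return mods[i].module;
        }
    }
    return NULL;
}
```

With such a map, an unknown directive (a NULL result) could be turned into a hard error, a warning or an on-demand load, which is exactly the policy question raised in the mail.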

This is a great response - it not only asks questions but offers some good solutions, too. It comes at a perfect time, too, because there is much that is not yet finalized for v3. For sure I have (hopefully good ;)) ideas, but all of them need to be proven in practice. The issues that come up here are a good example.

So, now let me give a rough sketch of what I envision v3 will do. Note that it is what I envision *today* - it may change if I get good reasoning for change and/or smarter solutions.

First, let me introduce two blog posts which you may want to read before continuing here:

And, most importantly, this post already has the root reasoning for pushing things out of the syslogd core:
Let me highlight the two most important parts from that latter post:

This is exactly the way rsyslog is heading: we will try to provide an ultra-slim framework which offers just the basic things needed to orchestrate the plug-ins. Most of the functionality will indeed be available via plug-ins, dynamically loaded as needed.

... With that design philosophy, we can make rsyslog really universally available, even on low-powered devices (loading just a few plug-ins). At the high end, systems with a lot of plug-ins loaded will be able to handle the most demanding tasks.
And this is actually what the v3 effort is all about: rsyslog should become as modular as possible, with the least amount of code in the core linked binary and everything else provided via plugins. I still do not know exactly how that will happen, I am approaching it incrementally. I am now at the input plugins and trying to set them right.

In the longer term, there will be at least three different types of plugins: output, input and "filter". I think I do not need to elaborate on the first two. Filter plugins will work together with expressions, another feature to come. It will enhance the template and filter system to provide a rich expression capability supporting function calls. For example, a template may look like this in a future release:

$Template MyTemplate, substr(MSG, 5, 10) + "/" + tolower(FROMHOST) + "/"

and a filter condition may be

:expr:substr(MSG, 5, 10) == "error" /var/log/errorlog

Don't bash me for the config format shown above, that will also change ;)

Regexp functionality will then be provided by something like a regexp() function. Functions will be defined in loadable modules. Pretty much no function will be in the core. A module may contain multiple functions.
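
Here is a minimal sketch of how functions supplied by modules could be registered by name and looked up by the expression engine. Nothing of this exists yet: expr_register(), expr_lookup(), the expr_func_t signature and the fixed-size table are assumptions chosen to keep the example short. The tolower function is used as the sample because it appears in the template example above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNCS 32

/* Hypothetical signature: one string in, one string out. */
typedef void (*expr_func_t)(const char *in, char *out, size_t outlen);

static struct {
    const char *name;
    expr_func_t fn;
} registry[MAX_FUNCS];
static size_t registry_len = 0;

/* Called by a function module when it is loaded. Returns 0 on success,
 * -1 if the table is full. */
int expr_register(const char *name, expr_func_t fn)
{
    if (registry_len == MAX_FUNCS)
        return -1;
    registry[registry_len].name = name;
    registry[registry_len].fn = fn;
    registry_len++;
    return 0;
}

/* Called by the expression engine; NULL means "no module provides it". */
expr_func_t expr_lookup(const char *name)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].fn;
    }
    return NULL;
}

/* Sample function a module might provide: lowercase the input,
 * truncating if the output buffer is too small. */
void fn_tolower(const char *in, char *out, size_t outlen)
{
    size_t i = 0;
    if (outlen == 0)
        return;
    for (; in[i] != '\0' && i + 1 < outlen; i++)
        out[i] = (char)tolower((unsigned char)in[i]);
    out[i] = '\0';
}
```

A single module could call expr_register() several times, which matches the point that a module may contain multiple functions.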

Bottom line: almost everything will be a loadable module. If you do not load modules, rsyslog will not do anything useful.

Now a quick look at the command line options: I don't like them. Take -r, for example. Sure, it allows you to specify a listener port and also to convey that a listener should be started at all. But how about multiple instances? How about advanced configuration parameters? I think command line options are good for simple cases, but rsyslog will provide much more than can be done with simple cases. I favor replacing all command line options with configuration file directives. This is the right place for them to be. Except, of course, for such things as where to look for the master configuration file.

Which brings up backward compatibility. As you know, I begin to be puzzled about that. After all, rsyslog is meant to be a drop-in replacement for sysklogd. That means it should run with the same options as sysklogd - and should also enable administrators to build on their knowledge of sysklogd. Tough call.

Thankfully, sur5r introduced the idea of having a compatibility mode. He suggested looking at the absence of a rsyslog.conf file and then concluding that we need to run in that mode. That probably is a good suggestion that I will pick up. It can also be extended: how about, for example, a "-c" command line switch? If absent, it tells rsyslog to use compatibility mode. And it should be absent in previous versions as well as in sysklogd, because it was not defined there.

Now let's think. If we know we need to provide compatibility, we can load a plugin implementing compatibility settings (again, moving that out of the core functionality). Once loaded, it could analyze the rest of the command line and load whatever modules are necessary to make rsyslogd correctly interpret a post v3 configuration file. That way we have a somewhat larger than necessary memory footprint, but all works well.

Then back to native mode. Here, indeed, I'd expect that the user loads each and every module needed. I assume, however, that for any typical package the maintainer will probably load all "core" functionality (like write to file, user message, several inputs, common filter functions, ...) right there in the default rsyslog.conf. This makes sense for today's hardware. It also will make the config quite foolproof. A good way to implement that would build on the semantics of $IncludeConfig. How about:

$ModLoad /wherever/necessaryplugins/

which would load all plugins in that directory.
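
A minimal sketch of how such a directory-wide load could work, assuming a POSIX system and assuming that plugins are recognized by a ".so" suffix (the post does not say how they would be recognized). is_plugin_name() and for_each_plugin() are hypothetical helpers; the actual loading is left to a callback, which in practice would probably wrap dlopen().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` looks like a plugin file ("<something>.so"). */
int is_plugin_name(const char *name)
{
    size_t len = strlen(name);
    return len > 3 && strcmp(name + len - 3, ".so") == 0;
}

/* Call `load` for every plugin file in `dir`.
 * Returns the number of plugins for which `load` returned 0,
 * or -1 if the directory could not be opened. */
int for_each_plugin(const char *dir,
                    int (*load)(const char *path, void *ctx), void *ctx)
{
    char path[4096];
    struct dirent *e;
    int loaded = 0;
    DIR *d = opendir(dir);

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (!is_plugin_name(e->d_name))
            continue;               /* skip ".", "..", READMEs, ... */
        int n = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;               /* path too long, skip this entry */
        if (load(path, ctx) == 0)
            loaded++;
    }
    closedir(d);
    return loaded;
}
```

Note that readdir() returns entries in no particular order, so a real implementation would need to sort the names if load order matters.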

The key point, however, is that in a limited environment, the very same binaries can be used. No recompilation required. This would be scenarios with e.g. embedded devices - or security sensitive environments where only those components that are absolutely vital should run (which is good practice because it protects you from bugs in the not-loaded code).

I personally find it OK to handle the situation as described above. I don't like magic autoloading of modules.

This modular approach also has great advantages when it comes to maintaining the code and making sure it is as bugfree as possible. Modules tend to be small, and modules should be independent of each other. So testing and finding/fixing bugs that escaped testing should be considerably easier than with the v2 code base. There are also numerous other advantages, but I think that goes too far for this post...

Comments are appreciated. Especially if you do not like what I intend to do. Now is the time to speak up. In a few weeks from now, things have probably evolved too far to change some of the basics.

Tuesday, December 18, 2007

rsyslog changes for 2007-12-17

Yesterday's rsyslog changes:

2007-12-17
- fixed a potential race condition with enqueueMsg() - thanks to mildew
for making me aware of this issue
- created thread-class internal wrapper for calling user supplied thread
main function
- solved an issue when compiling immark.c on some platforms. LARGEFILE
preprocessor defines are changed in rsyslog.h, which causes grief
for zlib. As a temporary solution, I have moved rsyslog.h right at the
beginning of the include order. It's somewhat dirty, but it works. I think
the real solution will be inside the autoconf files.
- moved thread termination code out to threads.c
- implemented $MarkMessagePeriod config directive
- command $ResetConfigVariables implemented for immark.c
- begun imklog, replacing klogd.c (finally we get rid of it...)
- implemented $DebugPrintKernelSymbols
- implemented afterRun input module interface function
- implemented $klogSymbolsTwice config directive

As you can see, it was quite a busy day. The input module interface has already materialized for the most part.

Tracking file deletions on Windows

Have you ever wondered why an important file magically disappeared? My co-worker Andre has worked with a couple of folks, who didn't like that scenario. As a corporate policy, deletion to some important file locations must be logged. Andre has created a nice guide for use under Windows. He utilizes MonitorWare Agent's event log monitoring capabilities together with its advanced rule engine.

That guide is not only a good source of information if you need to implement Windows file deletion tracking. It also shows nicely what can be done with MonitorWare. It definitely helps understanding the full potential and how to make best use of it.

I suggest you have a quick look at Andre's guide to tracking file and directory deletions under Windows.

Monday, December 17, 2007

rsyslog changes up to 2007-12-14

This is my worklog for rsyslog:

2007-12-12
- begun to shuffle the mark code to a separate module - that will take some
time and definitely require much more code shuffling. This is the beginning
of the input module interface

2007-12-14
- created new branch for what will become 2.0.0 stable
- began work on immark, the first input module. In the long term
this will lead to a complete rewrite of the input system
- changed license to GPLv3 (for what is to become rsyslog v3)
- moved core threading helpers out of syslogd.c
- remove USE_PTHREADS macro from all sources except omfwd.c (I wait
for a gssapi patch from Red Hat, removing these macros would probably
cause unnecessary grief...)
- tried approach to terminate input module thread via pthread_kill() - so
far, seems to work ok
- begun to create input module interface and macros
- changed module interface to include function to query type
- milestone: can load input module dynamically, but can not do anything
with it - now I need to think about activating IMs...

Friday, December 14, 2007

begun working on rsyslog v3

I reproduce a note here that I sent out to the mailing list this morning. In the meantime, I have done most of the work in CVS.

As you know, I am looking at the way threading is supposed to work in future releases and, most importantly, looking at the inputs (like mark message generation).

Around summer, I wrote that I will probably need to release new major versions when we go into multithreading redesign. It looks like we have reached this stage. I tried to keep a single code base that still supports both single- and multi-threaded operations. I have looked into this the past days and I need to say that it creates a lot of complexity and hard to understand code.

For this reason, I think it is finally time to branch the code base and release some new versions.

Soon, I will create a branch for the current 1.20.1 code base. That will only receive bug fixes, but no new development (except, I guess, GSSAPI, which is about to be contributed by Red Hat). When we are confident the last changes worked well and introduced no new bugs, there will be a version 2.0.0 stable release based on that code base.

CVS head, however, will then be rsyslog version 3. It will receive the new input module interface. It requires pthreads, because there is no way input modules and many more of the new desired features can be implemented without them. Consequently, I will remove all single-threading code from it, resulting in an easier to understand code base. Please note that I expect this code to change dramatically when it is being modified to be more modular (much like it was when I introduced modular outputs in summer). Please note that I will apply any non-bugfix patches to this code base, only.

I have a somewhat bad feeling about going ahead with implementing a more sophisticated and more parallel multi-threading while we still have an issue with the segfault. However, I think by now we did everything imaginable to capture that rare bug. I have come to the conclusion that the best chance to find it is to go ahead and implement the more sophisticated design. That will lead to a review, and rewrite, of much of the code in question, uncovering things we didn't think about before. The recently discovered race condition is an excellent example.

One thing about the license: rsyslog 2 will stay with the "GPL v2 and above" license, but rsyslog v3 will be licensed under "GPL v3 and above". I already wrote about that change. It is my firm belief that GPL v3 brings benefit to our freedom to use digital goods. I am a strong opponent of digital restrictions management (DRM) and software patents and I do not like the idea that rsyslog benefits anyone who encourages these things. I hope for your understanding.

I will set stage now for these changes and will do a web announcement soon. Please don't be surprised that rsyslog v3 will be available before v2, you now know the reason.

Wednesday, December 12, 2007

rsyslog changes up to 2007-12-12

It looks like I have become too lazy in reporting my changes. I'll try to be quicker again in the future. Here is the part of the work log that is missing. Please note that it does not always mention my hard thinking about the new threading model ;)

2007-12-07
- applied patch from Michael Biebl to finally fix the -ldl cross-platform
issue
- fixed some type conversion warnings that appeared on 64 bit machines - these were in
debug statements, so indicated no real problem
- some code cleanup
- released 1.20.0 (finally ;))

2007-12-11
- When a hostname ACL was provided and DNS resolution for that name failed,
ACL processing was stopped at that point. Thanks to mildew for the patch.
Fedora Bugzilla: http://bugzilla.redhat.com/show_bug.cgi?id=395911
- fixed a small memory leak that happened when PostgreSQL date formatting
was used
- corrected a debug setting that survived release. Caused TCP connections
to be retried unnecessarily often.
- added expr.c, which has some thoughts on expression implementation
- fixed a potential race condition, see link for details:
http://rgerhards.blogspot.com/2007/12/rsyslog-race-condition.html
- added synchronization class to handle mutex-operations in the most
portable way.

2007-12-12
- handled selector flushing on termination (and hup) correctly. Could lose
some information before.
- done some more hard thinking on the threading model for upcoming
enhancements
- released 1.20.1

Tuesday, December 11, 2007

rsyslog race condition fixed

There is a race condition when running rsyslog 1.20.0 or any previous release in multi-threaded mode. The probability for it to happen is very remote, but there definitely is a design flaw in it. Quick communication revealed, unfortunately, that this flaw cannot be responsible for the hard to track segfault bug. The segfault occurs in a situation that does not match what I have found out. I discovered this problem when I worked on the multi-threading redesign and focussed on input modules. Maybe my decision to hold off the redesign until after the segfault bug has been found was wrong. Redesign forces me to look at more places from a very different angle and that may reveal even more (at least I hope so).

Now here is the story on the "mark" race condition I discovered today:

fprintlog() duplicates the message object when we have a "last message repeated n times". It does this by saving the pointer to the message object in a temporary buffer, carrying out its work and then checking whether it needs to restore the saved pointer. This works well in single-threaded mode as well as in almost all cases when running multi-threaded. However, if we reach a mark interval, domark() calls fprintlog(), potentially concurrently with a call that is already in progress. What can happen is:
  1. domark() calls fprintlog() on an action
  2. fprintlog() begins execution and saves the previous message to the buffer
  3. fprintlog() is preempted
  4. the worker thread now calls into fprintlog() with the exact same message
  5. fprintlog() processes the message and finishes (deletes it)
  6. now processing of that thread ends and our first thread is resumed
  7. fprintlog() performs its actions and restores the already freed message object pointer
Now, the action holds an invalid pointer. Depending on what happens next, it either aborts (if the buffer has been overwritten) or continues to function but does a double-free.

The root cause is that access to the action object is not synchronized. This was deemed unnecessary, because it was assumed that no concurrent write operations could be in progress. The domark() processing, however, had been overlooked.
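
To illustrate the kind of synchronization that is missing, here is a minimal sketch using POSIX threads. It is not the rsyslog code: action_t, action_process() and action_flush_repeats() are simplified, hypothetical stand-ins, and the message is copied into a buffer instead of being handled via a pointer. The point is only that the worker thread and the mark/timer thread touch the "previous message" state of an action exclusively while holding the action's lock.

```c
#include <pthread.h>
#include <string.h>

#define MSG_MAX 256

/* Hypothetical, simplified action object. */
typedef struct {
    pthread_mutex_t lock;       /* guards the two fields below */
    char prev_msg[MSG_MAX];     /* last message seen by this action */
    int  prev_count;            /* how often it has been repeated */
} action_t;

void action_init(action_t *a)
{
    pthread_mutex_init(&a->lock, NULL);
    a->prev_msg[0] = '\0';
    a->prev_count = 0;
}

/* Worker thread: handle one message. Returns the number of repeats
 * currently being held back for this action. */
int action_process(action_t *a, const char *msg)
{
    int repeats;

    pthread_mutex_lock(&a->lock);
    if (strcmp(a->prev_msg, msg) == 0) {
        a->prev_count++;                    /* same as before: suppress */
    } else {
        strncpy(a->prev_msg, msg, MSG_MAX - 1);
        a->prev_msg[MSG_MAX - 1] = '\0';
        a->prev_count = 0;                  /* new message: start over */
    }
    repeats = a->prev_count;
    pthread_mutex_unlock(&a->lock);
    return repeats;
}

/* Mark/timer thread: report and reset the repeat count. Because it
 * takes the same lock, it can no longer interleave with the worker. */
int action_flush_repeats(action_t *a)
{
    int flushed;

    pthread_mutex_lock(&a->lock);
    flushed = a->prev_count;
    a->prev_count = 0;
    pthread_mutex_unlock(&a->lock);
    return flushed;
}
```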

This analysis is still preliminary, but points in a good direction. It needs to be said, though, that the probability of this failure scenario is remote. I have confirmed this is a race condition.

If you think about it, the mark() processing as a whole does not make much sense if we have a full queue. It is awfully flawed. I don't like mark(): in the original sysklogd code, there was a similar issue: mark() was called by an alarm() timer and executed the full syslogd code during its processing. That led to serious problems if some other message was being processed at the time. I solved that issue by setting just a flag in the alarm() handler. Then, actual mark() processing was started in the mainloop(). For single-threaded mode that works, because no other thread can be in the action processing at that time.
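
The "just set a flag in the handler" approach is a common pattern; a minimal sketch, assuming POSIX signals, looks like this. The names mark_pending, install_mark_handler() and mark_due() are made up for this example and are not the identifiers used in rsyslog.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set by the signal handler, consumed by the main loop. */
static volatile sig_atomic_t mark_pending = 0;

/* The handler does nothing but set the flag, which is safe to do
 * from signal context. No syslogd code runs here. */
static void alarm_handler(int signo)
{
    (void)signo;
    mark_pending = 1;
}

/* Install the SIGALRM handler. Returns 0 on success, -1 on error. */
int install_mark_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = alarm_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Called from the main loop. Returns 1 if a mark is due and clears
 * the flag, 0 otherwise. If two alarms fire before the main loop
 * gets here, they collapse into a single mark. */
int mark_due(void)
{
    if (mark_pending) {
        mark_pending = 0;
        return 1;
    }
    return 0;
}
```

The main loop would call mark_due() on each iteration and do the real mark processing itself, outside of signal context.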

In multi-threaded mode, however, the mainloop() runs on a thread different from the worker thread. So in fact, domark() can once again conflict with action processing. And if the queue is full, it does totally wrong things, because it uses whatever message is currently being processed as the basis for emitting mark messages. This is seriously flawed! The root cause is that mark() processing does not go through the producer/consumer queue/gateway. This is what I now need to fix.

What mark() does is first to inject the "--mark--" message. That is no problem, because it is injected via the regular producer interface logmsgInternal(). But then, it calls domarkActions() on each action which in turn calls fprintlog(). It also accesses the messages then-current f_prevcount, which, with a full queue, has nothing to do with the last element being seen at that time.

The more I look at the code, the more I wonder what exact useful feature it is. I just checked the current sysklogd source, and, yes, it still is there (even the call to domark() from an alarm() handler is still there...). Interestingly, in both sysklogd and rsyslogd the "last message repeated n times" periodic display is turned off when mark messages are turned off. Is this intentional? I doubt so...

So what did the original sysklogd author think when he wrote that code? I guess he wanted to have an indication that messages had been dropped - and this not only when the next message with different text arrived, but after a set period (30 seconds with the current defines both in rsyslog and sysklogd). So message compression should indicate at least every 30 seconds that messages arrived, but were compressed. OK, that gives me something to think about.

Obviously, there is no point in emitting the "last message repeated n times" message if we have, let's say, 100 identical messages sitting in the queue followed by at least one non-identical message. In that case, the queue will be processed as fast as possible and upon arrival at the non-identical message, the "repeat message" will be issued. No need to say anything in between. If, however, there is no such non-identical message, rsyslogd is left in a somewhat open state. The queue is depleted but still no output is written (but "n" messages have not been displayed). Only in this scenario is it appropriate to start a timeout timer that will ultimately lead to the "repeated message" when no further non-identical message arrives in the allocated time window.

To be more precise, it is not actually a question of messages being in the queue. As we have sophisticated filtering with rsyslog, the question actually is whether a message was processed (e.g. written to file) by an action within the "repeated message" time window (RMTW). When the last message was processed can be quite different from action to action.

One way to track this is to record when each action was last called successfully. If the queue is reasonably full and the action is supplied with data to process at a steady rate, that time should never fall outside of the RMTW. And if it does, isn't that an indication that it is time to write a "repeated message" out, so that the operator is seeing at least one indication in every RMTW? Of course it is! So we can store this timer with each action and use it as a base check.

Now, we can periodically wake up and check each action object: did it last process something outside of its RMTW AND has it received any repeated messages? If so, it is time to emit the "repeated message" message. The subtle difference from the existing code is that we use the newly constructed timer here. Also, the action object must be locked, so that this background process and the worker thread(s) will not access the same vital data structures at the same time. The scenario from the top of this post would otherwise still apply. Finally, this processing should be de-coupled from the mark processing, because IMHO these are two totally separate things.
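To make the idea a bit more concrete, here is a minimal sketch in C of such a per-action check. It is an illustration only: the action_t structure, its field names and the three helper functions are hypothetical and not taken from the rsyslog sources. The only assumption is that each action carries its own mutex, a "last processed" timestamp and a counter of suppressed identical messages.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical, simplified action object; the names are illustrative
 * and do not come from the rsyslog code base. */
typedef struct action {
    pthread_mutex_t lock;   /* guards all fields below */
    time_t last_exec;       /* when the action last processed a message */
    int repeat_count;       /* identical messages suppressed so far */
    int rmtw_secs;          /* "repeated message" time window in seconds */
} action_t;

void action_init(action_t *a, int rmtw_secs, time_t now)
{
    assert(a != NULL);
    pthread_mutex_init(&a->lock, NULL);
    a->last_exec = now;
    a->repeat_count = 0;
    a->rmtw_secs = rmtw_secs;
}

/* Worker side: an identical message arrived and was suppressed. */
void action_note_repeat(action_t *a)
{
    pthread_mutex_lock(&a->lock);
    a->repeat_count++;
    pthread_mutex_unlock(&a->lock);
}

/* Worker side: the action actually wrote something out. */
void action_note_exec(action_t *a, time_t now)
{
    pthread_mutex_lock(&a->lock);
    a->last_exec = now;
    a->repeat_count = 0;
    pthread_mutex_unlock(&a->lock);
}

/* Background side: returns the number of suppressed messages that should
 * now be reported, or 0 if nothing needs to be emitted yet. */
int action_check_repeats(action_t *a, time_t now)
{
    int n = 0;

    pthread_mutex_lock(&a->lock);
    if (a->repeat_count > 0 && difftime(now, a->last_exec) > a->rmtw_secs) {
        n = a->repeat_count;    /* caller emits "repeated n times" */
        a->repeat_count = 0;
        a->last_exec = now;     /* start a fresh time window */
    }
    pthread_mutex_unlock(&a->lock);
    return n;
}
```

The point of the sketch is that the background check and the worker thread(s) only ever touch the counters while holding the action's lock, and that the check is driven by the action's own timestamp rather than by mark processing.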

I will now go ahead and apply these changes and then we shall see where this brings us.

I have now done a first implementation. Actually, the code does not look that much different compared to before the change. The difference is that I have handled the timestamp thing a bit more transparently and, the biggie, I have used synchronization macros to guard the action object against the race condition. This code will become part of 1.20.1.

A design problem...

Folks, I am facing a design problem - and it looks so simple that I am pulling out all my hair ;)

I am currently preparing the next steps in the modular rsyslog redesign. I am not sure yet what to do first; there are a couple of candidates. One is to add a real expression capability, another one is to add threaded inputs (which would be quite useful). In support of these enhancements, a number of things need to be changed in the current code. Remember, we are still running on large parts of the original sysklogd code, which was never meant to do all these advanced things (plus, it is quite old and shows its age). A cleanup of the core, however, requires some knowledge of what shall be done with it in the future.

My trouble is about a small detail. A detail I thought should be easy to solve with a little bit of web search or a forum post or two. But... not only did I not find the relevant information, I did not even find an appropriate place to post. Maybe I am too dumb (well possible).

OK, enough said. Now what is the problem? I don't know how to terminate a long-running "socket call" in a proper way under *nix. Remember, I have done most of my multithreading programming in the past ten years or so under Windows.

What I want to do: Rsyslog will support loadable input modules in the future. In essence, an input module is something that gets data from a data source (e.g. syslog/udp, syslog/tcp, kernel log, text file, whatever ...), parses it and constructs a message object out of it and injects that message object into the processing queue. Each input module will run on its own thread. Fairly easy and well-understood. The problem happens when it comes to termination (or config reload). At that instant, I need to stop all of these input module threads in a graceful way. The problem is that they are probably still in a long-lasting read call. So how to handle this right?

Under Windows, I have the WSACancelBlockingCall() API. Whenever I call that method, all threads magically wake up and their read and write calls return an error state. Sweet. I know that I can use signal() under Linux to do much of the same. However, from what I read on the web I have the impression that this is not the right thing to do. First of all it seems to interfere with the pthreads library in a somewhat unexpected way and secondly there is only a very limited set of signals available ... and none left for me?

The next approach would be to have each blocking call timeout after a relatively short period, e.g. 10 seconds. But that feels even worse to me. Performance wise, it looks bad. Design-wise it looks just plain ugly, much like a work-around. It looks like I needed to do something not knowing what the right thing is (which, as it turns out, is the right description at the time being ;)).
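For readers who have not run into this problem, here is a minimal sketch in C of one pattern that is commonly used on *nix for it: the input thread does not block in read() directly, but in poll() on both its data descriptor and the read end of a shared "wake-up" pipe. This is an illustration under stated assumptions, not what rsyslog does or will do. The function names wait_for_input() and request_termination() are made up for this example; only poll(), pipe(), read() and write() are real POSIX calls.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: block until data_fd has something to read or the
 * wake-up pipe signals termination.
 * Returns 1 if data_fd should be read, 0 if the thread should terminate,
 * -1 on a poll() error. */
int wait_for_input(int data_fd, int wake_fd)
{
    struct pollfd fds[2];

    fds[0].fd = data_fd;
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = wake_fd;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    for (;;) {
        int rc = poll(fds, 2, -1);      /* -1: no timeout at all */
        if (rc < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal, retry */
            return -1;
        }
        if (fds[1].revents != 0)
            return 0;                   /* termination request wins */
        if (fds[0].revents != 0)
            return 1;                   /* data (or an error) on data_fd */
    }
}

/* Hypothetical helper, called once by the main thread at shutdown. The byte
 * is never read, so every thread polling the read end keeps seeing it. */
int request_termination(int wake_write_fd)
{
    char c = 'x';

    return write(wake_write_fd, &c, 1) == 1 ? 0 : -1;
}
```

The design choice here is that no signals and no short timeouts are involved: threads sleep indefinitely and are woken by ordinary descriptor activity. Whether this fits all input types (a kernel log or a text file, for example) is a separate question that the sketch does not answer.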

To make matters worse, I have a similar problem not only with the read and write calls but with other constructs as well. For example, I'd like to have a couple (well, one to get started) of background threads that handle periodic activity (mark messages immediately come to my mind). Again, I would need a way to awake them when it comes to termination time - immediately.

And, of course, I would prefer to have one mechanism to awake any sleeping thread. Granted, can't do that under Windows either, so I may need to use different constructs, here, too.

This is the current state of affairs. There is still enough work to do before the question MUST be answered in order to proceed. But that point in time approaches quickly. I would deeply appreciate any help on this issue. Be it advice on how to actually design that part of the code - or be it advice on where to ask for a solution! Really - a big problem is that I did not find an appropriate place to ask. Either the forum is not deeply technical enough, or there are some mailing lists where the topic is on something really different. If you know where to ask - please tell me!

[update] In the meantime, I have found a place to ask. Believe it or not, I had forgotten to check for a dedicated newsgroup. And, of course, there is one ;) The discussion there is quite fruitful.

Wednesday, December 05, 2007

going to view another shuttle launch...

... but this time not to Florida but to the Columbus Control Center located in Munich, Germany. The details are in my space blog for those that are curious ;) The bottom line is that I am away from my development machine from tomorrow afternoon until Friday at around noon.

Please keep your fingers crossed so that I can experience an on-time launch. This time, I can not stay any longer...

update on rsyslog changes

I have been a bit lazy in posting my recent rsyslog changes. This is because there was a lot of fine-tuning as well as design work. I also fear that I probably have forgotten to log a few less important changes. Nevertheless, the log should still provide you with a good idea of what's going on. If I do not get any more last minute nits, 1.20.0 can hopefully be released either tomorrow or on Friday.

Here are the changes now:

2007-11-27
- got new dev environment on Fedora 8 x64 up and running
- reviewed calls to MsgDestruct() - re-confirmed that there are no
double frees.
- moved syslog PRI code tables to msg.c as this is the only place where
they are still needed

2007-12-03
- added sur5r's postgres module - many thanks for providing it!
- added Bartosz Kuzma patches for better compatibility with NetBSD - many
thanks for them!
- disabled GSS-API in configure.ac, as support for it is not yet complete
(Peter Vrabec told me to hold it until a further patch)
2007-12-04
- released 1.20.0 preview (via mailing list only, waiting on feedback)
2007-12-05
- applied patch from Bartosz Kuzma and Michael Biebl to fix compilation
problem on NetBSD (it is missing the -ldl lib, even though it provides
dlopen()).

Monday, December 03, 2007

rsyslog now does postgres SQL!

I have just finished the integration of a new postgres sql output module for rsyslog. The code was provided by sur5r. I am more than glad that rsyslog now supports the popular postgres sql engine. I am sure this is a very nice addition for the community.

Sur5r's ably written code also provided some feedback about the output module interface in general. The good news is that the amount of code needed in the actual output module was as low as initially thought.

But it also proved that the module design still is less than perfect: there was a new date format needed to support postgres sql. This could not be done in the output module. Instead, sur5r had to modify the core files. He did this in an excellent way, but the point is that it is not a good thing this was needed at all. It is a result of the old, non-modular syslogd design which we inherited from sysklogd and have not yet changed. In fact, we are at the very beginning of modularization. The next thing on the list is a loadable function interface. With that, such modifications will no longer be necessary; a custom (loadable) function can then be provided.

This enhancement requires a number of changes in the core, most importantly in the template subsystem. This is most probably the next area of work I am heading to.

But for the time being, let's celebrate that rsyslog now has postgres sql support! It will be released in 1.20.0, hopefully around the end of the week. A preview will be announced on the mailing list somewhat earlier. And if you can't wait, simply pull it out of anonymous CVS - it's in there as of this writing.

rsyslog now part of Fedora 8

Oops... I think I forgot to mention that rsyslog was recently included in Fedora 8. This is an important milestone for the rsyslog project, because it is the first time ever that it appears as the default syslogd in a widespread distribution.

Thankfully, a lot of other package maintainers have been found for other distros (I really appreciate your hard work, guys!), but everywhere else rsyslog is "just" an extra. In Fedora 8, it is the default logging daemon. I'd also like to add that Red Hat folks contributed quite a lot to rsyslog, including great testing.

Thanks to everyone who helped make rsyslog this capable. I think its appearance in Fedora 8 is rewarding for all of us!

rsyslog 1.19.12 released

I just released rsyslog 1.19.12. For the minor change I made in 1.19.11, I created a branch based on 1.19.10. I thought .11 would be the last release in that branch. Obviously, I was wrong. Working on two branches is definitely a lot of not really useful extra work. I'll try to get things together this week, and then again work with only one branch. That will be the feature-enhanced 1.20.0 release.