Tuesday, July 31, 2007

pace of changes in rsyslog

I have achieved much the past days with rsyslog. The output module interface is more or less ready. Of course, a few things need to be cleaned up, but all in all it looks good. Maybe a day or two for the cleanup. The interface is not yet perfect. Especially the way actons are configured should be changed (to make it more secure). However, none of this is really pressing. The initial goal was to get a better to maintain code base and prepare it for future enhancement. Both of this is basically finished.

So wouldn't it be natural to go ahead and do all the nice other things? Well, I think not really. There has been much change. It doesn't hurt to let the code base mature a little before further large changes are applied. Especially when we think that it will ship as part of Fedora 8. And I have to admit I need to look at some other work from time to time ;)

With all that, my plan is to refrain from large changes in August. Of course, I'll probably can't withstand doing some improvement here and there. But I do not plan to do the loadable plugins, advanced threading or input module interface for the time being. All of that will happen starting September. By then, I think I have a very good starting point for that work. And now I hope that rsyslog get's a lot of testing (as already has happened the past weeks). I think that what we currently have will more or less 2.0.0. Let's see how far features creep in besides my good resolutions ;)

rsyslog changes on 2007-07-31

Today was a quite good day. I had time to work the full day on rsyslog and could go for the complex things. To do them, I also had to add some more plumbing, which hopefully will also enable me to do other things much quicker in the future. For example, I have added a full-fledged, keyed linked list class today. I think that this class will most probably serve me again on several occasions. It is very generic and should be applicable to many more code pathes where linked lists are needed. I will not migrate the other linked lists right away, but in the long term this will most probably happen (and resulting in smaller code size for the same functionality).

I have also improved internal error handling and support macros. I have finally arrived at a stage where there is a well defined way (based on srRetVal data type) by which all functions communicate between each other. This includes something like finalizers. It works pretty neat. Of course, lot's of code is still to be changed. But a lot is already done. I think this system also greatly improves code reliability, as all error states are consistently handled and reported to upper layers. I already see benefit from that in practice. For example, there was the problem that up until now config line parsing errors did not report line numbers. This was quite hard to obtain with the old code. Now, the high-level handler gets a neat error state and if it is an error, the high-level function spits out the line number - mission accomplished. Oh, what a joy when there is a well-designed and solid code base ;) I guess I will enjoy similar benefits more often in the future.

Here is the work log for today:

  • moved doGetGID() to cfsysline.c
  • moved umask & file/dir creation mode parsing to cfsysline.c
  • added macro for easy and consistent check of iRet return value
  • moved the SetCCEscapeCharacter config file directive to cfsysline.c and also generalized it for further use while doing so
  • added doGetInt() to cfsysline.c and adapted dynaFileChaceSize handler to use it
  • added macro to consistently define iRet
  • added macro to abort a function and go to finalizer
  • added output of config file line number when a parsing error occured
  • moved code to open config file into separate function processConfFile()
  • fixed bug in objomsr.c that caused program to abort in debug mode with an invalid assertion (in some cases)
  • moved debug printf code out of init() into its own function
  • added doCustomHdlr() to cfsysline.c - this completes implementing functions for canned handlers.
  • added a generic linked list object (files linkedlist.h/c)
  • fixed a typo that caused the default template for MySQL to be wrong. thanks to mildew for catching this.
  • prepared cfsysline.c for integration into output modules
  • changed modInit() interface to contain pointer to host-function query method
  • added interface to register a cfsysline command handler (basic functionality)
  • got the basic code in place to create an in-memory list of cfsysline handlers (omfile.c used as testing case) -- not yet in active code
  • omfile.c now uses the new table-driven cfsysline system
  • added configuration file command $DebugPrintModuleList and $DebugPrintCfSysLineHandlerList
  • all cfsysline directives now use new table-driven cfsysline system except for $ResetConfigVariables
  • $ResetConfigVariables now also works via the table-driven sytem. However, I need to fix an issue with loading default settings when syslogd is started or HUPed

Monday, July 30, 2007

a restartable interface for syslog actions...

These days, I had a quite interesting email discussion. I am reproducing it here anonymized, as I think it is probably useful to get the "big picture" of where rsyslog is heading to.

I got the following request

Btw, I am trying to understand if it is possible to create this logic:

  • specify 2 mysql servers (same schema)
  • when rsyslog detects a writing failure on primary mysql server (let's
    say after the 1st retry)
  • start logging on the secondary server.
  • rollback to primary server will be manual (sighup to rsyslogd or
    something like this)

My reply was:

thanks for the suggestion. It actually is kind of on the todo list. Later this year, rsyslog will get a restartable queue interface. That is, when MySQL (or whatever else) goes down, messages are spooled to disk. When it comes back up, the spooled messages are written. All of this will happen in sequence.

I am currently doing a big code restructuring, and one of the reasons for it is the restartable interface ;) It will be a very powerful and generic solution, but thus it will take some time. I anticipate some time around fall.

And after some further conversation, I wrote this:

Before a fully restartable interface, I'll add a capability to work with backup actions. That is something like this


*.* >database-writer
&[onfail] >backup-database-writer


The rule with [onfail] in it would trigger only if the database-writer fails. I have a similar request for tcp forwarding and I think I can elegantly integrate it into the action processor, once the output module interface is fully implemented (which, unfortunately, it not yet is ;)).


This conversation quite good communicates the upcoming ideas and its use-cases. I personally think that around September I can begin to implement it. So the current order of events probably is (in that order)

  • finish output module interface
  • implement multiple actions per selector line (actually also improves performance for multiple actions with the same filter condition - this was the initial reason to design it)
  • implement the failover mode for actions
  • implement the queued interface


Along these actions, we need to implement automatic suspension/re-enabling of actions. I am not yet sure when this happens - it probably depends on when it is needed. Another good point of this functionality would be even more simplified output module code. This also serves as motivation. Another thing that will happen some time is the loadable plug-in interface - but that will probably be quite easy once the output module interface is finished.

rsyslog changes upto 2007-07-30

I've not had as much time as I hope to have for rsyslog today. However, I also did a bit coding over the weekend. My work log is as follows:

2007-07-28
- fixed bug in freeSelectors()/stopWorker()
- added some module interface doc
2007-07-30
- released 1.17.5
- added cfsysline objects - initial set of functions
- fixed bug in OMSRcreate() - always returned SR_RET_OK
- fixed a bug that caused ommysql to always complain about missing
templates
- fixed a mem leak in OMSRdestruct - freeing the object itself was
forgotten - thanks to varmojfekoj for the patch
- fixed a memory leak in syslogd/init() that happend when the config
file could not be read - thanks to varmojfekoj for the patch
- moved skipWhiteSpace() to srUtils.c, where I think it fits better
- moved doBinaryOption() and doGetGUID() to cfsysline.c
- fixed insufficient memory allocation in addAction() and its helpers.
The initial fix and idea was developed by mildew, I fine-tuned
it a bit. Thanks a lot for the fix, I'd probably had pulled out my
hair to find the bug...

As you can see, I started on the final piece of output modules, that is handling of $-config lines (called cfsyslines). This time, I have a different, bottom-up approach. I now move the code first to the new object and then implement the object in all its glory. That costs me a bit more time for interim code that I will quickly discard, but it safes me the headache of coding hours and hours without the ability to test what I am doing (that was a big problem last friday). As I was interrupted often today, this approach proved indeed very valuable. It even allowed me to include mildew's great patch this afternoon AND immediately release it to the anon cvs. As a side-note, this approach is also the reason why there is code in cfsysline.c that so far is never executed - it is the plumbing that I will activate when I have moved all the utility functions. hopefully, that'll be tomorrow.

Saturday, July 28, 2007

NASA Kids

I am always on the search for English learning resources for kids. I guess I found another one. Me and my son are space addicts. The NASA kids homepage seems to be a great fit. It provides fun and space facts while at the same time offering some *real* motivation to dig into English.

I am still on the search for some weekly or monthly newsletters especially written for kids. And of course, I would find those great that are specifically written for non-native speakers. If you come around any such thing, please drop me a note. The topic doesn't necessarily be space or astronomy. Wildlife, sealife and anything else would also be great. I am looking forward to all the great suggestions ;) [well, honestly I think receiving even one would be great...].

on the syslogd -h option

While I work on rsylsog modularization, I also re-visit a lot of code. Please remember that rsyslog is rooted in the sysklogd package (and we always tried to keep it quite compatible with it). When I finished moving out references to the selector_t (struct filed) entries in the modules, I came across a place in the forwarding driver where the message element is accessed. You can look up that code in cvs (omfwd.c, line 597 and below).

This code implements the -h option, which stops forwarding messages when they did not originate from the local host. The intention of that option probably is to avoid a death spiral, which could be caused by two systems sending syslog messages back and forth (this scenario is actually even covered in RFC 3164, so it seems to happen from time to time...).

However, the code in sysklogd relies on hostnames to prevent that behaviour. If the hostname is different from the current hostname, then we have a remotely received message. I question if that check is always reliable (besides, it is not working right at the moment ;)). If that functionality is actually needed, it would be way better to check the messages target IP address against the local addresses (probably a lot of work, but definitely doable).

The question is, if such a feature is actually needed - and if it is needed in the output driver. To me, it sounds like a natural filter condition ("selector does not apply if host == non-localhost"). If that feature is required, it would probably be best suited to build it into filtering than into a (single) output module.

But again, the question is: do we really need to provide this functionality? Or is it an artifact long gone away?

Feedback is appreciated (you may also use the rsyslog forum, if you like).

rsyslog progress on 2007-07-27

I made big progress, even though the work log seems not to indicate it. The issues I worked on were quite complex. And, most frustratingly, there were no simple way to even compile rsyslog until the change was completed. So I hacked for about 6 hours without any feedback on the effect. Of course, after the first compile things were really bad. But over time, I managed to fix the bugs. Now I am quite happy with the result. The output module interface really begins to materialize. The next big thing is handling of configuration system line directives ($-lines). Stay tuned...

The work log for Friday:
- released 1.17.4
- added omsr object (objomsr.c, objomsr.h) - template request for output
modules
- changed doAction() interface
- templates and output string generation for doActon() is now fully
- removed selector_t f references from output modules
- MILESTONE reached: no more access to selector_t from any module, at
least at this layer we communicate via clean interfaces. However,there
remains the topic of global variable access and calling to functions
housed somewhere else (e.g. in syslogd.c). A new code review is now due,
many changes happened, many TODO's added.

Friday, July 27, 2007

yesterday's rsyslog changes

During the large rsyslog modularization effort, I take a more detailed audit log of what I am doing. I hope that this log will allow others to both follow the progress as well as help to understand what I am doing. I was not sure (and I still am not) where to post that log. I've now decided to post it to my blog, because it doesn't look really suitable for the "offcial" rsyslog site.

Please note that the work audit contains more detail than the ChangeLog. This is intentional. The ChangeLog shall provide the average user with an idea of what's now. My audit here provides a finer-grained information for those that are really interested in it.

Here come yesterday's changes. They are listed in the same order I have done them.

- applied patch from mildew to avoid zombies
- applied patch from Michel Samia to fix compilation when NOT
compiled for pthreads
- implemented onSelectReadyWrite() interface
- MILESTONE reached: no more access to f->f_un in syslogd.c
- shuffled code from tcpsyslog.c to omfwd.c. It looks like it belongs more
to that file. But we need to look at it some time later. The move was
absolutely necessary so that no access to f->f_un happened in
tcpsyslog.c (which was evil)
- MILESTONE reached: no more access to f->f_un from non-output modules
- changed doAction() interface to include module data pointer
- removed references to f_un from omusrmsg.c
- changed module template for parseSelectorAct() [code reduction,
consitency]
- removed references to f_un from ommysql.c
- removed references to f_un from omfwd.c
- removed references to f_un from omshell.c
- removed references to f_un from omfile.c
- MILESTONE reached: f->f_un has gone away!
- removed f_type from omshell.c, omdiscard.c, omusrmsg.c, ommysql.c
- removed f_type from syslogd.c/cflineParseFileName()
- fixed bug in omfile.c which could lead to invalid addressing if "-" was
given to not sync file
- removed f_type from omfile.c
- implemented needUDPSocket() interface
- replaced (mis) use of f_prevcount in omfwd.c -> now data element in
instance data is used for retry counting
- removed f->f_type from syslogd.c, omfwd.c
- removed f->f_file from omfwd.c, omfile.c
- f->f_flags is gone away
- changed doAction() interface to contain the full message string
- f_iov and its handling has been removed
- added IDs to selector_t

If you are interested in even more details, you can go to the rsyslog cvs and see the changes on a file-by-file basis.

As the day closed, I identified a problem with the current interface definition: Modules need to access the template pointer in selector_t. They may even need to have multiple templates (e.g. dynaFiles, a hypothetical email action [subject and message text]). I need to address this soon.

I hope this audit is useful. Yesterday's changes will be released as 1.17.4 this morning. Then, I continue to work on modularization.

Monday, July 23, 2007

rsyslog approved as Fedora 8 feature

Man, I was so busy, I didn't even notice that the Fedora steering committee approved the rsyslog feature for Fedora 8. The rsyslog feature page in the Fedora Wiki is an interesting read. I am quite happy with the state of affairs. Most importantly, rsyslog is receiving a lot of testing now, and new bug reports and patches come in each day. This helps to make a rock-solid and feature-rich software, just as it should be.

Sunday, July 22, 2007

English for Kids...

I am always on the search for new material on English for kids. This is what I found today:

http://www.britishcouncil.org/kids-parents-books.htm - this is a good reference site, also with material itself.

Penguin Readers - this is graded material, also for young kids

Oxford Bookworms Library - another well-received graded library

rsyslog output module interface

Rsyslog's output module interface begins to materialize. I have even begun to restructure the code modules, which currently mostly means shifting code to different places. However, there is much more behind this code-shuffling. I've been thinking quite a while about modularization of rsyslog. What happens now is the result of this thinking. In the end, we will have output modules running on independent threads, each being able to queue data when the output for some reason is suspended (e.g. the remote syslog server it sends data to is unavailable). And, of course, the module interface will also support plug-ins.

The current MySQL action will become such a plug-in. I need to adopt a way to tell current users a way to migrate to the loadable module interface. I guess, I'll add a dummy statement like

$ModLoad MySQL

To the current configuration. Well, yes, let's do that - I've created a feature tracker as I write down this blog entry.

The only effect it will have in current code is that it tells the config engine that the user cared about modules. In builds that will later support loadable modules, it will actually load the mysql plugin. Currently, its only function will be to warn users to apply it, when they did not do it. That should take care of a smooth transition.

Thursday, July 19, 2007

astronomy talk for kids

Live is not just about programming. Today, I took some time off to give an astronomy talk to elementary school kids. Their teacher had approached me some time ago and asked if I could give that talk at the end of their school excursion. Of course I could :)

It was a quite basic talk about the sun, moon and stars, with a focus on understanding our place in to solar system. Of course I covered all the nice planets and especially focused on Saturn (of course, because I am a SOC member ;)). We had big luck with the weather. Around noon, there was pouring rain and clouds, clouds, clouds. When I arrived at the school (they came over to our elementary school in Grossrinderfeld), the rain ceased somewhat.

We prepared and off went the talk. The kids were very interested and obviously had fun. And, believe it or not, by the time the talk completed, there was bright sunshine. So I could bring out my PST solar telescope and the kids could have a great look at our mother star (of course, the teachers liked it to).

To conclude the event, I dispersed NASA stickers (ESA doesn't provide me some ;)) and left a lucky crowd.

Did I say this is a very rewarding activity? ;)

Wednesday, July 18, 2007

rsyslog now in fedora 8

Read this mail message:

http://www.redhat.com/archives/fedora-devel-list/2007-July/msg00941.html

Rsyslog is now part of fedora 8. I hope that a lot of folks will try it out and contribute their experience back. This will definitely help us build the best syslogd ever.

As a side-note, I've today received another couple of patch contributions for rsyslog. It's really fun seeing all those people look at the project. Thanks to everyone.

Changed the scope of this blog...

I started this blog in 2003 as something purely about syslog. At that time, I also started some other blogs, which I now mostly have abandoned. The whole system was a bit clumsy, plus there was limited interest in the content (to phrase it mildly).

Things have evolved. Now blogger offers tags (called "label" there), which allow to keep track of different topics in a single blog. So I thought I give it a new try. I'll use the blog in addition to my web sites and post small news here that do not make it onto the sites for whatever reason. I am not sure if that will finally work out, but it is at least worth a try.

rsyslog and fedora 8

I looks like more and more sites pick up the news. A quite good review of it, together with the other F8 features, can be found at this site:

http://linuxupdate.blogspot.com/2007/07/proposed-fedora-8-features.html

I wonder if traffic will go up on the rsyslog site - this has not happened yet. The root question behind that question is if people really care about logging. I guess most people couldn't care less than which syslogd is included in a distro. But I may be proved wrong.

In the mean time, I hope that the additional exposure will lead to more ideas, more testing and in the end result an even better rsyslog.

Tuesday, July 17, 2007

rsyslog 1.17.0

I had an extraordinary good day today. I managed to make rsyslog 1.17.0 nearly feature complete for the upcoming 2.0.0 release. I also see much more interest from the community, now that red hat thinks about moving rsyslog into Fedora 8. If all goes well, we'll soon see it pushed to F7 as an extra. That would enable more people to play with it. And that's important. We had a lot (a real looooot) of changes the past weeks. It's now time that some folks seriously try to break the code, so that we can create a solid set of bug-fixes (not that I or my contributors made any mistakes - but you know: all these multi-bit errors messing up the source ;)).

Let's stay tuned how things evolve. For the time being, will simply enjoy being satisfied ;)

Sunday, July 15, 2007

rsyslog and fedora...

The news is creeping around the net that rsyslog possible replaces sysklogd in Fedora 8. As far as I know, no definitive decision has jet been made by the Fedora folks. But a lot of them already contribute to the code and its packaging system. So no matter how they'll finally decide, this is very good news for the project.

Currently, I have not "advertised" the fact because I think it is not yet finally decided. But now it looks it has left the mailing Fedora mailing lists:

http://liquidat.wordpress.com/2007/07/14/smolt-statistics-fedora-8-feature-list

Friday, July 13, 2007

Why GPLv3? (rant on crippled commercial products with Open Source in them)

After writing about my intended move to GPLv3 yesterday, I was asked if that would really be clever to do it. Even more, I was pointed at http://www.builderau.com.au/blogs/syslog/viewblogpost.htm?p=339270811, which thinks GPlv3 is bad because it does limit what commercially can be done with the software.

Actually, I think that is a strength of GPLv3. I devote time for my projects. This time does not bring me any money. It's fun, sure. It brings insight. Sure. Probably it has some side-effects, that fund the project somehow. Great. But I would not like to see someone taking my hard work and making money out of it without contributing back. I would not like to see rsyslog running on e.g. TIVO while that very same TIVO cripples my freedom and takes money from me. Why should I contribute to something that I do not like? Am I supposed to be a fool?

No folks, forget about it. Does that mean TIVO can not build products? NOT AT ALL! They are free to do whatever they want. They can even hire me to write a subsystem for their use, without any GPLv3 restrictions. All they are not permitted to do is use my work for free without contributing back. Much the same as I am not allowed to share my legally recorded videos. Why should I provide code for such? Again, they are not limited to do whatever they like to do. The music and movie industries as well as their helpers are marauding us and torturing even paid use with digital restrictions management (DRM). They obtain well enough money to do their own development. This is the way they should do it. And if they want something for free, then they need to contribute back. It's that easy - and this is why I am a big fan of GPLv3 (yes, I know there will be subtle issue, but anyhow - these will be sorted out over time).

Now replace TIVO with "your-favorite-crippled-crap" and you got the full message ;)

Have fun!
Rainer

Thursday, July 12, 2007

GPLv3 and rsyslog

Did you know? GPLv3 is out. And I am seriously considering it for my rsyslog project. Why? I do not like Tivo-ization nor do I like software patents. So, isn't then moving to a license unfriendly to those a good idea? I think it is. But of course, there are a number of subtleties that I need to check.

I guess version 2.0.0, due soon, will be released under GPLv3.

Why are there so few messages from sysklogd itself?

Have you ever wondered why your logs do not contain anything from the syslog subsystem itself, except for maybe a message or two? Tina Bird has started an interesting new discussion on the loganalysis mailing list.

Of course, I couldn't stand it and have added my 2cts. I'd like to reproduce it here in the blog, too:

> I have received a number of responses along these lines, obtained by
> grepping the source code or by running strings on the binary.
> These are far
> better than nothing, and I'm grateful for the help, but they miss an
> important piece of the picture. Especially in a piece of code
> as old and,
> uh, crufty as syslogd, there's a high likelihood that many of
> the errors
> find themselves at the far ends of code paths that rarely (if
> ever) get
> executed, and therefore those errors never find themselves in
> the "outside"
> world, providing assistance (or confusion) to system administrators
> everywhere.

OK, I've once again done a real review of the sysklogd 1.14.1 source. I wanted to make sure I really tell the truth. The plain truth is that it is nearly impossible that anything goes wrong after syslogd is started. So you'll observe a number of "config file invalid" messages, but only (hopefully;)) during initial setup. Once things run smoothly, you will see error message only when things go really wrong, e.g. when the hard disk dies. But then, in practice, will that ever occur? If the answer is yes, then you need to ask "will it be seen"? Of those systems where a hard disk failure is catastrophic, all of the logs are probably on that failed hard disk. Yes, exactly that disk our error message will be ... ahem would be ... written to ;) So you end up with just initialization and termination messages.

Is that the case because syslogd is such a perfect piece of software. Not really. The reason is that the stock implementation simply can not have any real problems once it runs: selector lines were either OK (and operating) or invalid (and disabled). And how about the network? Surely received packets are a trouble source. Formatting errors of all kind...

Let's have a look at (informational RFC 3164):

##
4. Packet Format and Contents

The payload of any IP packet that has a UDP destination port of 514
MUST be treated as a syslog message.
##

Sweet - anything that is destined to 514 is a syslog message. No matter what the content is. Really? Am I kidding? Let's read on:

##
Example 2

Use the BFG!

While this is a valid message, it has extraordinarily little useful
information.
##

Yeah... This is a valid message. This also: "HaHaHa". So how will a parser need to complain when it processes the message. It doesn't - and that's why you won't see many messages from sysklogd itself.

HOWEVER, things are improving. In rsyslogd, there are a lot more things that can go wrong. For example, IETF is standardizing the frame format if TLS is used. This provides a number of opportunities for emitting error messages. TCP itself gives ground to another set of messages. On the output side the same: rsyslog can do dynamic file names. That means files are created depending on incoming messages. Of course, things can go wrong here, providing another set of error messages.

I am talking about rsyslog, because I maintain this project. I think any other modern-day syslogd has a similar set of error messages. And these are possibly seen in practice. But now it is much more depending on how valid all parts of the system, including senders, work. With the majority of syslog-enabled applications still following the "I don't need to obey any format" paradigm, the typical cause for error messages is not-existent for syslog servers.

I hope that clarifies. And there is even hope: syslogd's will spit out more errors in the future ;) [and, yes, I have at least created a todo item to emit meaningful error identifiers together with them...]

Rainer

Is syslog blogging worth it?

Hi all,

I've not posted anything for a looong while. I was (and I am) dissatisfied with the amount of traffic (or better non-traffic) that the blog showed. Anyhow, I think I am giving it a new try. Probably this blog will also morph into a rgerhards blog, not necessarily being on syslog only (which seem to be a too-boring topic for blobs). But I am still unsure if it makes sense to continue - drop me a note if you have an opinion.

Rainer

rsyslog 8.31 - an important release

Today, we release rsyslog 8.31. This is probably one of the biggest releases in the past couple of years. While it also offers great new fu...