Thursday, July 12, 2007

Why are there so few messages from sysklogd itself?

Have you ever wondered why your logs do not contain anything from the syslog subsystem itself, except for maybe a message or two? Tina Bird has started an interesting new discussion on the loganalysis mailing list.

Of course, I couldn't stand it and have added my 2cts. I'd like to reproduce it here in the blog, too:

> I have received a number of responses along these lines, obtained by
> grepping the source code or by running strings on the binary.
> These are far
> better than nothing, and I'm grateful for the help, but they miss an
> important piece of the picture. Especially in a piece of code
> as old and,
> uh, crufty as syslogd, there's a high likelihood that many of
> the errors
> find themselves at the far ends of code paths that rarely (if
> ever) get
> executed, and therefore those errors never find themselves in
> the "outside"
> world, providing assistance (or confusion) to system administrators
> everywhere.

OK, I've once again done a real review of the sysklogd 1.14.1 source. I wanted to make sure I really tell the truth. The plain truth is that it is nearly impossible that anything goes wrong after syslogd is started. So you'll observe a number of "config file invalid" messages, but only (hopefully;)) during initial setup. Once things run smoothly, you will see error message only when things go really wrong, e.g. when the hard disk dies. But then, in practice, will that ever occur? If the answer is yes, then you need to ask "will it be seen"? Of those systems where a hard disk failure is catastrophic, all of the logs are probably on that failed hard disk. Yes, exactly that disk our error message will be ... ahem would be ... written to ;) So you end up with just initialization and termination messages.

Is that the case because syslogd is such a perfect piece of software. Not really. The reason is that the stock implementation simply can not have any real problems once it runs: selector lines were either OK (and operating) or invalid (and disabled). And how about the network? Surely received packets are a trouble source. Formatting errors of all kind...

Let's have a look at (informational RFC 3164):

##
4. Packet Format and Contents

The payload of any IP packet that has a UDP destination port of 514
MUST be treated as a syslog message.
##

Sweet - anything that is destined to 514 is a syslog message. No matter what the content is. Really? Am I kidding? Let's read on:

##
Example 2

Use the BFG!

While this is a valid message, it has extraordinarily little useful
information.
##

Yeah... This is a valid message. This also: "HaHaHa". So how will a parser need to complain when it processes the message. It doesn't - and that's why you won't see many messages from sysklogd itself.

HOWEVER, things are improving. In rsyslogd, there are a lot more things that can go wrong. For example, IETF is standardizing the frame format if TLS is used. This provides a number of opportunities for emitting error messages. TCP itself gives ground to another set of messages. On the output side the same: rsyslog can do dynamic file names. That means files are created depending on incoming messages. Of course, things can go wrong here, providing another set of error messages.

I am talking about rsyslog, because I maintain this project. I think any other modern-day syslogd has a similar set of error messages. And these are possibly seen in practice. But now it is much more depending on how valid all parts of the system, including senders, work. With the majority of syslog-enabled applications still following the "I don't need to obey any format" paradigm, the typical cause for error messages is not-existent for syslog servers.

I hope that clarifies. And there is even hope: syslogd's will spit out more errors in the future ;) [and, yes, I have at least created a todo item to emit meaningful error identifiers together with them...]

Rainer

No comments:

The clang thread sanitizer

Finding threading bugs is hard. Clang thread sanitizer makes it easier. The thread sanitizer instruments the to-be-tested code and emits u...