Tuesday, September 10, 2013

imfile multi-line messages

As most of you know, rsyslog can pull multiple lines from a text file and combine them into a single message. This is done with the imfile module. Before version 7.5.3, this led to a message which always had the LF characters embedded. That usually posed little problem when the same rsyslog instance wrote the message immediately to another file or database, but caused trouble with a number of other actions. The most important example of the latter is plain tcp syslog.

That industry standard protocol uses LF as a frame delimiter. This means a syslog message is considered finished when a LF is seen and everything after the LF is a new message. Unfortunately, the protocol does not provide a special escape mechanism for embedded LFs. This makes it simply impossible to correctly transmit messages with embedded LFs via plain tcp syslog (for more information, see RFC6587, section 3.4).
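To make the problem concrete, here is a minimal Python sketch of what an LF-delimited receiver does with such a message. It is an illustration only, not rsyslog code; the function name `split_lf_frames` and the sample message are made up for this example.

```python
def split_lf_frames(stream: bytes) -> list[bytes]:
    """Split a TCP byte stream the way an LF-delimited receiver would."""
    # Every LF ends a frame; the empty piece after the final LF is dropped.
    return [frame for frame in stream.split(b"\n") if frame]


# One logical message (two lines pulled from a file), followed by the
# LF that acts as the frame delimiter on the wire.
sent = b"<13>app: Traceback:\n  File x.py, line 1\n"
received = split_lf_frames(sent)

print(len(received))  # 2: the receiver sees two messages instead of one
```

The receiver has no way to tell the embedded LF from the delimiter, so the second half arrives as a separate (and malformed) message.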

To address this, rsyslog provides so-called "octet counted" framing, which permits transmission of any characters. While this is a great solution for rsyslog-to-rsyslog transmission, few other programs are capable of working in that mode, so interoperability is limited.
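As a rough sketch of the idea, octet counting as described in RFC6587 prefixes each message with its length in bytes and a space, so the receiver never has to look for a delimiter. The Python below is a simplified model under that assumption, not rsyslog's implementation; the helper names are hypothetical, and it assumes the whole stream is already in memory (no partial frames).

```python
def encode_octet_counted(msg: bytes) -> bytes:
    """Prefix the message with its length in bytes and a single space."""
    return str(len(msg)).encode("ascii") + b" " + msg


def decode_octet_counted(stream: bytes) -> list[bytes]:
    """Split a complete stream of octet-counted frames back into messages."""
    messages = []
    pos = 0
    while pos < len(stream):
        space = stream.index(b" ", pos)   # end of the length field
        length = int(stream[pos:space])   # decimal byte count
        start = space + 1
        messages.append(stream[start:start + length])
        pos = start + length
    return messages


multi_line = b"<13>app: line one\nline two"
stream = encode_octet_counted(multi_line) + encode_octet_counted(b"<13>app: next")

print(decode_octet_counted(stream) == [multi_line, b"<13>app: next"])  # True
```

Because the receiver reads exactly the announced number of bytes, the embedded LF is just payload and survives the trip.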

Even worse, most log processing tools (primarily those working on files) do not expect multi-line messages. Usually they get very confused if LFs are included.

In short, embedded LFs are evil in the logging world. It was probably not a great idea to generate them when imfile processes multi-line messages.

Starting with rsyslog version 7.5.3, this problem has been solved. imfile now escapes LF to the four-character sequence "#012", which is rsyslog's standard (octal) control character escape sequence. With this escaping in place, there will be problems neither at the protocol layer nor with other log processing applications. If for some reason embedded LFs are needed, there is a new imfile input() parameter called "escapeLF". If set to "off", embedded LFs will be generated. We assume that users who do this know what they are doing and how to handle those embedded LFs.
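The effect of the escaping can be modelled in a few lines of Python. This is only a sketch of the resulting message text, not how imfile does it internally; `escape_lf` is a name invented for this example.

```python
def escape_lf(msg: str) -> str:
    """Replace each LF with '#' plus its three-digit octal code, i.e. '#012'."""
    return msg.replace("\n", "#%03o" % ord("\n"))


# Two lines from the monitored file, combined into one message.
lines = ["Exception in thread main", "  at Foo.bar(Foo.java:1)"]
combined = "\n".join(lines)

print(escape_lf(combined))
# Exception in thread main#012  at Foo.bar(Foo.java:1)
```

The escaped message contains no LF at all, so it passes through plain tcp syslog and line-oriented tools as a single record.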

This behaviour could obviously break existing configurations. So we have decided not to turn on LF escaping for file monitors defined via legacy statements. Users of such configurations most probably do not want it and have long since dealt with the resulting problems.

As always, it is highly suggested that new configurations use the much easier to handle input() statement, which also has LF escaping turned on by default. Note that you cannot use LF escaping together with imfile legacy config statements. If you want LF escaping, you must switch to the new style.

So this construct:

$InputFileName /tmp/imfile.in
$InputFileTag imfile.in
$InputFileStateFile imfile.in
$InputFileReadMode 1

needs to become this one:

input(type="imfile" file="/tmp/imfile.in"
      statefile="imfile.in" readMode="2" tag="imfile.in")

Again, keep in mind that in the new style LF escaping is turned on by default, so the above config statement is equivalent to:

input(type="imfile" file="/tmp/imfile.in" escapelf="on"
      statefile="imfile.in" readMode="2" tag="imfile.in")

This latter sample is obviously also correct.
To turn off LF escaping in the new style, use:

input(type="imfile" file="/tmp/imfile.in" escapelf="off"
      statefile="imfile.in" readMode="2" tag="imfile.in")

I hope this clarifies the reasons for the new imfile LF escaping modes, their usefulness, and how to handle them.
