Monday, March 10, 2008

On rsyslog design philosophy, plugins and complexity

There was an interesting discussion spawned on the mailing list. So far, I wrote two quite elaborate responses, showing much of rsyslog internals. For now, I am reproducing them here in the blog so that they hopefully will not disappear deep inside mailing list archives. I hope to do a summary somewhat later (as time permits). In the mean time, my apologies for the somewhat hard to read format.

This was the thread starter, and it of course expresses a very valid concern:

On 2008-03-08, Rainer Gerhards <rgerhards@hq.adiscon.com> wrote:
> Just to facilitate things: this is what I am thinking about:
>
> http://www.rsyslog.com/doc-imfile.html
>
> It can monitor all application log files natively. Did I understand your need right?
>

Seems like "imfile" could be trivially implemented trough:

while sleep $InputFilePollInterval ; do
logtail $InputFileName $InputFileStateFile | \
logger -t $InputFileTag -p $InputFileFacility.$InputFileSeverity
done

http://debian.stro.at/bzr-test/logtail/

Or for realtime monitoring:

tail --follow=name --retry $InputFileName | \
logger -t $InputFileTag -p $InputFileFacility.$InputFileSeverity

IMHO rsyslogd might be trying to do too much internally. When will it start
reading mail ? That might not sound too far fetched.. :-)

"Every program attempts to expand until it can read mail. Those programs
which cannot so expand are replaced by ones which can." -- jwz

-jf


My initial response was this:

Hi all, and especially JF and Michael,

Good discussion, please keep the thoughts flowing :)

Let me start by tell you I am guilty of a much broader view of what a
syslogd can do than most people probably have. To me, a syslogd
processes *system events* and that's much more than just those pieces
sent via syslog. I know that I should probably stopped the rsyslog
project at some point in time and fork an eventlogd from it. But, why
would this have made any sense? To me, the core point is maintainability
of code and the freedom for people to use only (and exactly) what they
need (and want!) to use. After all, there are also lots of folks how say
that database support has nothing to do with a syslogd and thus should
not be part of if (and that was the birthplace of rsyslog... ;)).

So for me the natural thing was to make rsyslog pluggable AND keep only
in the actual core what is ultimately needed. I got very serious about
this last summer. There are a couple of posts from last August in my
blog and I'd like to highlight this one:

http://rgerhards.blogspot.com/2007/08/on-importance-of-plug-ins-for-rsyslog.html

We have now reached a point where we have a quite well implemented
plugin interface and rsyslogd has become much more modular. Still, there
are things that people need to run even if they never need them (like
the syslog-protocol message parser and a couple of "standard" outputs
that come with sysklogd). I am continuously working to remove these
dependencies. For example, on Friday I moved regex support out to a
separate module. So you no longer need to load regexpes if you do not
need them.

Now to the plugins.

What is a plugin?

Good question. A plugin is *a project in itself*. Just for convenience,
it is part of the main tarball (we had a couple of real headaches when
we tried to separate the initial plugins and came to the conclusion that
it is by far more convenient to deliver them as part of the main
tarball). Plugins need not be written by the rsyslog team. Maybe
somebody has already written an email reception plugin and just not told
us - may very well be (though I doubt ;)).

Now let me take on the imfile example. The key point - at least IMHO -
is that there is no single line in rsyslog core's code that has been
added in support of imfile. And, more importantly, if imfile would go
away, not a single line of code could be removed. So the imfile plugin
(project) does neither add complexity nor code nor other overhead to
rsyslog core. So can it be evil?

Imfile was added in response of a user request. When I heared it, I
thought it would be useful to have such a plugin. Did I know about the
logger approach? Yes - and I have read some many folks complain over
this or that aspect of a logger-approach that I knew I didn't want it.
Maybe these were all false claims, maybe people simply did not
understand to do it right. I don't know and I honestly admit that I
don't care. The presence of many people feeling bugged about that
approach made me not like it (if you look for actual samples, I suggest
going through the loganalysis mailing list archive).

Did I know about logtail? No. Had I written imfile if I did? Not sure:
from the small shell script you provided, it looks like there is a
problem if

a) script is in sleep period
b) data is appended to text file
c) text file is rotated
d) new lines are written to text file
e) script awake for new polling loop

On a quick look, it looks like the data written in b) will never make it
to the syslogd. Imfile handles that. But even if logtail (or the script)
handles it - as I said, I did not know about it. So I took roughly a day
of my time to create an initial imfile. If logtail handles the situation
described above, would be available on the platforms I was asked for and
I knew about it, I'd probably refrained from implementing imfile.

But, again, no harm happened. If you don't like imfile, simply don't use
it. It adds nothing to the rsyslog core. You can still use the logtail
approach. IMHO this is what we want: freedom of choice. And rsyslog
provides this freedom.

On the mail output case (though I need to be a bit brief as dinner is
approaching ;)): I actually intend to add an email output plugin. It's
on the list for long a time. I personally see some benefit in the
ability to talk to a tightly coupled, in-memory, very performant
accessible way to send mail message (just like with the SNMP traps). On
the other hand, you can see that I didn't implement it for a long time -
simply because I did not consider it important enough given what else
needs to be done. But I now need to speak up, because I have actually
begun to seriously look into implementing it before this discussion
begun ;) Before I do, I will make sure I understand that other options
and if there is some benefit in doing so. But if I implement it for one
reason or the other, nothing bad happens to rsyslog: if you don't like
it, you don't need to use it. It doesn't add any overhead to the core.

And now please feedback. Am I overlooking something real evil? What is
the bad thing in offering exotic functionality that only few will ever
need?

Thanks again for the discussion, I am *really* interested in getting
more feedback on this topic.

Thanks,
Rainer

Which triggered another message. I am just reproducing my response, because it quotes the original message:

> > Now let me take on the imfile example. The key point - at least IMHO
> -
> > is that there is no single line in rsyslog core's code that has been
> > added in support of imfile. And, more importantly, if imfile would go
> > away, not a single line of code could be removed. So the imfile plugin
> > (project) does neither add complexity nor code nor other overhead to
> > rsyslog core. So can it be evil?
>
> It's evil that you're forcing me to upgrade to rsyslog v3.x to take
> advantage of it ;-) That's what triggered my previous post.. I want
> to be tracking non-syslog logfiles, and if imfile/plugins where more
> in the unix philosophy of small tools that chain easily.. I might have
> been able to pick it from v3.x and use it on stable systems.

I've see the smily but let me do a couple of comments. First of all, the
"philosophy of small tools that chain easily" implies that they chain
via a pipe. This is great for a lot of applications, but it has its
drawbacks. A plain pipe is a simplex, relatively loosely coupled IPC
method. So if one part of the pipe dies, other parts will learn about it
eventually, but not at the same instant and, most importantly, they do
not exactly know what was processed and what not. It's pretty much the
same thing as plain TCP transport, which pretends reliability but still
has a few windows of exposure where message loss may occur (see
http://www.monitorware.com/en/workinprogress/selp.txt section 2.4 and,
yes, rsyslog is victim to this as long as we don't have a full RFC 3195
implementation).

So relying on plain pipes is not exactly my premier communications
method if I would like to implement a reliable syslogd (and "reliable"
is the "r" in rsyslog). One can circumvent the problem by defining an
app-level protocol on top of the pipe, working with app-level acks.
HOWEVER, with that you would run into the pretty same situation, namely
that you could not use a new version together with an old engine. One
can circumvent that, too, but only as far as the old engine has
implemented such methodology. And v1/2 do not have this (due to time
constraints, rsyslog is still only about half finished...).

There is some other reason that makes me avoid pipes. If I do an
app-level ack, I need to do at least 4 system calls to pass a single
message:

1. input write to pipe
2. engine read msg from pipe
3. engine post processing stat (ACK) to pipe
4. input read ACK from pipe

If you do a little bit of math, you'll see with how many user/kernel
space transitions you end up, plus how many cycles are needed to run the
necessary housekeeping code inside the kernel and libraries. I don't sum
them up now, but I am pretty sure that I can completely process the
message in less time than is need just for the IPC in that case. That
probably is fine if you look at a low-end workstation syslogd, but you
don't want to have this overhead if you aim for an enterprise solutions
capable of handling massive data input.

Also, if everything goes to the system log socket, there is little you
can multithread. But we need to be able to multithread as much as
possible. If I write a new engine (as I currently do), I would like to
see it well working for at least the next 10 years. If I think about
hardware trends during that period, it is clear that a single core will
become not much faster than it is today. But the number of cores will
greatly increase. In order to utilize that, an application must be able
to run on as many threads as possible - all with reasonable overhead, of
course. So my conclusion is that rsyslog must be able to run massively
multithreaded for the high-end use case. This also prohibits using
primary interfaces which cannot easily be multithreaded.

Finally there is the issue of flow control. Rsyslog *does* flow control,
and will do more advanced flow control in the future. Especially with
world-dominating UDP syslog it is vitally important to do flow control,
because UDP cannot be flow controlled. Sound strange? Well... If we can
not flow control UDP syslog, we need to apply intelligent and adoptable
flow control that flow controls sources who can (tcp syslog, rfc 3195
and of course file data!) so that buffer space is kept for those
precious UDP message which are lost if we can not buffer them at the
right instant. If you think this through, you'll see that this requires
different level and methods of flow control, depending on the source [so
far, I see three levels: can not (UDP), can somewhat (TCP, local
sockets), can easily (log files and other sources that generate data
themselves)].

Having a native interface greatly reduces code complexity and thus
program reliability when it comes at implementing these features. Also,
it would require specialized plugins in any case, you couldn't do it
with a simple "pipe me in approach" (well, ... some things yes, but at a
complexity cost).

Also, while I too believe in the Unix approach of small tools, I also
think it is important that the average user is able to configure it.
Rsyslog aims not only at being enterprise-class but at the some time
aims at being easy to use for the novice. Novices don't understand
complex scripting to get the job done. I think relying too much on
complex glue doesn't help getting the job done.

To come back to your orginal post, rsyslogd *should* of course support
the Unix way of piping. It looks like there is a small plugin missing to
read natively from a pipe. However, I never got a request to implement
it. I guess most people use the logger trick to accomplish that task. I
know syslog-ng can natively read from pipes, but have not yet considered
this important enough given the lack of requests from the community. If
you like such a plugin, it's probably a good idea to speak up now ;)

> > from the small shell script you provided, it looks like there is a
> > problem if
> >
> > a) script is in sleep period
> > b) data is appended to text file
> > c) text file is rotated
> > d) new lines are written to text file
> > e) script awake for new polling loop
> >
> > On a quick look, it looks like the data written in b) will never make it
> > to the syslogd. Imfile handles that.

> Yes, you're right. Cool that imfile handles it.

> > On the mail output case (though I need to be a bit brief as dinner is
> > approaching ;)): I actually intend to add an email output plugin.
>
> The quote was about being able to *read* mail:

I was replying here to Michael Biebel. It looks I was a bit too much in
hurry to point this out. Sorry...

> "Every program attempts to expand until it can read mail.
> Those programs which cannot so expand are replaced by ones
> which can."
>
> and I was thinking it not too far fetched since "splunk" can do it (it
> can
> download email messages via IMAP, index them and create alerts on
> suspicious
> content).

In short: receiving email is very low on my agenda. Keep in mind that I
already have architecture and at least partly written such a beast on
Windows:

http://www.monitorware.com/en/Product/product_comparision.php

The email question never was in much demand. But if demand comes up, it
for sure is not a big thing to add it... (and, of course, you already
can do it today with a bit of scripting, the right mailbox rules and
logger -- but that isn't appealing to most folks and is one reason I
tend to write plugins ;)).

> It might seem like you want to take rsyslog in that
> direction, i.e.
> your complete eventlogd&alertSystem fork of rsyslogd that can read any
> input
> (syslog, other-logfile, email, snmptraps) and analyze and alert on the data.
>
> That's not what *I* want from a syslog server. I just want it to
> reliably
> collect and store the logs in an organized manner. Then I'll use other
> tools to read and analyze them.

... and this is of course perfectly fine with me, too. However, if you
look at the core engine needs, you'll see that the "do it all"
eventprocessor and the "plain simple syslogd" have exactly the same
needs - at least if you would like to extend the syslogd to be
enterprise class.

Take the queue engine in v3. Its complex. Actually horribly complex. I
didn't like to include that complex beast, but it was the only clean
solution to the need of being massively concurrent AND being able to
queue data while a destination is down. Any other alternative IMHO would
have been dirty and hackish - and in the long term much less
maintainable. So I just did the right thing (hopefully), even though it
was a really big effort and even though it probably will need a few more
firedrills before it is really ready for prime time in all scenarios.
Another alternative would have been to use one of the big enterprise
class message queueing projects. But that would have created a
dependency for such a system on each desktop - ouch... I hope I made the
right compromise. Currently, the full queue engine is part of the core.
If that turns out to be a problem, I can outsource that to a plugin, but
that isn't currently very appealing to me. After all, it's "just" some
memory overhead - if you don't need the advanced features, no code is
executed to do that. The queue uses an internal driver model and simply
configuration means simple code. Thus, the v3 queue engine is as
reliable as v2, except if you use all those bells and whistles where I
am sure currently a few bugs wait to be detected (even though the
situation has much improved recently and will improve with each new
feedback I receive).

Another good example is the config file: of course, there is no need to
have a scriptable configuration for a simple syslogd. But while thinking
about the (necessary) expression support and a lot of user requests for
a better to read config file format, I came to the conclusion that
creating a scriptable format is actually the right route to take:

http://rgerhards.blogspot.com/2008/02/introducing-rainerscript-and-some.html

Anything else (IMHO) would again be less clean, less maintainable and,
in this case, would even take longer to implement. So one might think it
is evil to include a virtual machine inside a syslogd, but to me it is
actually the least effort to implement things.

Of course, you can rightly argue that all of this is over the board if
you just want to have a plain local logger that takes messages from the
local log socket, maybe UDP syslog and store it to local files. You are
probably right. But in this case, you can still continue to use
sysklogd. After all, rsyslog was spawned from it to create and advanced
syslogd. So it comes at no surprise that I am adding features which may
not be required for the simple use cases. ;)

Let me conclude with two core points:

- rsyslog core is as slim as possible, plugins are separate projects
that extend the core; This means nobody is forced to run more code
than actually required for his job

- both a simple, but fairly enhanced syslogd as well as a full network
Event processor share the same root engine needs

Thus, rsyslog implements this core engine and I occasionally add a
plugin here and there to take advantage of the core. Right now, rsyslog
core is far from being finished, as are the plugins. At this time, I am
working on getting the core right and doing the most requested plugins.
When I am done with that, I'll look at the *real* advanced plugins for
all kinds of things that users have interest in. I don't see any need to
fork of a separate core project for that. In fact, I think it would be
counter-productive as I would need to maintain two code bases and the
newly forked project would always be able to do what rsyslogd does. So
what would be the motivation to maintain another, feature-less
project...?

Anyhow, I may be totally wrong. Feedback on this topic is still highly
appreciated (be it brief or elaborate ;)). I am probably abusing this
thread to also tell you a bit about design decisions I have not yet
communicated (sorry for that, so much to do, so few time...;)).

... what concludes the current state of discussion. I hope it is a useful read. And, of course, you are invited to voice your opinion!

No comments:

Automating Coverity Scan with a complex TravisCI build matrix

This is how you can automate Coverity Scan using Travis CI - especially if you have a complex build matrix: create an additional matrix en...