Wednesday, April 09, 2008

Why is native email capability an advantage for a syslogd?

Following up on my post on rsyslog's new native email capability, an interesting conversation arose. I'd like to share it with you:

> > I promise to listen very carefully and try to implement anything that is
> > doable and makes sense in the rsyslog context.
> >
> One thing springs to mind - I think "sendmail" support is more important
> than you give it credit.
>
> What if you've got an alert rule in rsyslog to email you when your
> network link fails - but your SMTP server is at the other end of the
> link? :-) If you used sendmail - you get requeuing and retrying for free
> - I don't think you want to have to add that to your SMTP support...

Well, that's actually not an issue at all in rsyslog. The rsyslog core
engine is reliable [to be precise: can be configured to be reliable,
it's not by default] in a way that exactly handles this situation. In
rsyslog, any action, including now mail, can run on its own queue. When
an action fails, it tells the rsyslog core that it could not
successfully complete. Then, the rsyslog core schedules retries until it
finally succeeds. While doing so, the messages are kept inside a queue.
This queue is in memory as long as that's sufficient and is moved to
disk if there is demand (e.g. rsyslog shutdown, running out of
configured in-memory queue space). A sample of such a configuration
(this time with the database writer) can be found at:

http://www.rsyslog.com/doc-rsyslog_high_database_rate.html

Bottom line - rsyslog is designed to work with failing destinations and
to recover from such failures automatically. So nothing special is
needed to make it handle a failing SMTP connection.

In fact, I consider the SMTP direct mode more reliable than the sendmail
mode, exactly because of that feature. With sendmail, I hand over the
message to an external entity but do not know if delivery succeeded.
With SMTP direct, I know at least it made its way to the SMTP server.
Granted, I don't know if the SMTP server will ultimately deliver it, but
I have a bit more control over what's going on.

For example: rsyslog also has a mode where it can use backup actions if
things fail (after n retries). So let's consider the example above.
Let's say we have an urgent alert, but the SMTP server is down. With
sendmail, I hand the message over to sendmail but do not know that
sendmail actually queues it. With SMTP direct, I *know* that the SMTP
server is unresponsive. Depending on the urgency, I may either do a few
retries or I may immediately switch to another delivery method. For
example, I may then try SNMP. Or I may use another email action in
this case and try to contact an email-to-SMS gateway so that the alert
can still be delivered.

Please note that in rsyslog one can have multiple actions chained
together. So a probable scenario to handle such a case could be

1. try to email via the corporate server
2. if that fails, try to email via a public gateway
3. if that fails, start a program to do some automagic action

All of this is possible because I do not use sendmail. But, again, I
of course do not know if the mail server I used with rsyslog succeeds in
its delivery attempt. One weak spot always remains ;)

To use yesterday's sample, one could use a backup SMTP server with just
a little bit of configuration as follows:

$ModLoad ommail
$template mailSubject,"disk problem on %hostname%"
$template mailBody,"RSYSLOG Alert\r\nmsg='%msg%'"

# primary action
$ActionMailSMTPServer mail.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
# make sure we receive a mail only once in six
# hours (21,600 seconds ;))
$ActionExecOnlyOnceEveryInterval 21600
# the if ... then ... mailBody must be on one line!
if $msg contains 'hard disk fatal failure' then :ommail:;mailBody

# begin backup action, carried out if primary fails
$ActionExecOnlyWhenPreviousIsSuspended on
$ActionMailSMTPServer mail2.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
$ActionExecOnlyOnceEveryInterval 21600
& :ommail:;mailBody
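
The three-step chain described earlier (corporate server, public gateway, then a program) could be sketched along the same lines. This is only an illustration under stated assumptions, not a tested configuration: it reuses the mailSubject and mailBody templates from the sample above, and both smtp.gateway.example.org and /usr/local/bin/alert-handler.sh are hypothetical names. The retry count of 3 is an arbitrary choice, and whether it carries over to later actions should be checked against the rsyslog version in use.

```
# 1. try to email via the corporate server
$ActionResumeRetryCount 3            # assumed value: give up after 3 retries
$ActionMailSMTPServer mail.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
if $msg contains 'hard disk fatal failure' then :ommail:;mailBody

# 2. if that fails, try to email via a public gateway (hypothetical host)
$ActionExecOnlyWhenPreviousIsSuspended on
$ActionMailSMTPServer smtp.gateway.example.org
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
& :ommail:;mailBody

# 3. if that fails too, start a program (hypothetical script)
& ^/usr/local/bin/alert-handler.sh;mailBody

# reset, so that later rules are not treated as backup actions
$ActionExecOnlyWhenPreviousIsSuspended off
```

Each "&" line attaches a further action to the same filter, and each one runs only while the action before it is suspended. The rate limiting from the sample above ($ActionExecOnlyOnceEveryInterval) is left out here for brevity.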

4 comments:

janfrode said...

Sorry, I don't buy it.

If your network connection to your smtp smarthosts is down (or maybe they're just busy telling you "421 Too many concurrent SMTP connections; please try again later"), rsyslogd will fail to send email. And you will have to do something "automagic" to solve the situation.

With a sendmail (or other local mailer) solution, you would configure sendmail to deliver to a set of smarthosts. If all smarthosts are down, the message would be delivered as soon as one of them is up again.

/me thinks the sendmail solution is much better. The mail will always come through. Also, if the alert is too important to trust smtp -- I would want to utilize multiple alert mechanisms at once. Not conditionally.

Rainer said...

Hi Janfrode,

I am not sure if you have actually read the sample ;)

rsyslog knows how to retry - just think about TCP delivery. We must always retry if a remote host is down. So for a retry, I don't need anything special - just the right configuration. If you just want to retry, you could use this one:

# SMTP with retry
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
$WorkDirectory /rsyslog/spool # where to place spool files
$ActionQueueFileName uniqName # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList # run asynchronously
$ActionResumeRetryCount -1 # infinite retries if host is down
$ActionMailSMTPServer mail.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
# make sure we receive a mail only once in six
# hours (21,600 seconds ;))
$ActionExecOnlyOnceEveryInterval 21600
# the if ... then ... mailBody must be on one line!
if $msg contains 'hard disk fatal failure' then :ommail:;mailBody

I hope this sample goes through, it's not nice to do that with blogger comments ;)

Of course, you can retry with sendmail rules. But then you lose the capabilities I described in my post, namely the ability to switch to different alert mechanisms depending on the outcome.

But if you don't need that, the mail will always come through. Again, I am not against implementing sendmail functionality (thus I mentioned it in the first place), I just think we have far more important things to do than to implement it. Especially as IMHO the sendmail solution is far less capable than SMTP direct mode - but I agree, the latter depends on what one personally likes. For average folks not setting up complex rsyslog scenarios, sendmail is simpler to use.

janfrode said...

I read the sample, and it didn't do more than implement a backup-smarthost functionality. The same as you would get if the local mailer pointed the SMARTHOST at a name with multiple MX records.

We're probably back to philosophical disagreements again, but -- wouldn't it be better to support alerting through external commands instead of natively implementing support for all alerting protocols?

$ActionExternalCmd "/usr/local/bin/simplemailer-with-sensible-exit-codes.sh this is the message"

and use the same to alert via snmp:

$ActionExternalCmd '/usr/bin/snmptrap -v 1 -c public snmpreceiver.example.net enterprises.14048.99.100 "" 6 2 "" sysDescr.0 s AlertFromRsyslog'

Rainer said...

The original sample didn't implement anything else simply to keep the sample simple. It could have done much more - that's the point. I think you should read the sample in the light of the description above it, which describes where I see the advantages.

But you are right, we are back in philosophical disagreement. Anyhow, I fully back your thought of a generic alerting functionality. You are absolutely right. But that is already available via the "exec program" action ;)

It's not really nice and needs considerably more work, but you could do this today. You could even use it to send mail. I don't like the way it is currently implemented, and it is scheduled for (probably) a full rewrite.

However, I needed mail. A lot of folks ask for a simple way to send mail - without relying on external scripts. So I decided to quickly write ommail. I initially went with SMTP direct because that was the quickest thing to implement. After that was done, I noticed that SMTP direct even offers some advantages beyond being quick to code. I still think so. Nevertheless, sendmail functionality is still scheduled. And I think there is a place, at least for some folks, for a native email output plugin in contrast to a generic "start a program" plugin. The old rule applies: you don't have to use it if you don't like it. There are probably many more things upcoming which some folks will like and some not. For example, I intend to have an alerting capability for messages never before seen in the past "n" days. Can this be done by an external tool? Sure. Does that mean it doesn't belong in rsyslog? Maybe. Does that mean I'll write an external tool? Maybe, if I can get it to be used with the right plumbing and bind it with sufficient performance and reliability to the core engine. Does it hurt if I write a plugin? I don't think so - after all, you can simply not use it... rsyslog core will still work.
