Thursday, January 15, 2009

Use of application-level acks in RELP

I received a very well crafted question about RELP reliability via the rsyslog mailing list this morning. I think it makes perfect sense to highlight this question here in the blog instead of letting it die unread and hard to find in the mailing list archives. Before reading this post, it would be useful to read my rant on "On the unreliability of plain tcp syslog" if you have not already done so. It will greatly help understand the fine details of what the message talks about.

Here we go, original posters's text in italics, my replies in between it:

In my research of rsyslog to determine its suitability for a
particular situation I have some questions left unanswered. I need
relatively-guaranteed delivery. I will continue to review the
available info including source code to see if I can answer the
questions, but I hope it may be productive to ask questions here.

In the documentation, you describe the situation where syslog silently
loses tcp messages, not because the tcp protocol permits it but
because the send function returns after delivering the message to a
local buffer before it is actually delivered.

But there is a more-fundamental reason an application-level ack is
required. An application can fail (someone trips over the power cord)
between when the application receives the data and when it records it.

1. Does rsyslog send the ack in the RELP protocol occur after the
message has been safely recorded in whatever queue has been configured
or forwarded on so its delivery status is as safe as it will get (of
course how safe depends upon options chosen), or was it only intended
to solve the case of TCP buffering-based unreliability?


RELP is designed to provide end-to-end reliability. The TCP buffering issue is just highlighted because it is so subtle that most people tend to overlook it. An application abort seems to be more obvious and RELP handles that.

HOWEVER, that does not mean messages are necessarily recorded when the ACK is sent. It depends on the configuration. In RELP, the acknowledgment is sent after the reception callback has been called. This can be seen in the relevant RELP module. For rsyslog's imrelp, this means the callback returns after the message has been enqueued in the main message queue.

It now depends on how that queue is configured. By default, messages are buffered in main memory. So when rsyslog aborts for some reason (or is terminated by user request) before this message is being processed, it is lost - while the sender still got a positive ACK. This is how things are done by default, and it is useful for many scenarios. Of course, it does not provide the audit-grade reliability that RELP aims for. But the default config needs to take care of the usual use case and this is not audit-grade reliablity (just think of the numerous home systems that run rsyslog and should do so in the least intrusive way).

If you are serious about your logs, you need to configure the engine to be fully reliable. The most important thing is a good understanding of the queue engine. You need to read and understand the rsyslog queue docs, as they form the basis on which reliability can be built.

The other thing you need to know is your exact requirements. Asking for reliability is easy, implementing it is not. The more you near 100% reliability (which you will never reach for one reason or the other) the more complex scenarios get. I am sure the original post knows quite well what he want, but I am often approached by people who just want to have it "totally reliable" ... but don't want to spent the fortune it requires (really - ever thought about the redundant data centers, power plants, satellite and sea links et all you need for that?). So it is absolutely vital to have good requirements, which also includes of when loss is acceptable, and at what cost this comes.

Once you have these requirements, a rsyslog configuration that matches them can be designed.

At this point, I'd like to note that it may also be useful to consider rsyslog professional services as it provides valuable aid during design and probably deployment of a solution (I can't go into the full depth of enterprise requirements here).

To go back to the original question: RELP has almost everything that is needed, but configuring the whole system in an audit-grade way requires (ample) work.
2. Presumably there is a client API that speaks RELP. Can it be
configured to return an error to the client if there is no ACK (i.e.
if the log it sent did not make it into the configured safe location
which could be on a disk-based queue), or does it only retry? Where is
this API?


The API is in librelp. But actually this is not what you are looking for. In rsyslog, an output module (here: omrelp) provides the status back to the caller. Then, configuration decides what happens. Messages may be discarded, sent to a different destination or retried.

With omrelp, I think we have some hardcoded ways to preserve the message, but I have no time yet to look this up in detail. In any case, RELP will not loose messages but may duplicate few of them (within the current unacked window) if the remote peer simply dies. Again, this requires proper configuration of the rsyslog components.

Even with that, you may loose messages if the local rsyslogd dies (not terminates, but dies for some unexpected reason, e.g. a segfault, kill -9 or whatever) but still has messages in a not persisted queue. Again, this can be mitigated by proper configuration, but that must be designed. Also, it is very costly in terms of performance. A good reading on the subtleties can be in the rsyslog mailing list archive. I suggest to have a look at it.
Certainly the TCP caching case you mention in your pages is one a user
is more likely to be able to reproduce, but that is all the more
reason for me to be concerned that the less-reproducible situations
that could cause a message to occasionally become lost are handled
correctly.


I don't think app-abort is less reproducable - kill -9 `cat /var/run/rsyslog.pid` will do nicely. Actually, from feedback I received, many users seem to understand the implications of a program/system abort. But far fewer understand the issues inherent in TCP. Thus I am focusing so much on the later. But of course, everything needs to be considered. Read the thread about the reliable queue (really!). It goes great lengths, but still does not offer a full solution. Getting things reliable (or secure) is very, very challenging and requires in-depth knowledge.

So I am glad you asked and provided an opportunity for this to be written :)

Rainer

No comments: