Martin showed that by doing a recv() call in front of the send(), reliability improves. This is because the local TCP stack is then forced to evaluate whatever the remote peer sent to it. Without that recv() call, the local stack may not evaluate the response until later. Of course, this is implementation specific, but seems to be common. From traces, I have seen that there seems to be good reason to do so: the number of packets sent is slightly lower when not doing the recv(), so it looks like the buffering is improved. On the other hand, this costs reliability as a remote shutdown is not immediately detected. I have not verified the buffering behaviour by looking at the tcp stack sources, but the traces are evidence enough. At least for practical purposes.
The recv() suggested by Martin forces the stack to evaluate packets received. If the remote end has closed the connection, that makes the local stack detect the condition. Consequently, loss of connection can be detected at this stage. Without the recv(), this needs at least one other send() call and thus causes the loss of one message. So we increase the reliability of plain tcp syslog by being able to detect the error state earlier.
However, this approach is racy. For example, if the remote peer initiated the shutdown after we did the recv() (but before the send()), this will not help. Also, it does not help if there is high traffic. In this case, the remote peer may be forced to shutdown the connection in the middle of a message. Keep in mind that a syslog server has no way of telling the client it intends to shut down. So if it must, it has no other chance than to terminate the connection forcefully (maybe after a short timeout, depending on the needs).
So, in short, the recv() trick often works if we have:
- a network with likely no packet loss
- a network with low latency
- low network traffic
The larger the environment, the more slim this chance is. Unfortunately, I assume that in enterprise environments that really need reliability, the recv() trick does not contribute much to improving reliability (because they do not have the perquisites outlined above). So for enterprises, plain tcp syslog is still not an option when there is need for reliable logging (e.g. in an auditing environment). Protocols with app-level acknowledgement are required there (e.g. RFC 3195, RELP or SETP).
I have thought long how this affects rsyslog. I have now concluded I will give the recv() trick a try. I think it can improve the situation, at least in some practical scenarios. The problem is that it is kind of a hack. I need to find a well enough way to implement it into my netstream driver layer that doesn't force the code to do something where it is illogical while at the same time keeping the driver clean to be used with more sophisticated protocols. So it is not a matter of adding a simple API call. I'll probably introduce a driver call which may do the recv(), only if it makes sense for that type of driver, and does a dummy return for those drivers that do not (need) to support it.
For completeness, I reproduce some of my comments on Martin's blog just so that I have a stable reference within a domain a manage (things too often disappear too quickly ;)).