Tuesday, May 27, 2008

syslog-transport-tls-12 implementation report

I have finally managed to fully implement IETF's syslog-transport-tls-12 Internet draft plus some new text suggested to go into -13 (which is not yet out) in rsyslog. Please note that I am talking about actual software that you can download, install, run and even look at the source. So this is not a theoretical "what if" type of report but one of real practical experience.

I have worked on the new -12 version of transport-tls for roughly the past three weeks. First of all, it is important to keep in mind that I had already implemented the -11 version (minus the then-unclear authentication) in rsyslog 3.19.0. That meant I only had to implement the new authentication stuff. This was obviously a major time-saver.

The current implementation uses the GnuTLS library for all TLS operations. I would like to thank the GnuTLS folks for all the help they provided on the mailing list. This was extremely useful. GnuTLS in rsyslog works as a "network stream driver" and can theoretically be replaced with other libraries (support for at least NSS is planned). For obvious reasons, this implementation report includes a number of GnuTLS specifics.

It is not exactly specified whether a syslog message traveling over -transport-tls must strictly be in syslog-protocol format or not. This may lead to interoperability problems. In rsyslog, I accept any message format. Any message received is simply fed into the general parser-selector, which looks at the message format and selects the most appropriate parser. However, this may not really be desirable from a security point of view. When sending, rsyslog also does not demand anything specific. Due to rsyslog's design, message creation and transmission are quite separate parts. So even if the draft demanded -syslog-protocol format, I would not be able to enforce that in rsyslog (it would break too many application layers). Of course, rsyslog supports -syslog-protocol format, but it requires the proper template to be applied to the send rule.
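
To illustrate the idea of selecting a parser by looking at the message, here is a minimal sketch in Python. It is not rsyslog's actual parser-selector logic; the function name `select_parser` and the returned labels are hypothetical. It only assumes that a -syslog-protocol message starts with "<PRI>VERSION " (for example "<34>1 "), while a legacy message has "<PRI>" followed directly by the timestamp.

```python
import re

# Hypothetical heuristic, NOT rsyslog's real parser-selector.
# Assumption: -syslog-protocol messages start with "<PRI>VERSION ",
# e.g. "<34>1 ", while legacy messages start with "<PRI>" only.
_PROTOCOL_RE = re.compile(r"^<\d{1,3}>[1-9]\d? ")
_LEGACY_RE = re.compile(r"^<\d{1,3}>")


def select_parser(message: str) -> str:
    """Return a label naming the parser that should handle the message."""
    if _PROTOCOL_RE.match(message):
        return "syslog-protocol"
    if _LEGACY_RE.match(message):
        return "legacy"
    # No PRI at all: accept the message anyway and treat it as raw text.
    return "raw"


if __name__ == "__main__":
    print(select_parser("<34>1 2008-05-27T10:00:00Z host app - - - hello"))
    print(select_parser("<34>May 27 10:00:00 host app: hello"))
    print(select_parser("hello"))
```

A receiver that wanted to be strict for security reasons could reject everything that does not come back as "syslog-protocol" instead of falling through to the other parsers.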

Rsyslog implements even most of the optional features. However, I have not implemented IP-address-based authentication, which is a MUST in Joe's new proposed text (section 5.1). The reason is that we found this option to be of very limited practical use. IP addresses are very seldom found in certificates. Also, there are ample ways to configure rsyslog in client role so that it knows the server's identity. This was also brought up on the IETF syslog mailing list and it looks like this feature will be dropped. Should it actually survive and go into the final standard, I will implement it, even though I do not see any use for it in practice. Thus I have deferred implementation until it is actually needed. Rsyslog user feedback may also show if there is a need for this feature in practice.

Each rsyslog instance is expected to have one certificate identifying it. There can be different certificates for different senders and receivers, but this is considered the unusual case. So in general, a single certificate identifies the rsyslog instance both as a client and server.

Rsyslog supports the three authentication modes laid out in the draft: anonymous, fingerprints and subject names. Obviously, anonymous authentication is easy to do. This was a quick task without any problems.

Fingerprint authentication was somewhat problematic to implement. The core problem was that GnuTLS, by default, sends only those certificates to the server that were issued by a CA in the server's trusted CA list. With self-signed certs on both the client and the server, this is never the case and GnuTLS does not provide any certificate at all. I used kind of a hack to get around this. There is a function in GnuTLS that permits providing certificates on an as-needed basis. I used this hook. However, I now no longer have the ability to provide only those certificates a server can verify. When I have multiple certificate stores and the server is in subject name authentication mode, this would be valuable. So far, I have ignored this problem. If practice shows it needs attention, I will investigate further. But here is definitely a potential future trouble spot. A core problem is that a sender does not (and should not need to) know whether the receiver is using fingerprint or subject name authentication. For the latter, the GnuTLS defaults are quite correct and provide a very convenient interface. But I cannot select different modes on the client as I do not know which one is right.

Subject name based authentication did not pose any such problems. This comes as no surprise, because this is the usual mode of operation for the TLS library. One can assume this to be best-tested.

One disappointment with GnuTLS was that during the TLS handshake procedure only basic authentication can be done. Most importantly, there is no hook that enables an application to check the remote peer's certificate and authorize it or deny access during the handshake. Authorization can only be done after the handshake has completed. From asking around, NSS seems to provide this ability. OpenSSL, on the other hand, seems NOT to provide that hook either (I could not verify that, though). As such, rsyslog needs to complete the handshake and then either verifies fingerprints or validates the certificate chain and expiration dates and checks the subject name. If these checks show that we are not permitted to talk to the peer, all we can do is close the connection.
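
The fingerprint variant of this post-handshake check can be sketched as follows. This is an illustration in Python, not the rsyslog code (which is written in C on top of GnuTLS). The function names are hypothetical, and the "SHA1:" prefix with colon-separated hex pairs is an assumed notation for the configured fingerprints. The input is assumed to be the peer certificate in DER form, as obtained from the TLS library once the handshake is done (in Python, for example, via `ssl.SSLSocket.getpeercert(binary_form=True)`).

```python
import hashlib
import hmac
from typing import Iterable


def fingerprint(der_cert: bytes) -> str:
    """Format the SHA-1 hash of a DER certificate as 'SHA1:AA:BB:...'.

    The textual notation is an assumption made for this sketch.
    """
    digest = hashlib.sha1(der_cert).hexdigest().upper()
    pairs = [digest[i:i + 2] for i in range(0, len(digest), 2)]
    return "SHA1:" + ":".join(pairs)


def is_peer_permitted(der_cert: bytes, permitted: Iterable[str]) -> bool:
    """Check the peer's fingerprint against the configured list.

    This runs AFTER the handshake has completed. If it returns False,
    all the caller can do is close the connection.
    """
    actual = fingerprint(der_cert)
    return any(hmac.compare_digest(actual, p.upper()) for p in permitted)


if __name__ == "__main__":
    cert = b"stand-in bytes, not a real certificate"
    allowed = [fingerprint(cert)]
    print(is_peer_permitted(cert, allowed))          # known peer
    print(is_peer_permitted(b"other cert", allowed))  # unknown peer
```

Note that the check itself is trivial. The difficulty described above is purely one of timing: by the time it runs, the TLS session is already established.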

If a client is connecting to a server, this is a minor annoyance, as a connection is created and dropped. As we cannot communicate the reason why we close the connection, the server is left somewhat clueless and currently logs a diagnostic warning of a freshly created connection being immediately closed. I will probably change that diagnostic in the future.

Far more problematic is the case when a server fails to authenticate the client. Here, the client has completed the handshake and has already begun to send data when the server closes the connection. As there is no application level acknowledgment in transport-tls, the client does not know when exactly the connection is closed by the server. As a result, the client experiences message loss and may not even notice the failed connection attempt until much later (in most cases, the first message is always successfully sent and only the second message, possibly hours later, will see a problem). Ultimately, this can lead to massive data loss, even to complete data loss. Note that this is not directly caused by transport-tls, but by the underlying plain TCP syslog protocol. I have more details in my blog post on the unreliability of TCP syslog.

Please note that -transport-tls does not specify when peer authentication has to happen. It may happen during the handshake but it is also valid to do it after the handshake. As we have seen, doing it after the handshake causes serious problems. It may be good to at least mention that. If the draft is changed to mandate authentication during the handshake, some implementors will probably not implement it, because the library they use does not support it. Of course, one could blame the library, but for existing products/projects, that will probably not help.

The need to authenticate during the handshake is a major problem for anyone implementing -transport-tls. For rsyslog and for now, I have decided to live with the problem, because I do have the unreliability problem in any case. My long-term goal is to switch to RELP to address this issue and provide TLS support for RELP (RELP uses app-level acks, so there is no problem with authenticating after a successful handshake: I can still emit an "authentication failed" type of message). Please note that the transport-tls specific problem only occurs if the remote client fails to authenticate, and this is what makes it acceptable to me. I expect this situation to be solved quickly (either something is misconfigured or an attack is going on in those cases).

As a side-note, I may see if I can provide a patch for GnuTLS if this turns out to become a major problem.

Besides implementing the required methods, I have also thought about how to create a sufficiently secure system with the least possible effort.

In home environments where the "administrator" has little or no knowledge and uses rsyslog to receive messages from a few low-end devices (typically a low-end router), it is hard to think of any good security settings. Most probably, anonymous "authentication" is the best choice here. It doesn't protect against man-in-the-middle attacks, but it at least provides confidentiality for messages in transit. The key point here is that it does not require any configuration except for enabling TLS and specifying the syslog server's address in the device GUI.

Another good alternative for these environments may actually be auto-generating a self-signed cert on first rsyslogd startup. This is based on the assumption that the device GUI provides a way to view and authorize this certificate (after it has talked to the server and obtained the cert). However, I have to admit that I see only limited advantage in implementing this. After all, if the admin is not able to configure things correctly, do we really expect him to be able to interpret the system logs and review them sufficiently often? I'd say this is at least doubtful and so I prefer to put my implementation efforts to better uses...

The anticipated common use case for rsyslog is an environment where the administrator is at least knowledgeable enough to carry out some basic configuration steps and create certificates if instructed on which tools to run. We do not assume that a full PKI infrastructure is present. As such, we suggest that each organization creates its own CA for rsyslog use. This involves creating one root CA certificate. That certificate is then used to create certificates for each instance of rsyslog that is to be installed. There is one instance per machine. To keep configuration simple, each machine's DNS name is to be used.

All clients shall forward via a @@hostname action, where hostname must be the actual DNS name (as specified in the certificate) and not an IP address or something else. To prevent DNS failures or unavailability of DNS during startup, this name and its IP address may be set in /etc/hosts. With that configuration, the client can validate the server's identity without any extra configuration.

To achieve a similar automatic identity check on the server side (server authenticating client), subject name wildcards are used. It is suggested that all syslog clients are within the same domain. Then, the server can be instructed to accept messages from all of them with a single configuration setting enabling message reception from e.g. "*.example.com". This, together with the fact that the certificate must have been signed with the "rsyslog root CA"'s certificate, provides sufficient proof of identification in most cases.
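
As an illustration of such a wildcard check, here is a minimal Python sketch. It is not rsyslog's matching code, and the function name is hypothetical. It assumes the simplest possible semantics: a leading "*." stands for exactly one leftmost DNS label, comparison is case-insensitive, and the certificate chain has already been validated against the rsyslog root CA before this function is called.

```python
def matches_permitted_peer(peer_name: str, pattern: str) -> bool:
    """Match a certificate subject name against a permitted-peer pattern.

    Assumed semantics for this sketch: "*.example.com" matches
    "client1.example.com", but neither "example.com" itself nor
    "a.b.example.com". Patterns without "*." must match exactly.
    """
    peer = peer_name.lower().rstrip(".")
    pat = pattern.lower().rstrip(".")
    if not pat.startswith("*."):
        return peer == pat
    suffix = pat[1:]  # e.g. ".example.com"
    if not peer.endswith(suffix):
        return False
    leftmost = peer[:-len(suffix)]
    # The wildcard must cover exactly one non-empty label.
    return leftmost != "" and "." not in leftmost


if __name__ == "__main__":
    for name in ("client1.example.com", "example.com", "client1.example.org"):
        print(name, matches_permitted_peer(name, "*.example.com"))
```

Restricting the wildcard to a single label is a deliberately conservative choice for the sketch. A real deployment has to follow whatever matching rules the final text of the draft and the implementation in use actually define.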

In more complex scenarios, more complex authentication can be used. There will be no specific guidelines within the rsyslog documentation on other policies. It is assumed that those who have need for such complex policies know what they need to have, so there is no point in providing advice. From the engine point of view, rsyslog already provides for many advanced uses (e.g. different certificate stores for different sessions) and can easily be extended to provide for others. As I understand it, the latest text proposed by Joe permits me to do all of this under the umbrella of -transport-tls, so the draft is no limiting factor.

The bottom line is that an enterprise-specific rsyslog root CA provides quite automatic configuration of peer credentials while being easy to implement. Wildcard subject name matches play a vital role, because they are the only way to give a server the ability to authorize a wide range of clients in a semi-automatic manner.

IMO, subject name based authentication is easier to set up than fingerprint authentication, at least in a rsyslog-to-rsyslog case. Whether it is easier to set up in a heterogeneous environment depends on the ability of all peers to either generate certificate requests and accept the certificate and/or import prefabricated .pem files. If that is simple enough, subject name based authentication can be used with very low administrative overhead (but integrating it into a full-blown PKI is still another thing...).

To really prove the implementation, of course, at least one other independent implementation is needed. Currently there is none, but it looks like NetBSD's syslogd will be enhanced as a Google Summer of Code project. I am keeping an eye on that project and will try to do interop testing as soon as that is possible. Having one implementation from the "device camp" (e.g. a router) would be extremely useful, though, as that would provide more insight on how easy it will be to configure things via such an administrative interface (not in theory, but in actual implementation - I expect a difference between the two as there are always constraints that must be observed, like the overall application framework and programming tool set).

To wrap things up, -syslog-transport-tls-12+ is sufficiently easy to implement and deploy. IMHO it also provides sufficient extensibility to implement complex scenarios. Some details could be improved (when to authenticate, message format) and a decision on IP based authentication should be finalized. But I don't see any reason to hold it much longer and look forward to it being finalized.
