rsyslog's first signature provider: why Guardtime?

The new has already spread: rsyslog 7.3 is the first version that natively supports log signatures, and does so via a newly introduced open signature provider interface. A lot of my followers quickly realized that and begun to play with it. To make sense of the provider interface, one obviously also needs a signature provider. I selected the keyless signature infrastructure (KSI), which is being engineered by the OpenKSI group. Quickly, I was asked what were the compelling reasons to provide the first signature provider for this technology.

So rather than writing many mails, I thought I blog about the reason ;)

We need to dig a little back in history to understand the decision. I looked at log signature a long time ago, I think my interest started around 2001 or 2002, as part of the IETF syslog standardization efforts. I have to admit that at this time there was very limited public interest in signed logs, even though without signatures it is somewhat hard to prove the correctness of log records. Still, many folks successfully argued that they could proof their process was secure, and that proof seemed to be sufficient, at least at that time.

A core problem with log record signatures is that a single log record is relatively small, typically between 60 bytes and 2k, with typical Linux logs being between 60 and 120 bytes. This makes it almost impossible to sign a single log record, as the signature itself is much larger. Also, while a per-record signature can proof the validity of the log record, it cannot proof that the log file as such is complete. This second problem can be solved by a method that is called “log chaining“, where each of the log records is hashed and the previous hash is also used as input for hashing the current record. That way, removing or inserting a log record will break the hash chain, and so tampering can be detected (of course, tampering within a log record can also be easily detected, as it obviously will invalidate the chain as well).

This method is actually not brand new technology but ages old. While it sounds perfect, there are some subtle issues when it comes to logging. First and foremost, we have a problem when the machine that stores the log data itself is being compromised. For the method to work, that machine must both create the hashes and sign them. To do so, it must have knowledge of the secrets being used in that process. Now remember from your crypto education that any secrets-based crypto system (even PKI) is only secure as long as the secrets have not been compromised. Unfortunately, if the loghost itself has been compromised, that precondition does not hold any longer. If someone got root-level access, he or she also has access to the secrets (otherwise the signature process could also not access them, right?).

You may now try as hard as you like, but if everything is kept on the local machine, a successful attacker can always tamper the logs and re-hash and re-sign them. You can only win some advantage if you ship part of the integrity proof off the local system – as long as you assume that not all of the final destinations are compromised (usually a fair assumption, but sometimes questionable if everything is within the same administrative domain).

The traditional approach in logging is that log records are being shipped off the machine. In IETF efforts we concentrated on that process and on the ability to sign records right at the originator. This ultimately resulted in RFC 5848, “signed syslog messages”, which provides the necessary plumbing to do that. A bit later, at Mitre’s CEE effort, we faced the same problem and discussed a similar solution. Unfortunately, there is a problem with that approach: in the real-world, in larger enterprises, we usually do not have a single log stream, where each record is forwarded to some final destinations. Almost always, interim relays filter messages, discard some (e.g. noise events) and transform others. The end result is that a verification of the log stream residing at the central loghost will always fail. Note that this is not caused by failure in the crypto method used – it’s a result of operational needs and typical deployments. Those interested in the fine details may have a look at the log hash chaining paper I wrote as an aid during CEE discussions. In that paper, I proposed as an alternative method that just the central log host signs the records. Of course, this still had the problem of central host compromise.

Let me sum up the concerns on log signatures:

signing single log records is practically impossible
relay chains make it exceptionally hard to sign at the log originator and verify at the central log store destination
any signature mechanism based on locally-stored secrets can be broken by a sufficiently well-done attack.

These were essentially the points that made me stay away from doing log signatures at all. As I had explained multiple times in the past, that would just create a false sense of security.

The state of all the changed a bit after the systemd journal was pushed into existence with the promise that it would provide strong log integrity features and even prevent attacks. There was a lot of vaporware associated with that announcement, but it was very interesting to see how many people really got excited about it. While I clearly described at that time how easy the system was to break, people begun to praise it so much that I quickly developed LogTools, which provided exactly the same thing. The core problem was that both of them were just basic log hash chains, and offered a serious amount of false sense of security (but people seemed to feel pampered by that…).

My initial plan was to tightly integrate the LogTools technology into rsyslog, but my seriousness and my concerns about such a bad security took over and I (thankfully) hesitated to actually do that.

At this point of the writing credits are due to the systemd journal folks: they have upgraded their system to a much more secure method, which they call forward secure sealing. I haven’t analyzed it in depth yet, but it sounds like it provides good features. It’s a very new method, though, and security folks for a good reason tend to stick to proven methods if there is no very strong argument to avoid them (after all, crypto is extremely hard, and new methods require a lot of peer review, as do new implementations).

That was the state of affairs at the end of last year. Thankfully I was approached by an engineer from Guardtime, a company that I learned is deeply involved in the OpenKSI approach. He told me about the technology and asked if I would be interested in providing signatures directly in rsyslog. They seemed to have some experience with signing log files and had some instructions and tools available on their web site as well as obviously some folks who actually used that.

I have to admit that I wasn’t too excited at that time and very busy with other projects. Anyhow, after the Fedora Developer Conference in February 2013 I took the time to have a somewhat deeper look at the technology – and it looked very interesting. It uses log chains, but in a smart way. Instead of using simple chains, it uses so-called Merkle trees, a data structure that was originally designed for the purpose of signing many documents very efficiently. They were invented back in the 1970’s and are obviously a quite proven technology. An interesting fact about the way Merkle trees are used in the OpenKSI approach is that they permit to extract a subset of log information and still prove that this information is valid. This is a very interesting property when you need to present logs as evidence to the court but do not want to disclose unrelated log entries.

While all of this is interesting, the key feature that attract my interest is the “keyless” inside the KSI name. If there is no key, an attacker can obviously not compromise it. But how will signing without a key work? I admit that at first I had a hard time understanding that, but the folks at Guardtime were very helpful in explaining how it works (and had a lot of patience with me ;)). Let me try to explain in a nutshell (and with a lot of inaccuracies):

The base idea goes back to the Merkle tree, but we can go even more basic and think of a simple hash chain. Remember that each hash is depending on its predecessor and the actual data to be hashed. If you now create a kind of global hash chain where you submit everything you ever need to “sign”, this can form a chain in itself. Now say you have a document (or log) x that you want to sign. You submit x to the chaining process and receive back a hash value h. Now you store that h, and use it as a signature, a proof of integrity of x. Now assume that there is a way that you give out x and h and someone can verify that x participated in the computation of the “global” log chain. If so, and if x’s hash stil matches the value that was used to generate h, than x is obviously untampered. If we further assume that there is a fixed schedule on which some “anchor hashes” are being produced, and assume that we can track to which such anchor hash h belongs to, we can even deduce at what time x was signed. In essence, this is what Guardtime does. They operate the server infrastructure that does this hashing and timestmaping. The key hashes are generated once a second, so each signature can be tracked very precisely to the time it was generated. This is called “linked timestamping” and for a much better description than I have given just follow that link ;)

The key property from my PoV is that with this method, no secrets are required to sign log records. And if there is no secret, an attacker can obviously not take advantage of the secret to hide his tracks. So this method actually provides a very good integrity proof and does not create a false sense of security. This removed a major obstacle that always made me not like to implement log signatures.

The alert reader may now ask: “Doesn’t that sound too good? Isn’t there a loophole?“. Of course, and as always in security, one can try to circumvent the mechanism. A very obvious attack is that an attacker may still go ahead and modify the log chain, re-hash it, and re-submit it to the signature process. My implementation inside rsyslog already makes this “a bit” hard to do because we keep log chains across multiple log files and the tooling notifies users if a chain is re-started for some reason (there are some valid reasons). But the main answer against these types of attacks is that when a re-hashing happens, the new signature will bear a different timestamp. As such, if comparing the signature’s timestamp and the log record timestamps, tampering can be indicated. In any case, I am sure we will see good questions and suggestions on how to improve my code. What makes me feel about the method itself is that it

bases on well-known and proven technology (with the Merkle tree being one foundation)
is in practical use for quite some while
ample of academic papers exist on it
there is a vital community driving it forward and enhancing the technology

So everything looks very solid and this is what triggered my decision to finally implement log signature directly into the rsyslog core. And to do it right, I did not create a KSI-specific interface but rather modeled a generic signature provider interface inside rsyslog. So if for whatever reason you don’t like the KSI approach, there is a facility inside rsyslog that can be used to easily implement other signature providers. Due to the reasoning given above, however, I have “just” implemented the KSI one for now.

As a side-note, please let me say that the Guardtime folks were very easy to work with and very cooperative, especially when I asked my initial though and very skeptic questions. They also are very open (lot’s of their tooling is open source) and … smart ;) During the project, our work relationship grew very much and the even managed to get me in direct contact with Ahto Buldas, who is a key inventor behind that technology (and a very impressive guy!). I am looking forward to continue to work with them in the future. In April 2013 they invited me to their Estonian office, where we had three days to discuss things like the network-signature problem (with relays discarding logs) and related problems like multi-tenancy and centralized virtual machine logging. These were very good discussions and some cool technology is probably coming out of them. I am tempted to write a bit more about these topics (as far as they have been tackled so far), but this posting is already rather long, so let’s save this for another one ;)