Tuesday, December 06, 2011

Announcing LogStore

While it is probably a bit early for a "real" announcement, I wanted to say a bit about the project I have been working on over the past days: a dedicated storage for (sys)log messages. It will be available as part of LogTools, the actual project I am working on. A key feature of the LogStore format will be its tamper-proofness. I have wanted to write such an improved storage system for quite a while. However, I have to admit that the recent journald proposal brought more life to it. While the journald proposal aims at building a kind of Windows Event Log for Linux, the LogTools effort is more interested in traditional text log files (but I won't rule out going beyond that in the future).

It is important to note that LogTools, and their storage format LogStore, can protect any kind of text file. Of course, it is great for syslog logs, but you may also secure things like Apache http logs or whatever else you have in text format.

You may remember that I was - and still am - very skeptical about the way journald tries to secure logs via hash chains (be sure to read the comments as well!). While the journald proposal has some technical deficits, I learned that many folks were interested in this kind of hash-chaining, even though they knew it would not be truly tamperproof (I followed a lot of forum posts). Seeing this, I thought it may be useful to provide this level of protection inside some simple-to-use tools, which gave birth to LogTools. Still, my assessment of journald applies to the current LogStore format as well: it is far from being real cryptography, it is insecure, and it may be counter-productive if it generates a false sense of security. However, if one knows the limits, it can provide some useful function. So be sure to know what you do when you use these tools!

For LogStore, I have also planned to employ some real cryptography and cryptographically sign the hash chain. This will actually make the log tamperproof in a very strong sense as long as the signing key is not compromised. That functionality will be addressed once the initial release is out.

The LogStore format itself is deliberately defined in a text-tool friendly way and well documented. For your initial review, I have included its man page below.

The LogTools project is currently available only via the LogTools git. I plan to finish the remaining man pages soon (this week at the latest), and then create distribution tarballs (and hopefully some simple packages, but this remains to be seen).

Feedback on this effort is appreciated.


logstore - enhanced log message storage  


The logstore is an enhanced log message storage. It can be used to store log messages in a way that secures their integrity. Currently, all data is stored in sequential files.
The logstore provides integrity checks by chaining of SHA1-checksums. Each log record (except the first) is hashed, together with the hash of the previous record. As such, manipulations inside the log store can be detected, as long as the checksums of all records are not also recomputed. Sequential logstore files are pure text files.


A sequential logstore is a text file containing variable-length records. Each record consists of recordtype, cryptodesignator and content.
recordtype: A single character designating the type of record. Currently only "m" is defined, which specifies original message text.
cryptodesignator: Variable length, terminated by a colon. This is printable data that has some cryptographic function. For example, for "m"-type records it is the message's chained SHA1 hash.
content: Variable-length content terminated by an LF (\n). For obvious reasons, LF is not permitted within the message. Also, the US-ASCII NUL character (\0) is forbidden in order to prevent trouble with text-based tools. It is suggested that only printable characters are used inside the message, but this is currently not enforced.
The structure is: recordtype cryptodesignator ":" content LF
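As an illustration of the record layout described above, here is a minimal Python sketch that splits one record into its three parts. It is not part of LogTools; the names `Record` and `parse_record` are hypothetical, and the sketch assumes that the first colon ends the cryptodesignator, so any further colons belong to the content.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One logstore record (hypothetical helper type, not a LogTools API)."""
    recordtype: str
    cryptodesignator: str
    content: str


def parse_record(line: str) -> Record:
    """Split a record into recordtype, cryptodesignator and content."""
    # Drop the single terminating LF, if the caller left it in place.
    if line.endswith("\n"):
        line = line[:-1]
    if not line:
        raise ValueError("empty record")
    # LF and NUL are not permitted inside a record.
    if "\n" in line or "\0" in line:
        raise ValueError("LF and NUL are not permitted inside a record")
    recordtype = line[0]
    if recordtype != "m":
        raise ValueError("unknown record type: %r" % recordtype)
    # Assumption: the first colon terminates the cryptodesignator.
    designator, sep, content = line[1:].partition(":")
    if not sep:
        raise ValueError("missing colon after cryptodesignator")
    return Record(recordtype, designator, content)
```

Because only the first colon is treated as a separator, a message such as "a:b" survives parsing unchanged.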


In its current form, logstore provides limited security. While it is possible to verify the correctness of the hash chain, an attacker may simply rewrite the complete file, computing new hashes. However, protection can be (manually) gained by saving the last hash inside the file to a separate location. If so, one can compare the last hash with this previously saved information and check if it is still valid. If someone mangled the store, this will not be the case. Once the authenticity of the last hash has been proven, it is easy to verify the rest of the file. The logreader(1) tool can be used to do that.
In the future, cryptographic signatures based on public key cryptography will be used to protect the hash chain.
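The manual check described above can be sketched roughly as follows. This is not the logreader(1) implementation; `verify_store` is a hypothetical name, and the sketch assumes that each hash is the hex SHA1 of the previous record's hex hash concatenated with the message text (UTF-8 encoded, without colon or LF).

```python
import hashlib


def verify_store(lines, trusted_last_hash):
    """Return True if the chain is intact and ends in the trusted hash.

    `lines` is an iterable of records; `trusted_last_hash` is the hash
    that was saved earlier to a separate, safe location.
    """
    prev = None
    for line in lines:
        if line.endswith("\n"):
            line = line[:-1]
        if not line.startswith("m"):
            return False
        stored, sep, content = line[1:].partition(":")
        if not sep:
            return False
        # First record: hash of the content only; later records:
        # previous hash followed by the content (assumed encoding).
        data = content if prev is None else prev + content
        if hashlib.sha1(data.encode("utf-8")).hexdigest() != stored:
            return False
        prev = stored
    # An internally consistent chain is not enough: an attacker could
    # have recomputed every hash, so the end must match the saved value.
    return prev is not None and prev == trusted_last_hash
```

The final comparison is what gives the check its value: a completely rewritten file still forms a valid chain, but it ends in a different hash than the one saved beforehand.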


The following is a minimalistic 4-line sample of a sequential logstore file.
m04e3324670626451755aa2257a9b92395e26c2e4:line 1
m347d4500ea11fa41800a58972699e57e0c0d7cd7:line 2
m3c329d40e37ae20c475c06bfaab892892ef4579d:line 3
m807cc61d7b04cbc1f048810df9d3a652988d745e:line 4
Note that all records have "m" in the first position, designating them as message records. The cryptographic hash in line one is a SHA1 hash of just line one's content, whereas the hashes for lines n (with n>1) are taken after the concatenation of hash(n-1) and line(n), without the colon. So in order to obtain line two's hash, the following string is hashed:
04e3324670626451755aa2257a9b92395e26c2e4line 2
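A minimal Python sketch of this chaining rule follows. The function names `chain_hash` and `build_records` are hypothetical and not part of LogTools. The sketch assumes the previous hash enters the computation as its hex text and that the message is UTF-8 encoded without the trailing LF; the actual tools may differ in these details, so the output is not guaranteed to reproduce the sample hashes above.

```python
import hashlib


def chain_hash(prev_hash, message):
    """SHA1 over hash(n-1) + line(n); the first record hashes only its content."""
    data = message if prev_hash is None else prev_hash + message
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


def build_records(messages):
    """Turn plain messages into "m"-type records with chained hashes."""
    records = []
    prev = None
    for msg in messages:
        if "\n" in msg or "\0" in msg:
            raise ValueError("LF and NUL are not permitted in a message")
        prev = chain_hash(prev, msg)
        # recordtype "m", then the hash, a colon, and the message text
        records.append("m" + prev + ":" + msg)
    return records


if __name__ == "__main__":
    for record in build_records(["line 1", "line 2", "line 3", "line 4"]):
        print(record)
```

Since every hash depends on its predecessor, changing one message invalidates the hashes of all records that follow it.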


logreader(1), logwriter(1), liblogtools(3)


Rainer Gerhards (rgerhards@adiscon.com)


Igor said...


are you really sure you aren't wasting your time? Don't get me wrong, but a secure storage on a rewritable medium is fiction.

You said it, too:
"As such, manipulations inside the log store can be detected, as long as the checksums of all records are not also recomputed."

Isn't that what an attacker will do? Recompute all records to hide his traces?

That's what will happen every time: if you know there's a way, you will find it...

I understand the idea, I also like it. But because I don't see a way how this could work, why do you spend so much time on it?

What's the benefit?

I think you have heard about the German Staatstrojaner. It is important for legal usage, that you provide some kind of "revisionssicher" storage.

You wrote it, the CCC has proven it, others have proven it too - this isn't possible with rewritable storage media.

So again: What's the benefit? Why do you spend so much time on it?

Aren't you creating an illusion of security?

Please, don't get me wrong. I like your work, but I really don't understand how you could get on that train...

Currently I would recommend: Leave it asap.

larstobi said...

While it may not be possible to make it completely tamper proof, I think a major point is to make it hard to manipulate logs. This can potentially make it very hard as long as the secret key is kept secret. There are Hardware Security Modules (HSM) that can make retrieving the key very hard.

Rainer said...

Thanks for the comments. Indeed, the checksum-only thing is broken. But this is meant only as a starter, and maybe to see how much interest there actually is. I, too, was surprised how many people seemed to like the simple checksum chain that was proposed with journald. But in the light of that, I'd say there is some value in feeding this need (and there were some arguments that back this -- guess I need to write yet another blog post ;)).

HOWEVER, you can get such a message store on rewritable media "sufficiently" secure. By "sufficient" I meant secure in the same level as HTTPS, online banking, PGP, etc. are secure. Think about PGP signed mail. There is a big difference between a simple hash chain and a digital signature - less in the code needed to do it, but much in security. While it is easy to recompute hashes (as long as no record of previous hashes was saved), it is "impossible" to mangle the signature ("impossible", again, under the same constraints we do all of our crypto stuff - so if someone finds a polynomial time algorithm to factor the product of two large primes, no security remains for the usual algorithms).

The idea is to extend LogStore with signature records. This means at closure (or after every n records written), a record is added that contains the latest hash, and *that* record is digitally signed (by public/private key cryptography). In that case, you can still recompute the checksums, but then the checksums will not match the signed record (or the signature will be broken, if the latter is modified as well).

So, yes, it is possible to make a log store "revisionssicher" (audit-grade). That will also be of benefit when the log is used as evidence in court.

Igor said...


Rainer wrote:
> HOWEVER, you can get such a message store on rewritable media
> "sufficiently" secure. By "sufficient" I meant secure in the
> same level as HTTPS, online banking, PGP, etc. are secure. Think
> about PGP signed mail.

Mhh... maybe we have to define the wanted result first.

When you are talking about HTTPS, you are talking about a secured connection. This security is based on the fact that you have to trust the remote's SSL key.
I won't go into the problems PKI is currently suffering, but all you do is trusting that you are talking to the system you want to talk to. The SSL key is your "proof", but this proof is nothing:

- You would need to know the fingerprint of every certificate in the chain to verify

- And even if you know every fingerprint, you cannot be sure whether some bad guy has stolen the key

So what does HTTPS do? It just secures the connection between you and the endpoint you are connected to, using the known key. But it cannot guarantee that the current key user is the designated key owner. Everyone should keep this in mind.

The same applies to PGP:
A PGP signature is nothing. It's just saying, that the person who created the signature had access to the required private key and the passphrase at creation time.

As you have to trust the SSL key integrity, you have to trust the PGP's key integrity.

So what do you (and others) want to achieve/prove with that feature?

Rainer Gerhards said...

As I said, you can prove that the signature is correct assuming that the signing key was not compromised. If you assume that keys are compromised then you are of course right: with a compromised key you cannot have any security at all. So under this assumption, you can never build any secure system.

My reference to https was to talk about the strength of cryptography. It is not a signature protocol.

larstobi said...

IANAL, but for the purpose of presenting a log as evidence in a court room, there is the question of what is secure enough. Hand signatures on paper are considered secure enough. DNA sampled by a police officer is considered secure enough. But you can never be completely certain that the police officer hasn't tampered with the DNA sample. And it is possible to forge a hand signature on paper.

With that in mind, if it is very difficult to retrieve the secret private key, then it may be secure enough for the court. If one can point to established procedures and security measures, and they are approved and trusted by the court, then the log can be used as proof.

With this in mind, I believe there is sufficient reason for creating a PKI-secured logging system. In my opinion it's not a waste of time.
