Friday, May 09, 2008

more on syslog TLS, policies and IETF efforts...

I am still working hard on TLS support, but on the design level. Here, I'd like to reproduce a message I sent to the IETF syslog WG's mailing list. It outlines a number of important points when it comes to practical use of TLS with syslog.

I am quite happy that syslog-transport-tls got new momentum right at the time when I finished my TLS implementation in rsyslog and turned into fine-tuning it. The IETF discussion on authentication and policies actually is touching right those places where it in practice really hurts. For the initial TLS implementation, I decided to let rsyslog work in anonymous mode, only. Iit was clear that -transport-tls section 4 as in version 11 would not survive - just as we have now seen).

The next steps in rsyslog are to enable certificate based access policies and this is exactly what the IETF discussion is focusing on. Of course, I try to finish design and try to affect the standard in a positive way, so that the rsyslog implementation can both be standards-compliant and useful in practice.

And now - have an interesting read with my mailing list post. Feedback is highly appreciated.

Rainer


Hi all,

I agree to Robert, policy decisions need to be separated. I CC Pasi because my comment is directly related to IESG requirements, which IMHO cannot be delivered by *any* syslog TLS document without compromise [comments directly related to IESG are somewhat later, I need to level ground first].

Let me tell the story from my implementor's POV. This is necessarily tied to rsyslog, but I still think there is a lot of general truth in it. So I consider it useful as an example.

I took some time yesterday to include the rules laid out in 4.2 into rsyslog design. I quickly came to the conclusion that 4.2. is talking about at least two things:

a) low-level handshake validation
b) access control

In a) we deal with the session setup. Here, I see certificate exchange and basic certificate validation (for example checking the validity dates). In my current POV, this phase ends when the remote peer can positively be identified.

Once we have positive identification, b) kicks in. In that phase, we need to check (via ACLs) if the remote peer is permitted to talk to us (or we are permitted to talk to it). Please note that from an architectural POV, this can be abstracted to a higher layer (and in rsyslog it probably will). For that layer, it is quite irrelevant if the remote peer's identity was obtained via a certificate (in the case of transport-tls), a simple reverse lookup (UDP syslog), SASL (RFC 3195) or whatever. What matters is that the ACL engine got a trusted identify from the transport layer and verifies that identity [level of trust varies, obviously]. Most policy decisions happen on that level.

There is some grayish between a) and b). For example, I can envision that if there is a syslog.conf rule (forward everything to server.example.net)

*.* @@server.example.net

The certificate name check for server.example.net (using dNSName extension) could probably be part of a) - others may think it is part of b).

Also, even doing a) places some burden onto the system, like the need to have trust anchors configured in order to do the validation. This hints at at least another sub-layer.

I think it would be useful to spell out these different entities in the draft.


Coming back to policy decisions, one must keep in mind that the IESG explicitly asked for those inside the document. This was done based on the -correct- assumption that today's Internet is no longer a friendly place. So the IESG would like to see a default policy implemented that provides at least a minimum acceptable security standard. Unfortunately, this is not easy to do in the world of syslog. For the home users, we cannot rely on any ability to configure something. For the enterprise folks, we need to have defaults that do not get into their way of doing things [aka "can be easily turned off"]. There is obviously much in between these poles, so it largely depends on the use case. I have begun a wiki page with use cases and hope people will contribute to it. It could lead us to a much better understanding of the needs (and the design decisions that need to be made to deliver these). It is available at

http://wiki.rsyslog.com/index.php/TLS_for_syslog_use_cases

After close consideration, I think the draft currently fails on addressing the two use cases define above properly. Partly it fails because it is not possible under the current IESG requirement to be safe by default. We cannot be fully safe by default without configuration, so whatever we specify will fail for the home user.

A compromise may be to provide "good enough" security in the default policy. I see two ways of doing that: one is to NOT address the Masquerade and Modification threats in the default policy, just the Disclosure threat. That leads us to unauthenticated syslog being the default (contrary to what is currently implemented) [Disclosure is addressed in this scenario as long as the client configs are not compromised, which I find sufficiently enough - someone who can compromise the client config can find other ways to get hold of the syslog message content].

An alternative is to use the way HTTPS works: we only authenticate the server. To authenticate, we need to have trusted certificate inside the server. As we can see in HTTPS, this doesn't really require PKI. It is sufficient to have the server cert singed by one of few globally trusted CAs and have this root certificates distributed with all client installations as part of their setup procedure. This is quite doable. In that scenario, a client can verify a server's identity and the above sample (*.* @server.example.net) could be verified with sufficient trust. The client, however, is still not authenticated. However, the threats we intended to address are almost all addressed, except for the access control issue which is defined as part of the Masquerade threat (which I think is even a different beast and deserves its own threat definition now that I think about it). In short we just have an access control issue in that scenario. Nothing else.

The problem, however, is that the server still needs a certificate and now even one that, for a home user, is prohibitively expensive. The end result will be that people turn off TLS, because they neither know how to obtain the certificate nor are willing to trade in a weekend vacation for a certificate ;) In the end result, even that mode will be less useful than anonymous authentication.

The fingerprint idea is probably a smart solution to the problem. It depends on the ability to auto-generate a certificate [I expressed that I don't like that idea yesterday, but my thinking has evolved ;)] OR to ship every device/syslogd with a unique certificate. In this case, only minimal interaction is required. The idea obviously is like with SSH: if the remote peer is unknown, the user is queried if the connection request is permitted and if the certificate should be accepted in the future. If so, it is added permanently to the valid certificate store and used in the future to authenticate requests from the same peer. This limits the security weakness to the first session. HOWEVER, the problem with syslog is that the user typically cannot be prompted when the initial connection happens (everything is background activity). So the request must actually be logged and an interface be developed that provides for user notification and the ability to authorize the request.

This requires some kind of "unapproved certificate store" plus a management interface for it. Well done, this may indeed enable a home user to gain protection from all three threats without even knowing what he really does. It "just" requires some care in approving new fingerprints, but that's a general problem with human nature that we may tackle by good user interface desig but can't solve from a protocol point of view.

The bad thing is that it requires much more change to existing syslogd technology. That, I fear, reduces acceptance rate. Keep in mind that we already have a technically good solution (RFC 3195) which miserably failed in practice due to the fact it required too much change.

If I look at *nix implementations, syslogd implementers are probably tempted to "just" log a message telling "could not accept remote connection due to invalid fingerprint xx:xx:..." and leave it to the user to add it to syslog.conf. However, I fear that for most home setups even that would be too much. So in the end effect, in order to avoid user hassle, most vendors would probably default back to UDP syslog and enable TLS only on user request.

From my practical perspective this sounds even reasonable (given the needs and imperfections of the real world...). If that assessment is true, we would probably be better off by using anonymous TLS as the default policy, with the next priority on fingerprint authentication as laid out above. A single big switch could change between these two in actual implementations. Those users that "just want to get it running" would never find that switch but still be somewhat protected while the (little) more technically aware can turn it to fingerprint authentication and then will hopefully be able to do the remaining few configuration steps. Another policy is the certificate chain based policy, where using public CAs would make sense to me.

To wrap it up:

1. I propose to lower the default level of security
for the reasons given.
My humble view is that lower default security will result in higher
overall security.

2. We should split authentication policies from the protocol itself
... just as suggested by Robert and John. We should define a core
set of policies (I think I described the most relevant simple
cases above, Robert described some complex ones) and leave it
others to define additional policies based on their demand.

Policies should go either into their own section OR into their own documents. I have a strong favor of putting them into their own documents if that enables us to finally finish/publish -transport-tls and the new syslog RFC series. If that is not an option, I'd prefer to spend some more work on -transport-tls, even if it delays things further, instead of producing something that does not meet the needs found in practice.

Rainer


> -----Original Message-----
> From: syslog-bounces@ietf.org [mailto:syslog-bounces@ietf.org] On
> Behalf Of robert.horn@agfa.com
> Sent: Thursday, May 08, 2008 5:53 PM
> To: Joseph Salowey (jsalowey); syslog@ietf.org
> Subject: Re: [Syslog] I-D Action:draft-ietf-syslog-transport-tls-12.txt
>
> Section 4.2 is better, but it still needs work to separate the policy
> decisions from the protocol definition. Policy decisions are driven by
> risk analysis of the assets, threats, and environment (among other
> things). These are not uniform over all uses of syslog. That makes it
> important to separate the policy from the protocol, in both the
> specifications and in the products.
>
> In the healthcare environment we use TLS to protect many of our
> connections. This is both an authentication protection and a
> confidentiality protection. The policy decisions regarding key
> management
> and verification will be very similar for a healthcare use of syslog.
> Some
> healthcare sites would reach the same policy decision as is in 4.2, but
> here are three other policy decisions that are also appropriate:
>
> Policy A:
> The clients are provided with their private keys and the public
> certificates for their authorized servers by means of physical media,
> delivered by hand from the security office to the client machine
> support
> staff. (The media is often CD-R because it's cheap, easy to create,
> easy
> to destroy, and easy to use.) During TLS establishment the clients use
> their assigned private key and the server confirms that the connection
> is
> from a machine with one of the assigned private keys. The client
> confirms
> that the server matches one of the provided public certificates by
> direct
> matching. This is similar to the fingerprint method, but not the same.
> My
> most recent experience was with an installation using this method. We
> had
> two hours to install over 100 systems, including the network
> facilities.
> This can only be done by removing as many installation schedule
> dependencies as possible. The media method removed the certificate
> management dependencies.
>
> Policy B:
> These client systems require safety and functional certification
> before
> they are made operational. This is done by inspection by an acceptance
> team. The acceptance team has a "CA on a laptop". After accepting
> safety
> and function, they establish a direct isolated physical connection
> between
> the client and the laptop. Then using standard key management tools,
> the
> client generates a private key and has the corresponding public
> certificate generated and signed by the laptop. The client is also
> provided with a public certificate for the CA that must sign the certs
> for
> all incoming connections.
>
> During a connection setup the client confirms that the server key has
> been
> signed by that CA. This is similar to a trusted anchor, but not the
> same.
> There is no chain of trust permitted. The key must have been directly
> signed by the CA. During connection setup the server confirms that the
> client cert was signed by the "CA on a laptop". Again, no chain of
> trust
> is permitted. This policy is incorporating the extra aspect of "has
> been
> inspected by the acceptance team" as part of the authentication
> meaning.
> They decided on a policy-risk basis that there was not a need to
> confirm
> re-inspection, but the "CA on a laptop" did have a revocation server
> that
> was kept available to the servers, so that the acceptance team could
> revoke at will.
>
> Policy C:
> This system was for a server that accepted connections from several
> independent organizations. Each organization managed certificates
> differently, but ensured that the organization-CA had signed all certs
> used for external communications by that organization. All of the
> client
> machines were provided with the certs for the shared servers (by a
> method
> similar to the fingerprint method). During TLS connection the clients
> confirmed that the server cert matched one of the certs on their list.
> The
> server confirmed that the client cert had been signed by the CA
> responsible for that IP subnet. The server was configured with a list
> of
> organization CA certs and their corresponding IP subnets.
>
> I do not expect any single policy choice to be appropriate for all
> syslog
> uses. I think it will be better to encourage a separation of function
> in
> products. There is more likely to be a commonality of configuration
> needs
> for all users of TLS on a particular system than to find a commonality
> of
> needs for all users of syslog. The policy decisions implicit in
> section
> 4.2 make good sense for many uses. They are not a complete set. So a
> phrasing that explains the kinds of maintenance and verification needs
> that are likely is more appropriate. The mandatory verifications can
> be
> separated from the key management system and kept as part of the
> protocol
> definition. The policy decisions should be left as important examples.
>
> Kind Regards,
>
> Robert Horn | Agfa HealthCare
> Research Scientist | HE/Technology Office
> T +1 978 897 4860
>
> Agfa HealthCare Corporation, 100 Challenger Road, Ridgefield Park, NJ,
> 07660-2199, United States
> http://www.agfa.com/healthcare/
> Click on link to read important disclaimer:
> http://www.agfa.com/healthcare/maildisclaimer
> _______________________________________________
> Syslog mailing list
> Syslog@ietf.org
> https://www.ietf.org/mailman/listinfo/syslog

No comments: