There is a move afoot among Web browser developers to remove an authentication mechanism that many enterprises depend on: SSL/TLS with X.509 client certificates. Client certificate support, along with related functionality for enrollment of clients, was first implemented in Netscape 4 (it’s that old), but since browser developers don’t work for big enterprises (leaving IE aside for the moment) they never exercise this functionality — and as a result it is frequently broken and the UX can be pretty horrible. Nonetheless, it’s very important functionality for a lot of enterprises, and I’ve been asked to write a bit about my experience. I have a unique perspective: I work at one of the host institutions for the World Wide Web Consortium, and not only do I operate an enterprise certificate authority, I have actually written an enterprise certificate authority that is in daily production use.
Some legitimate complaints
At work, we depend extensively on client certificates for authentication, and we consider this functionality extremely important, for reasons I’ll get into in detail below. Nonetheless, we receive numerous complaints from users about the use of certificates, and I wanted to be up-front about some of these complaints and explain the legitimate reasons for them. Here are the biggest complaints we receive:
It doesn’t work in $BROWSER!
This has always been true for IE, but almost nobody cares about IE any more: we have very few Windows users. With a great deal of effort, I made it work in IE, once, until Microsoft broke the only documented interface for doing certificate enrollment with the release of Windows XP SP2 (which gives you some idea how long ago this was). Microsoft has some specific requirements for certificate enrollment, driven by the legitimate needs of their enterprise customers, which made it impossible to use the enrollment mechanism as Netscape originally specified it. But their alternative required a large amount of proprietary browser-side programming, and of course it makes the usual assumption of Web people that you have a team of twenty full-time devs working on every site, so it’s OK to flash-cut programming interfaces with no transition period, warning, or even conversion documentation.
Today, we usually hear this complaint with regard to Google Chrome. Chrome uses the “native” certificate store for each platform it runs on, but the Chrome developers never fully implemented the enrollment protocol, so it can’t parse what it gets back from our CA, and it doesn’t explain why it didn’t work. (I suppose “WCBA to implement the full standard and we’re Google so we don’t have to follow standards” is a bit much to ask for an error message, but wouldn’t some honesty be nice?)
It works on $SITE_X but it doesn’t work on $SITE_Y!
There are several ways that this can happen, one of which is a clear failure on our part, but the others are just bad browser implementation. We screw this up as a result of not having fully migrated all of our services to our modern configuration management system, so some Web servers are simply misconfigured (don’t have the proper certificate chain, have an expired copy of the Client CA certificate, or do other things that just don’t work reliably).
The browser people, on the other hand, typically break this by doing something evil: browsers now love to go off and fetch random URLs they happen to have seen at times when users aren’t expecting them, causing the client-certificate-selection dialog to pop up when the user is doing something totally unrelated. The confused user will often cancel out of the dialog, or select a certificate related to the thing they actually are trying to do, which the browser will “helpfully” cache for the rest of the browser session — and certificate selection state can only be cleared by exiting the browser! So (to give an example we recently discovered), a user who goes to our main Web site might look in the “Resources” menu for a link to a staff-only resource that is certificate protected. If that user is using a recent version of Firefox, merely hovering over one of these links will cause the browser to open a connection to the server (which is evil and vile and wrong for a whole bunch of reasons, not least of them privacy and security), which will cause a certificate-selection dialog to pop up for no apparent reason. If the user hits “cancel”, and they have the “remember this choice” checkbox selected (the alternative, “ask every time”, is even more horrible), they will not be able to access that service using their certified identity for the life of the browser session. (Many of our sites allow guest access, so they may be able to see the site but find themselves unable to do anything. But even on sites that absolutely require a certificate, Firefox will never present the certificate selector again, no matter what error is returned by the server, so the user is just stuck.)
It worked yesterday but now it doesn’t!
As something of a corollary to the previous problem, browsers tend to deal horribly with the fact that certificates expire, and not all that well with the fact that clients may have multiple identities attested by certificates from different authorities. Our experience (again, mostly in Firefox) is that browsers will cheerfully allow users to select certificates that are flat-out invalid, and then fail, even when the server clearly indicates that this isn’t going to work. Some other browsers (not Firefox in this case) ignore the list of acceptable client CAs sent by the server and allow the client to select a certificate that the server cannot possibly validate or that, even if the server can validate it, contains no useful information about the user’s identity. (I’m looking at you, Safari!) It is perfectly legitimate to hang on to expired certificates (and their corresponding private keys), particularly if they are marked (as ours are) as being valid for use in email encryption, but users should never be invited to initiate a new communication, whether HTTPS or email, using an expired certificate.
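To make the expected behavior concrete, here is a minimal sketch of the filtering a certificate-selection dialog could apply before showing anything to the user. It is not taken from any browser: the `ClientCert` class, the `offerable_certs` function, and the sample names are all hypothetical, and a real certificate store would carry far more detail than these four fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ClientCert:
    """Hypothetical, simplified view of one certificate in the user's store."""
    subject: str
    issuer: str            # name of the issuing CA
    not_before: datetime   # start of the validity period
    not_after: datetime    # end of the validity period


def offerable_certs(store, acceptable_issuers, now):
    """Return only the certificates worth offering for a new connection.

    acceptable_issuers is the set of CA names the server sent with its
    certificate request; an empty set is treated as "no restriction".
    """
    return [
        cert for cert in store
        if cert.not_before <= now <= cert.not_after
        and (not acceptable_issuers or cert.issuer in acceptable_issuers)
    ]


# Example: two identities from different CAs, one of them already expired.
utc = timezone.utc
store = [
    ClientCert("alice (parent org)", "Parent CA",
               datetime(2014, 8, 1, tzinfo=utc), datetime(2015, 7, 31, tzinfo=utc)),
    ClientCert("alice (local)", "Local CA",
               datetime(2015, 3, 1, tzinfo=utc), datetime(2016, 2, 29, tzinfo=utc)),
]
now = datetime(2015, 9, 1, tzinfo=utc)
offered = offerable_certs(store, {"Local CA", "Parent CA"}, now)
print([cert.subject for cert in offered])
```

The expired certificate stays in the store (so old encrypted email can still be read); it is simply never offered for a new connection.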
We have this issue perhaps more than some other enterprises because our users have two distinct identities, each of which has an associated client certificate, and these certificates are issued by different CAs and expire at different times. (Our parent organization issues certs that expire annually on July 31; our CA — the one that I wrote — issues certs that expire 365 days after issue, or when the user’s account expires, whichever comes first.) We could and probably should fix this, by aligning our certificate (and account) expiration policies with our parent’s, but it does highlight the confusion caused by poor browser implementations.
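The two expiration policies can be written down in a few lines, which makes it easier to see why the dates drift apart. This is only a sketch: the function names are made up for illustration, and the handling of a certificate issued exactly on July 31, and of an account with no expiration date, are assumptions rather than documented policy.

```python
from datetime import date, timedelta


def parent_expiry(issued):
    """Parent-organization policy: certificates expire on July 31.

    Assumption: a certificate issued on or before July 31 expires that
    same year; one issued later expires on July 31 of the next year.
    """
    july_31 = date(issued.year, 7, 31)
    if issued <= july_31:
        return july_31
    return date(issued.year + 1, 7, 31)


def local_expiry(issued, account_expires=None):
    """Local CA policy: 365 days after issue, or when the account
    expires, whichever comes first. None means no account expiry
    (an assumption made for this sketch)."""
    by_age = issued + timedelta(days=365)
    if account_expires is None:
        return by_age
    return min(by_age, account_expires)


# A user who enrolls with both CAs on the same day gets two different
# expiration dates.
issued = date(2014, 3, 1)
print(parent_expiry(issued), local_expiry(issued))
```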
So why do I think certificates are a good thing?
With all these problems, you can imagine that there’s a lot of pressure to stop using client certificates for authentication. Nonetheless, we continue to use them, and if the browser vendors can be persuaded not to break them, we will probably keep on doing so for some time in the future. We do have reasons for this, which I expect a lot of VC-fueled Web developers probably won’t understand (or will chuckle and say “oh, I remember when people used to do that…”). But client certificates really do solve real problems for us, for which there simply is no alternative.
Our users are subject to many phishing and related social-engineering attacks, not all of which can be detected or prevented by our email system. Client certificates are supposed to be impractical to forge, and the browser-based UX for certificate enrollment, although it sucks mightily in many ways, is very difficult to counterfeit without directly compromising the browser itself. We regularly and repeatedly emphasize to our users that they should never under any circumstances give their password to any Web site, or store it in any sort of persistent storage, except when requesting a client certificate from our CA. We don’t even implement a password-change Web page, as our parent organization does: if you forgot your password, you’re going to have to bring a photo ID to the helpdesk to change it. If the users do as we ask (which of course we can’t ever guarantee), the CA is the only Web server that will ever have access to their actual login passwords, and since access to the CA is carefully controlled, the possibility of compromise is limited. Thus far we have never seen a phishing attack, not even a spearphishing attack, that actually walked the user through exporting their private key from the certificate store and transmitting it to an attacker. (Not to suggest that such things couldn’t happen, but even our least-sophisticated users would likely realize something was up. This is also the threat model which hardware security modules and smart cards were designed to address.)
Shared Web servers
The vast majority of our Web content is served directly from shared network filesystems by general-purpose Web servers running Apache. Users need to limit access to some of this content, and it is simply not safe for these servers to allow users to use their normal passwords for authentication. The combination of Apache .htaccess files and client certificates allows us to give all our users individual control over who is able to access their internal-use content, without expecting them to securely set up and manage password-based authentication — even if we didn’t know full well that most of them don’t have that sort of expertise. (We have a difficult enough time when they install third-party software that expects them to do this; our officially supported Web server platform shouldn’t lead them down a path we know to be a serious problem.) Many alternative solutions assume that Web servers run a single application with a unitary model of access control, such that URL-space can be neatly partitioned into “public” and “private” in a way that simply doesn’t fit this use case at all.
Offline, third-party verification
Unlike nearly every alternative that’s been proposed, X.509 certificates by their very nature allow offline verification. A Relying Party can validate a certificate without revealing to the Identity Provider that it is doing so, or that any particular user is being authenticated. This is an important and powerful privacy protection — even in the case, as in our organization, where our CPS explicitly says “if you aren’t part of our organization you shouldn’t be relying on our certificates”. Nobody needs my consent, or even my cooperation, to validate the certificates I issue: all they need is a copy of my CA’s certificate, which they can easily get from the CA itself or from anyone who already holds a client cert. (There are some legitimate issues surrounding certificate revocation which, depending on your security requirements, may require online checking — to preserve privacy, browsers could implement OCSP stapling for client certificates. We don’t implement OCSP, and while we do publish CRLs for all of our CAs, I’m not aware of any RPs, internal or otherwise, who actually consult them. In general, certificate validity is not a substitute for authorization checks, although many of our services are open to all authenticated users.)
User control of authentication identity
It already works
My entire enterprise CA, including revocation, enrollment, and CRL management, is implemented in less than 4,000 lines of Ruby code. This was only possible because the important parts — the basic X.509 functionality, SSL/TLS, and certificate enrollment — are implemented in established software systems with stable programming interfaces. Certificate enrollment, in particular, is a minefield: absent the Netscape <KEYGEN> element and related hacks, it would require multiple developers with a great deal of security and platform expertise to support enrollment across six platforms, five browsers, and a handful of non-browser applications — developers my shop just doesn’t have and never will. Any proposed replacement that isn’t cross-platform, requires a significant amount of browser-side programming, or doesn’t support non-browser applications, is a non-starter for us.
So what’s wrong with the status quo?
Given all this, you might ask why the browser people hate client certificates and want to get rid of the enrollment functionality. I have to believe a big part of it is simply that they don’t live in the sorts of enterprises that make extensive use of certificates. Most private-label CAs are implemented by Microsoft shops, and Microsoft supports these users with dedicated features and options in Windows Server and IE. But those shops are generally very large, have a significant investment in Microsoft-specific solutions, and are able to simply order their users to use IE for corporate business. As a result, there is probably less pressure on the other browser vendors (Apple, Mozilla, and Google) to get this stuff right, or even implement it at all, as witness the fact that Chrome never has. A few more specific objections also come up, most of them tied to how certificate enrollment works today (with <KEYGEN> and related hacks), and they are worth going over.
The UX is horrible
Well, whose fault is that? You’re the browser vendors, you control the whole UI, you can fix it.
It breaks the Same Origin Policy
This is a matter of dogma to the browser people approaching the status of a religion, and in the case of client certificates it misses the whole point of why we want client certificates in the first place. That being said, I can see no reason to object to implementing a mechanism that would restrict client certificates to a specific set of origins (either as an X.509v3 extension in the certificate, so it could be signed by the CA, or as part of the enrollment process), so long as the CA was free to say “any origin is OK by me”. This assumes, of course, that the principle of user choice is maintained: I should be able to choose, on a site by site basis, whether to present a certificate at all, and if so, which one. (And, unlike in all current browser implementations, I should be able to change my mind without restarting my browser session!)
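As a rough illustration of how such a restriction could combine with per-site user choice, consider the sketch below. Nothing in it exists in any browser or in X.509 today: `RestrictedCert`, the `ANY_ORIGIN` marker, and both functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass

ANY_ORIGIN = "*"  # hypothetical marker meaning "any origin is OK by me"


@dataclass(frozen=True)
class RestrictedCert:
    """Hypothetical certificate carrying a CA-approved set of origins."""
    subject: str
    allowed_origins: frozenset


def origin_permitted(cert, origin):
    """True if the CA allowed this certificate to be used at origin."""
    return ANY_ORIGIN in cert.allowed_origins or origin in cert.allowed_origins


def cert_to_present(certs, origin, user_choices):
    """Return the certificate to present to origin, or None for none.

    user_choices maps an origin to the subject the user picked for it.
    A missing or None entry means "present no certificate". Because it
    is ordinary mutable state, the user can change it at any time.
    """
    chosen_subject = user_choices.get(origin)
    for cert in certs:
        if cert.subject == chosen_subject and origin_permitted(cert, origin):
            return cert
    return None


certs = [
    RestrictedCert("alice (local)", frozenset({"https://intranet.example.org"})),
    RestrictedCert("alice (parent org)", frozenset({ANY_ORIGIN})),
]
choices = {"https://intranet.example.org": "alice (local)"}
print(cert_to_present(certs, "https://intranet.example.org", choices))
```

The point of keeping the user's choices in a plain mapping is that changing one's mind is just an update to that mapping, with no browser restart involved.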
It uses MD5
This is a very specific objection to the Netscape-originated signedPublicKeyAndChallenge object used in the <KEYGEN> protocol, and for some use cases it’s entirely legitimate. For a CA like mine, it’s totally irrelevant: the only thing I care about from the SPKAC is the public key itself, since I’ve received it over a secure channel from an authenticated user. The signature on the SPKAC object serves as proof that the user submitting the enrollment request is actually in possession of the private key corresponding to the public key. But you still have to actually have the private key in order to be able to authenticate to our other servers, so forging an SPKAC doesn’t buy you much (and leaves fingerprints behind in the CA logs). If you could forge an SPKAC corresponding to some other user’s public key, you would be able to pass off a document signed by that other user as your own, which is not trivial, but also not a significant threat in our environment where certificates are only issued to authenticated local users. For a public CA issuing certificates for code signing or privacy-enhanced mail, this is a serious issue.
The solution to this problem is pretty simple: get rid of the signedPublicKeyAndChallenge object and use a standard PKCS#10 certificate signing request instead — which is what Microsoft already does in IE. I’d argue that the way it should actually work is as follows:
- User authenticates to CA.
- CA generates an unsigned “proposed CSR” object, filled in with subject DN and extensions as the CA intends to issue the certificate (including origin restrictions as noted above).
- Browser presents a counterfeit-resistant dialog to the user indicating that they are about to request a certificate which will identify them as such-and-so and giving them the opportunity to refuse.
- Browser generates a keypair according to the CA’s stated requirements and signs the CSR, then submits the CSR to the CA via a form submission or JS callback.
- The CA responds, perhaps immediately, perhaps at some remove of time, with a signed certificate.
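The steps above can be modeled as a small state machine, which makes the one hard requirement explicit: nothing gets signed or submitted until the user has consented. This is only a sketch of the ordering, with no cryptography in it; the `Step` names, the `TRANSITIONS` table, and the `Enrollment` class are hypothetical and do not correspond to any existing browser or CA interface.

```python
from enum import Enum, auto


class Step(Enum):
    AUTHENTICATED = auto()   # user has authenticated to the CA
    CSR_PROPOSED = auto()    # CA sent its unsigned "proposed CSR"
    USER_CONSENTED = auto()  # user accepted the browser's dialog
    CSR_SUBMITTED = auto()   # browser generated a key, signed, submitted
    CERT_ISSUED = auto()     # CA returned the signed certificate
    REFUSED = auto()         # user declined; enrollment ends


# Which steps may follow which. Note that CSR_SUBMITTED is reachable
# only through USER_CONSENTED.
TRANSITIONS = {
    Step.AUTHENTICATED: {Step.CSR_PROPOSED},
    Step.CSR_PROPOSED: {Step.USER_CONSENTED, Step.REFUSED},
    Step.USER_CONSENTED: {Step.CSR_SUBMITTED},
    Step.CSR_SUBMITTED: {Step.CERT_ISSUED},
    Step.CERT_ISSUED: set(),
    Step.REFUSED: set(),
}


class Enrollment:
    """Tracks one enrollment attempt and rejects out-of-order steps."""

    def __init__(self):
        self.step = Step.AUTHENTICATED
        self.history = [self.step]

    def advance(self, next_step):
        if next_step not in TRANSITIONS[self.step]:
            raise ValueError(
                f"cannot go from {self.step.name} to {next_step.name}")
        self.step = next_step
        self.history.append(next_step)


# The happy path, in the order listed above.
enrollment = Enrollment()
for step in (Step.CSR_PROPOSED, Step.USER_CONSENTED,
             Step.CSR_SUBMITTED, Step.CERT_ISSUED):
    enrollment.advance(step)
print([s.name for s in enrollment.history])
```

The gap between CSR_SUBMITTED and CERT_ISSUED may be arbitrarily long, which matches a CA that responds "at some remove of time".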
Note that enterprise CAs will typically ignore most or all of the identity information contained in a PKCS#10 object in favor of out-of-band authentication (that’s what I did for old IE enrollment), but having a CSR signed by the user with those fields in it is valuable evidence of consent, which the signedPublicKeyAndChallenge does not offer.
The certificate installation media type is a horrible hack
Yup. I’d be happy to do it some other way, so long as the client-side programming required for it is (a) minimal and (b) standardized across all browsers. Currently only Firefox supports it properly (as might be expected since they inherited the code from Netscape that was the original implementation).
We have business requirements for private keys that <KEYGEN> doesn’t support
I hope I’ve managed to describe some of the motivation for wanting client certificates to continue to work, and described some appropriate low-impact solutions to the legitimate issues that they do raise.