There is a move afoot among Web browser developers to remove an authentication mechanism that many enterprises depend on: SSL/TLS with X.509 client certificates. Client certificate support, along with related functionality for enrollment of clients, was first implemented in Netscape 4 (it’s that old), but since browser developers don’t work for big enterprises (leaving IE aside for the moment) they never exercise this functionality — and as a result it is frequently broken and the UX can be pretty horrible. Nonetheless, it’s very important functionality for a lot of enterprises, and I’ve been asked to write a bit about my experience. I have a unique perspective: I work at one of the host institutions for the World Wide Web Consortium, and not only do I operate an enterprise certificate authority, I have actually written an enterprise certificate authority that is in daily production use.
Some legitimate complaints
At work, we depend extensively on client certificates for authentication, and we consider this functionality extremely important, for reasons I’ll get into in detail below. Nonetheless, we receive numerous complaints from users about the use of certificates, and I wanted to be up-front about some of these complaints and explain the legitimate reasons for them. Here are the biggest complaints we receive:
It doesn’t work in $BROWSER!
This has always been true for IE, but almost nobody cares about IE any more — we have very few Windows users. With a great deal of effort, I made it work in IE, once, until Microsoft broke the only documented interface for doing certificate enrollment with the release of Windows XP SP2 (which gives you some idea how long ago this was). Microsoft has some specific requirements for certificate enrollment, driven by the legitimate needs of their enterprise customers, which made it impossible to use the enrollment mechanism as Netscape originally specified it — but their alternative required a large amount of proprietary browser-side programming, and of course makes the usual assumption of Web people that you have a team of twenty full-time devs working on every site so it’s OK to flash-cut programming interfaces with no transition period, warning, or even conversion documentation.
Today, we usually hear this complaint with regard to Google Chrome. Chrome uses the “native” certificate store for each platform it runs on, but the Chrome developers never fully implemented the enrollment protocol, so it can’t parse what it gets back from our CA, and it doesn’t explain why it didn’t work (I suppose “WCBA to implement the full standard and we’re Google so we don’t have to follow standards” is a bit much to ask for an error message but wouldn’t some honesty be nice?).
It works on $SITE_X but it doesn’t work on $SITE_Y!
There are several ways that this can happen, one of which is a clear failure on our part, but the others are just bad browser implementation. We screw this up as a result of not having fully migrated all of our services to our modern configuration management system, so some Web servers are simply misconfigured (don’t have the proper certificate chain, have an expired copy of the Client CA certificate, or do other things that just don’t work reliably).
The browser people, on the other hand, typically break this by doing something evil: browsers now love to go off and fetch random URLs they happen to have seen at times when users aren’t expecting them, causing the client-certificate-selection dialog to pop up when the user is doing something totally unrelated. The confused user will often cancel out of the dialog, or select a certificate related to the thing they actually are trying to do, which the browser will “helpfully” cache for the rest of the browser session — and certificate selection state can only be cleared by exiting the browser! So (to give an example we recently discovered), a user who goes to our main Web site might look in the “Resources” menu for a link to a staff-only resource that is certificate protected. If that user is using a recent version of Firefox, merely hovering over one of these links will cause the browser to open a connection to the server (which is evil and vile and wrong for a whole bunch of reasons, not least of them privacy and security), which will cause a certificate-selection dialog to pop up for no apparent reason. If the user hits “cancel”, and they have the “remember this choice” checkbox selected (the alternative, “ask every time”, is even more horrible), they will not be able to access that service using their certified identity for the life of the browser session. (Many of our sites allow guest access, so they may be able to see the site but find themselves unable to do anything. But even on sites that absolutely require a certificate, Firefox will never present the certificate selector again, no matter what error is returned by the server, so the user is just stuck.)
It worked yesterday but now it doesn’t!
As something of a corollary to the previous problem, browsers tend to deal horribly with the fact that certificates expire, and not all that well with the fact that clients may have multiple identities attested by certificates from different authorities. Our experience (again, mostly in Firefox) is that browsers will cheerfully allow users to select certificates that are flat-out invalid, and then fail, even when the server clearly indicates that this isn’t going to work. Some other browsers (not Firefox in this case) ignore the list of acceptable client CAs sent by the server and allow the client to select a certificate that the server cannot possibly validate, or that, if the server can validate it, contains no useful information about the user’s identity. (I’m looking at you, Safari!) It is perfectly legitimate to hang on to expired certificates (and their corresponding private keys), particularly if they are marked (as ours are) as being valid for use in email encryption, but users should never be invited to initiate a new communication, whether HTTPS or email, using an expired certificate.
We have this issue perhaps more than some other enterprises because our users have two distinct identities, each of which has an associated client certificate, and these certificates are issued by different CAs and expire at different times. (Our parent organization issues certs that expire annually on July 31; our CA — the one that I wrote — issues certs that expire 365 days after issue, or when the user’s account expires, whichever comes first.) We could and probably should fix this, by aligning our certificate (and account) expiration policies with our parent’s, but it does highlight the confusion caused by poor browser implementations.
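To make the complaint concrete, here is a minimal sketch of the filtering I would expect a browser to do before showing the selection dialog. It is not any browser's actual code: the `ClientCert` record and the `selectable_certs` function are hypothetical names invented for illustration, and the certificates are plain data rather than parsed X.509 objects.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ClientCert:
    """Hypothetical stand-in for one certificate in the browser's store."""
    subject: str
    issuer: str            # issuer distinguished name
    not_before: datetime   # start of the validity period
    not_after: datetime    # end of the validity period


def selectable_certs(store, acceptable_issuers, now=None):
    """Return only the certificates worth offering to the user.

    A certificate is offered when it is inside its validity period and its
    issuer appears in the server's list of acceptable CAs. An empty list is
    treated as "no restriction", which is how TLS 1.2 describes an empty
    certificate_authorities field.
    """
    now = now or datetime.now(timezone.utc)
    return [
        cert for cert in store
        if cert.not_before <= now <= cert.not_after
        and (not acceptable_issuers or cert.issuer in acceptable_issuers)
    ]
```

Expired certificates stay in the store, so they remain available for decrypting old email; they just never show up as a choice for a new connection.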
So why do I think certificates are a good thing?
With all these problems, you can imagine that there’s a lot of pressure to stop using client certificates for authentication. Nonetheless, we continue to use them, and if the browser vendors can be persuaded not to break them, we will probably keep on doing so for some time in the future. We do have reasons for this, which I expect a lot of VC-fueled Web developers probably won’t understand (or will chuckle and say “oh, I remember when people used to do that…”). But client certificates really do solve real problems for us, for which there simply is no alternative.
Phishing
Our users are subject to many phishing and related social-engineering attacks, not all of which can be detected or prevented by our email system. Client certificates are supposed to be impractical to forge, and the browser-based UX for certificate enrollment, although it sucks mightily in many ways, is very difficult to counterfeit without directly compromising the browser itself. We regularly and repeatedly emphasize to our users that they should never under any circumstances give their password to any Web site, or store it in any sort of persistent storage, except when requesting a client certificate from our CA. We don’t even implement a password-change Web page, as our parent organization does: if you forgot your password, you’re going to have to bring a photo ID to the helpdesk to change it. If the users do as we ask (which of course we can’t ever guarantee), the CA is the only Web server that will ever have access to their actual login passwords, and since access to the CA is carefully controlled, the possibility of compromise is limited. Thus far we have never seen a phishing attack, not even a spearphishing attack, that actually walked the user through exporting their private key from the certificate store and transmitting it to an attacker. (Not to suggest that such things couldn’t happen, but even our least-sophisticated users would likely realize something was up. This is also the threat model which hardware security modules and smart cards were designed to address.)
Shared Web servers
The vast majority of our Web content is served directly from shared network filesystems by general-purpose Web servers running Apache. Users need to limit access to some of this content, and it is simply not safe for these servers to allow users to use their normal passwords for authentication. The combination of Apache .htaccess files and client certificates allows us to give all our users individual control over who is able to access their internal-use content, without expecting them to securely set up and manage password-based authentication — even if we didn’t know full well that most of them don’t have that sort of expertise. (We have a difficult enough time when they install third-party software that expects them to do this; our officially supported Web server platform shouldn’t lead them down a path we know to be a serious problem.) Many alternative solutions assume that Web servers run a single application with a unitary model of access control, such that URL-space can be neatly partitioned into “public” and “private” in a way that simply doesn’t fit this use case at all.
Offline, third-party verification
Unlike nearly every alternative that’s been proposed, X.509 certificates by their very nature allow offline verification. A Relying Party can validate a certificate without revealing to the Identity Provider that it is doing so, or that any particular user is being authenticated. This is an important and powerful privacy protection — even in the case, as in our organization, where our CPS explicitly says “if you aren’t part of our organization you shouldn’t be relying on our certificates”. Nobody needs my consent, or even my cooperation, to validate the certificates I issue: all they need is a copy of my CA’s certificate, which they can easily get from the CA itself or from anyone who already holds a client cert. (There are some legitimate issues surrounding certificate revocation which, depending on your security requirements, may require online checking — to preserve privacy, browsers could implement OCSP stapling for client certificates. We don’t implement OCSP, and while we do publish CRLs for all of our CAs, I’m not aware of any RPs, internal or otherwise, who actually consult them. In general, certificate validity is not a substitute for authorization checks, although many of our services are open to all authenticated users.)
User control of authentication identity
As I mentioned above, users can have multiple identities, and the browsers do a (just barely) serviceable job at letting them choose which identity they want to present for any particular service. This allows a Relying Party (including some of our servers) to authenticate users as members of our organization, or as members of our parent institution, depending on what certificates they choose. More generally, there is no assumption that the user has a single, unitary identity that will be used across every service (both Web and non-Web) that they need to authenticate to; they can choose the identity that is appropriate for the action they wish to undertake — just as they can with traditional username/password authentication. Furthermore, the certificate-selection dialog, at least as it exists now, is uncounterfeitable — there is no way for malicious JavaScript to trick the user into providing a certificate to a Web site without being aware of it. (This was a bug in Netscape and old versions of Firefox, which defaulted to “select a certificate automatically”, allowing citizens of some countries to be tricked into revealing their national-identity certificates to third parties.)
It already works
My entire enterprise CA, including revocation, enrollment, and CRL management, is implemented in less than 4,000 lines of Ruby code. This was only possible because the important parts — the basic X.509 functionality, SSL/TLS, and certificate enrollment — are implemented in established software systems with stable programming interfaces. Certificate enrollment, in particular, is a minefield: absent the Netscape <KEYGEN> element and related hacks, it would require multiple developers with a great deal of security and platform expertise to support enrollment across six platforms, five browsers, and a handful of non-browser applications — developers my shop just doesn’t have and never will. Any proposed replacement that isn’t cross-platform, requires a significant amount of browser-side programming, or doesn’t support non-browser applications, is a non-starter for us.
So what’s wrong with the status quo?
Given all this, you might ask why the browser people hate client certificates and want to get rid of the enrollment functionality. I have to believe a big part of it is simply that they don’t live in the sorts of enterprises that make extensive use of certificates. Most private-label CAs are implemented by Microsoft shops, and Microsoft supports these users with dedicated features and options in Windows Server and IE — but those shops are generally very large, have a significant investment in Microsoft-specific solutions, and are able to simply order their users to use IE for corporate business. As a result, there is probably less pressure on the other browser vendors (Apple, Mozilla, and Google) to get this stuff right, or even implement it at all — as witness the fact that Chrome never has. There are a few more specific objections that are brought up, which are for the most part specific to how certificate enrollment works today (with <KEYGEN> and related hacks), which are worth going over.
The UX is horrible
Well, whose fault is that? You’re the browser vendors, you control the whole UI, you can fix it.
It breaks the Same Origin Policy
This is a matter of dogma to the browser people approaching the status of a religion, and in the case of client certificates it misses the whole point of why we want client certificates in the first place. That being said, I can see no reason to object to implementing a mechanism that would restrict client certificates to a specific set of origins (either as an X.509v3 extension in the certificate, so it could be signed by the CA, or as part of the enrollment process), so long as the CA was free to say “any origin is OK by me”. This assumes, of course, that the principle of user choice is maintained: I should be able to choose, on a site by site basis, whether to present a certificate at all, and if so, which one. (And, unlike in all current browser implementations, I should be able to change my mind without restarting my browser session!)
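As a rough illustration of what such an origin restriction might look like on the browser side, here is a short sketch. No such X.509 extension exists today as far as I know, so everything here is an assumption: the `ANY_ORIGIN` marker, the `origin_of` helper, and the `may_offer` check are hypothetical names, and the allowed-origin set stands in for whatever the CA would sign into the certificate.

```python
from urllib.parse import urlsplit

ANY_ORIGIN = "*"  # hypothetical marker for "any origin is OK by me"

DEFAULT_PORTS = {"https": 443, "http": 80}


def origin_of(url):
    """Reduce a URL to a scheme://host:port origin string."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return f"{parts.scheme}://{parts.hostname}:{port}"


def may_offer(allowed_origins, url):
    """Decide whether a certificate may even be listed for this URL.

    This only narrows the list; the user still chooses whether to present
    a certificate at all, and which one.
    """
    if ANY_ORIGIN in allowed_origins:
        return True
    return origin_of(url) in {origin_of(o) for o in allowed_origins}
```

A CA like mine would simply issue every certificate with the wildcard, while a CA with stricter requirements could list the handful of origins it cares about.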
It uses MD5
This is a very specific objection to the Netscape-originated signedPublicKeyAndChallenge object used in the <KEYGEN> protocol, and for some use cases it’s entirely legitimate. For a CA like mine, it’s totally irrelevant: the only thing I care about from the SPKAC is the public key itself, since I’ve received it over a secure channel from an authenticated user. The signature on the SPKAC object serves as proof that the user submitting the enrollment request is actually in possession of the private key corresponding to the public key. But you still have to actually have the private key in order to be able to authenticate to our other servers, so forging an SPKAC doesn’t buy you much (and leaves fingerprints behind in the CA logs). If you could forge an SPKAC corresponding to some other user’s public key, you would be able to pass off a document signed by that other user as your own, which is not trivial, but also not a significant threat in our environment where certificates are only issued to authenticated local users. For a public CA issuing certificates for code signing or privacy-enhanced mail, this is a serious issue.
The solution to this problem is pretty simple: get rid of the signedPublicKeyAndChallenge object and use a standard PKCS#10 certificate signing request instead — which is what Microsoft already does in IE. I’d argue that the way it should actually work is as follows:
- User authenticates to CA.
- CA generates an unsigned “proposed CSR” object, filled in with subject DN and extensions as the CA intends to issue the certificate (including origin restrictions as noted above).
- Browser presents a counterfeit-resistant dialog to the user indicating that they are about to request a certificate which will identify them as such-and-so and giving them the opportunity to refuse.
- Browser generates a keypair according to the CA’s stated requirements and signs the CSR, then submits the CSR to the CA via a form submission or JS callback.
- The CA responds, perhaps immediately, perhaps at some remove of time, with a signed certificate.
Note that enterprise CAs will typically ignore most or all of the identity information contained in a PKCS#10 object in favor of out-of-band authentication (that’s what I did for old IE enrollment), but having a CSR signed by the user with those fields in it is valuable evidence of consent, which the signedPublicKeyAndChallenge does not offer.
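The browser's part of the flow proposed above (the consent dialog, key generation, and signing) can be sketched as follows. This models only the order of operations, not real PKCS#10 encoding or real cryptography: `ProposedCSR`, `SignedCSR`, and `browser_enroll` are hypothetical names, and the dialog, key generator, and signer are passed in as plain functions so the sketch stays self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedCSR:
    """Unsigned request template filled in by the CA (hypothetical)."""
    subject_dn: str
    extensions: dict         # e.g. origin restrictions
    key_requirements: dict   # the CA's stated key policy


@dataclass(frozen=True)
class SignedCSR:
    """What the browser sends back to the CA (hypothetical)."""
    proposal: ProposedCSR
    public_key: bytes
    signature: bytes


def browser_enroll(proposal, ask_user, generate_keypair, sign):
    """Ask for consent first; only then generate a key and sign.

    Returns None if the user refuses, in which case no key is generated
    and nothing is submitted to the CA.
    """
    prompt = f"Request a certificate identifying you as {proposal.subject_dn}?"
    if not ask_user(prompt):
        return None
    private_key, public_key = generate_keypair(proposal.key_requirements)
    # Stand-in for the real to-be-signed encoding of subject, extensions, key.
    to_be_signed = repr(
        (proposal.subject_dn, sorted(proposal.extensions.items()), public_key)
    ).encode()
    return SignedCSR(proposal, public_key, sign(private_key, to_be_signed))
```

The point of the ordering is that the signature covers the subject and extensions the user was shown, which is what makes the signed request usable as evidence of consent.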
The certificate installation media type is a horrible hack
Yup. I’d be happy to do it some other way, so long as the client-side programming required for it is (a) minimal and (b) standardized across all browsers. Currently only Firefox supports it properly (as might be expected since they inherited the code from Netscape that was the original implementation).
We have business requirements for private keys that <KEYGEN> doesn’t support
This is the Microsoft argument, and it’s a legitimate one: one particular common requirement is that private keys be generated inside a smart card, TPM, or similar tamper-resistant device such that they cannot be exported from the device. <KEYGEN> has no way to express that, and that’s why IE never implemented it. There are also policies related to public key length (many CAs now require 2048-bit keys, but it’s impossible to tell the browser what length to generate), not to mention which algorithms to use. (I want my ECDSA!) It is probably impractical to capture all of the possible policies that an enterprise CA might want to enforce, but I believe most of them can be implemented with essentially two mechanisms: first, a way to require non-exportable keys; and second, a JavaScript callback that can filter a set of (implementation, cryptosystem, key-length) tuples prior to key generation. It may also be necessary to allow the use of keys which were previously loaded into a device rather than always generating new ones (e.g., if generating keys centrally for corporate key escrow). An enterprise CA might, for example, require non-exportable keys generated by a specific PKCS#11 smart-card module using the built-in PKI support on a company-issued Yubikey NEO.
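Here is a rough sketch of those two mechanisms working together. In a browser the filter would be a JavaScript callback; it is written in Python here purely for illustration. The `KeyOption` tuple, `choose_key_options`, and `example_policy` are hypothetical names, and the `exportable` flag is my own addition to represent whether a given implementation can produce non-exportable keys.

```python
from typing import NamedTuple


class KeyOption(NamedTuple):
    """One (implementation, cryptosystem, key-length) choice (hypothetical)."""
    implementation: str   # e.g. the name of a PKCS#11 module
    cryptosystem: str     # e.g. "RSA" or "ECDSA"
    key_length: int       # in bits
    exportable: bool      # can the private key leave the device?


def choose_key_options(available, policy, require_non_exportable=False):
    """Apply the non-exportable requirement, then the CA's filter callback."""
    return [
        opt for opt in available
        if not (require_non_exportable and opt.exportable)
        and policy(opt)
    ]


def example_policy(opt):
    """Example callback: RSA of at least 2048 bits, or ECDSA of at least 256."""
    if opt.cryptosystem == "RSA":
        return opt.key_length >= 2048
    if opt.cryptosystem == "ECDSA":
        return opt.key_length >= 256
    return False
```

The browser would generate the key using one of the surviving options; if the list comes back empty, enrollment fails up front with a reason the user can actually be told.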
Conclusions
I hope I’ve managed to describe some of the motivation for wanting client certificates to continue to work, and described some appropriate low-impact solutions to the legitimate issues that they do raise.
Update (2015-10-12)
Happy Canadian Thanksgiving! I see from the stats that this post got referenced on Hacker News, and most of the comments seem to be saying “Oh, simple, $HUGE_PILE_OF_CLIENT_SIDE_JAVASCRIPT will fix that.” You go build the necessary support for Apache to handle that on 100% static Web sites with .htaccess-based authorization and I’ll consider that a workable replacement. It isn’t today, and I’d be surprised if it’s workable even in two years’ time (given our server upgrade schedule).
Coincidentally, just this morning I got annoyed enough by the client certificate prompt on hover in Firefox to look up the bug report. It’s https://bugzilla.mozilla.org/show_bug.cgi?id=910207.
Unfortunately, the Firefox devs simply can’t seem to comprehend that hovering over a link does not equal consent to leave log entries on some random server.
Not just client certificates, we (in Europe) need support for digital signatures (XML DSig based).
There’s a big problem for dealing with gov sites and even some banks which require non-standard native (or messy Java) components for this. It’s non portable and also has security risks.
You expect a lot. Browser developers can’t even get DNS right and are still fiddling with HTTP, HTML, JS and CSS. I really wouldn’t expect web crypto and authentication to be anything but developmental for years to come. Baby steps, they’ve only been at it for a couple of decades.
All I’m really asking is “Don’t break shit just because you don’t personally use it, kplsthx!” That didn’t use to be too much to ask!
It’s not just corporates: recently we deployed Client Certificate authentication as a Single Sign-On system for the Debian community, and it solved way more problems than it created.
Just last night I summarised the situation here: https://lists.debian.org/debian-devel/2015/10/msg00134.html
Let’s not forget an even bigger reason: It is moderately easy to integrate secure tokens into the server security without lots of custom hacks. A USB CCID token with RSA support costs somewhere between 20USD and 50USD, depending on the specific model and size of purchase. Possibly even cheaper. It is easily supported by almost all operating systems. This doesn’t excuse the abysmal UX in many web browsers though. So I fully agree: there is a nice security feature. It can be integrated into an existing deployment with little cost and without having to modify applications unless you want to do additional verifications. The only participant that is a huge PITA is the web browser. Mozilla, I am looking at you!