Preserving unlinkability of accounts

Inspired by Kim Cameron’s most recent educational post on unlinkability, I am contributing another entry on the same topic. This post is concerned with the case where you, a “user” of services, have ongoing account relationships with different service providers. For the sake of simplicity, we’ll consider two such service providers here, SP1 and SP2.

In order to maintain account relations with you, both service providers need to “know” you. That is, they need to create account indices for you that are unique within their own domains. For example, SP1 knows you under your birth name “Robert T. Smith” and SP2 knows you as “Bob Smith”. Now, since there is a degree of correlation between the two names that you use with the service providers, there is a degree of linkability between your two accounts. Some of the attribute data in your accounts (i.e., account entries) may give away additional linking information. However, there is no guarantee for the two service providers that the two accounts actually correspond to the same person; absent an explicit guarantee to that effect by some trusted party, the two accounts may very well belong to different persons who happen to share a similar name. (More on this here.)

Enter (cross-domain) identity management. Suppose SP2 wants an identity claim from SP1 so that it can make a more informed decision when dealing with you. For example, it may want attribute information that SP1 holds in its account on you, such as your age or your good credit status. More generally, it may want attribute, authentication, or authorization statements (as defined in SAML). Clearly, the transfer of an attribute value itself between the two accounts will increase the linkability between the two accounts. The question we will consider today is:

Can we ensure that no linking information is conveyed beyond the value of the attribute?

To answer this question, we’ll start by investigating the typical user-centric transfer of the attribute information, depicted in the figure below. Here, the identity claim (containing “attribute information” requested by SP2) is sent by SP1 to SP2 via yourself, the user of the services. This “user-centric” data flow allows you to see the data, to direct it to the right account at SP2, and to consent to its release. To ensure the authenticity of the attribute information, SP1 must digitally authenticate the identity claim by digitally signing it (or perhaps by appending a MAC using a secret key shared by SP1 and SP2).

This approach does not meet our objective of minimizing the increase of linkability between your two accounts. In fact, it does the exact opposite: with a single transfer of attribute information, your two accounts can now be linked on the basis of a (universally unique) common identifier. That common identifier is the digital signature of SP1, which is seen by both SP1 and SP2. This signature is a string of hundreds of digits, such as (in hexadecimal) 76 12 5e 19 a5 36 e2 11 ea 14 45 b1 ba 12 e3 e2 d5 67 81 d1 1f bb 04 b1 cc 52 c2 e5 3e df 09 67 4f 07 52 70 36 f2 89 ec 98 09 bd 61 39 b1 52 07 48 9d 36 90 9c 7d de 61 61 3d 2b a5. (Technically, SP1 when digitally signing may issue either a signed message or a digital certificate; in the latter case, a user-generated public key is signed along with the attribute information itself.)
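To make the linking concrete, here is a minimal Python sketch of the MAC variant mentioned earlier. The shared key, the claim layout, and the per-claim serial number are all assumptions made for illustration; the point is only that the authentication tag SP1 attaches is an exact join key once SP2 has seen it too.

```python
import hashlib
import hmac
import secrets

# Hypothetical MAC key shared by SP1 and SP2.
SHARED_KEY = secrets.token_bytes(32)

def authenticate(claim: bytes) -> str:
    """Return the hex tag SP1 attaches to an identity claim."""
    return hmac.new(SHARED_KEY, claim, hashlib.sha256).hexdigest()

# SP1 issues a claim from the account it holds under your birth name.
# The serial is an assumption: it stands in for whatever makes each claim unique.
claim = b"credit_status=good;serial=" + secrets.token_hex(8).encode()
tag = authenticate(claim)
sp1_issued = {tag: "Robert T. Smith"}  # SP1's record of what it issued

# You forward claim and tag to SP2, which verifies and files them.
assert hmac.compare_digest(tag, authenticate(claim))
sp2_received = {tag: "Bob Smith"}

# If the two providers compare records, the tag joins the accounts exactly.
common_tags = sp1_issued.keys() & sp2_received.keys()
linked = [(sp1_issued[t], sp2_received[t]) for t in common_tags]
print(linked)  # [('Robert T. Smith', 'Bob Smith')]
```

The same join works with a real digital signature in place of the MAC tag, since both providers see the identical byte string.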

Consider, next, the alternative depicted in Figure 2 for sharing attribute information. In this approach, SP1 and SP2 communicate through a trusted third party (called the IdP in the figure). SP1 provides the identity claim to the IdP, which in turn provides it to SP2.

To meet the objective of minimizing the increase of linkability between your two accounts at SP1 and SP2, the IdP must see to it that no linkage-enabling information is transferred between SP1 and SP2 beyond the desired attribute information itself. As such, it may not forward any digital signature of SP1 on the attribute information. The best it can do is to re-sign the identity claim it receives from SP1 by discarding its signature and replacing it with a digital signature of its own. This approach is taken in the Liberty Alliance “circle of trust” model, and has also been suggested in the context of Trusted Computing and other efforts (under names such as “Privacy CA”). SP1 can even keep the IdP oblivious as to the attribute information that it transfers to SP2, by encrypting it in the identity claim under a key of SP2. Consider, however, the cost of this approach:

SP2 no longer gets SP1’s assurance as to the authenticity of the attribute information. Instead, it must completely trust the IdP. In particular, it must trust that insiders or hackers of the IdP did not modify the attribute information in the identity claim made by SP1.
The IdP learns in real time that SP1 and SP2 are communicating attribute information about one of the users they have in common.
The IdP must be relied on to be available.
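The first cost can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any deployed IdP: the two MAC keys stand in for the digital signatures of SP1 and the IdP, and the function names are hypothetical. Because SP2 only ever checks the IdP's tag, a claim altered inside the IdP verifies just as well as an honest one.

```python
import hashlib
import hmac

# Hypothetical keys: one shared by SP1 and the IdP, one by the IdP and SP2.
KEY_SP1_IDP = b"sp1-idp demo key"
KEY_IDP_SP2 = b"idp-sp2 demo key"

def mac(key: bytes, claim: bytes) -> bytes:
    return hmac.new(key, claim, hashlib.sha256).digest()

def idp_relay(claim: bytes, sp1_tag: bytes, tamper: bool = False) -> tuple[bytes, bytes]:
    """Check SP1's tag, discard it, and re-authenticate the claim for SP2."""
    if not hmac.compare_digest(sp1_tag, mac(KEY_SP1_IDP, claim)):
        raise ValueError("claim was not issued by SP1")
    if tamper:  # an insider or intruder at the IdP rewrites the attribute
        claim = claim.replace(b"bad", b"good")
    return claim, mac(KEY_IDP_SP2, claim)

def sp2_accepts(claim: bytes, idp_tag: bytes) -> bool:
    return hmac.compare_digest(idp_tag, mac(KEY_IDP_SP2, claim))

original = b"credit_status=bad"
sp1_tag = mac(KEY_SP1_IDP, original)

honest_claim, honest_tag = idp_relay(original, sp1_tag)
forged_claim, forged_tag = idp_relay(original, sp1_tag, tamper=True)

print(sp2_accepts(honest_claim, honest_tag))  # True
print(sp2_accepts(forged_claim, forged_tag))  # True: SP2 cannot tell the difference
print(forged_claim)                           # b'credit_status=good'
```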

While these may be perfectly acceptable for an SP2 that “controls” the IdP, by definition the IdP is supposed to act on your behalf, namely (1) to protect your privacy vis-a-vis the two service providers and (2) to direct the identity claim to your account at SP2. In essence, you expect the IdP to act as your “agent,” while SP2 expects it to act as its agent. Clearly, if the IdP is “owned” by SP2, then from your perspective as a user there is zero difference between this approach and the approach of the first figure. That is, the IdP must juggle competing interests that are hard (if not impossible) to reconcile: the security and privacy interests of SP2 on the one hand, and your own security and privacy interests on the other. In practice this would need to be done through legal contracts, insurance, assumption of liabilities, audits, and other “conventional” (non-technical) “trust establishment” techniques, which will remain far from perfect and are highly laborious and expensive. In short, the second approach represents a pyrrhic victory.

Fortunately, it is possible to achieve our privacy objective while at the same time achieving security, scalability, and autonomy for the service providers. The key is to use minimal disclosure tokens, as depicted in the next figure.

In this approach, SP1 issues the identity claim in the form of a minimal disclosure token. It contains the attribute data that SP2 needs to know, together with a digital signature of SP1 on that data (and, optionally, a user-generated public key). In contrast to the naive digital signing approach, however, SP1 never gets to see its own digital signature, since it is in effect randomized (“blinded”) by the user (i.e., by software that runs on your own computer that, as far as SP1 and SP2 are concerned, may be fully under your control). In case a user-generated public key is included in the minimal disclosure token, that key will remain invisible as well to SP1. More technically, SP1’s signature and your optional public key are generated uniformly at random from their respective probability spaces, while remaining unknown to SP1. As a result, when you yourself present the identity claim to SP2, no information whatsoever (other than the intended attribute information itself) is conveyed to SP2 that SP1 and SP2 can use to link their accounts on you.
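The blinding idea can be illustrated with a toy Python sketch. Note the substitution: this is textbook RSA blinding in the style of Chaum, not the minimal disclosure token protocol itself, and it differs in an important way, since here SP1 does not see the signed message at all, whereas with minimal disclosure tokens SP1 certifies attribute data it knows. The key size, names, and token contents are assumptions chosen to keep the example short; nothing here is secure for real use.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key for SP1 built from two Mersenne primes; far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # SP1's private exponent (needs Python 3.8+)

def digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sp1_sign(blinded: int) -> int:
    """SP1 signs whatever it is handed; it sees only the blinded value."""
    return pow(blinded, D, N)

def verify(message: bytes, signature: int) -> bool:
    return pow(signature, E, N) == digest(message)

# User side: blind the token digest with a random factor r.
token = b"credit_status=good"
m = digest(token)
while True:
    r = secrets.randbelow(N - 2) + 2
    if gcd(r, N) == 1:
        break
blinded = (m * pow(r, E, N)) % N

blinded_signature = sp1_sign(blinded)                # all that SP1 ever sees
signature = (blinded_signature * pow(r, -1, N)) % N  # unblinded by the user

# SP2 verifies against SP1's public key (N, E).
print(verify(token, signature))  # True
```

SP1's view consists of `blinded` and `blinded_signature`, while SP2's view consists of `token` and `signature`; the random factor `r` that relates the two never leaves the user's software.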

There is one issue that I have not yet addressed: correlation through timing analysis. If the identity claim that is issued by SP1 is “transient”, meaning that it must be shown immediately (or at least in the same “protected” online session) to SP2, then SP1 and SP2 may be able to link their accounts on you through a simple timing analysis (as opposed to data flow analysis). To thwart timing analysis, the identity claim should be issued in the form of a “long-lived” identity claim. While the leading industry proposals for user-centric identity management currently provide no or only poor support for long-lived protected identity claims, nothing prevents such support, provided the following issues are taken into consideration:

Since issuance of the identity claim takes place prior to its presentation, how does the user obtain an identity claim ahead of time that corresponds to SP2’s request? One approach is for SP1 to issue a multitude of statements; however, this can rapidly become inefficient, especially when the set of attribute types that SP2 may be interested in is large. On the other hand, lumping lots of attribute information into the same protected identity claim leads to the problem that all of it would need to be released to SP2 when using conventional signing techniques. This is where selective disclosure comes to the rescue: SP1 can issue one or a few minimal disclosure tokens that contain potentially all attribute information in its account on you (at a minimum, attribute information that may be of interest to other SPs); at any later time, you can then cryptographically prove the minimum attribute property to SP2 about the attribute information in your protected identity claim, without being able to corrupt the integrity of SP1’s signature.
To prevent a replay attack by SP2, SP1 should issue the minimal disclosure token in the form of a minimal disclosure certificate, rather than as a minimal disclosure “signed message.” That is, a user-generated public key should be signed along by SP1, and at presentation time SP2 should ask the presenter of an identity claim to sign a nonce using the private key that corresponds to the token’s public key. As mentioned, SP1 never learns any information on the user’s keys, so that our unlinkability preservation objective is still met.
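The nonce check in the second point can be sketched as follows. This is a toy illustration: textbook RSA with a tiny key stands in for whatever signature scheme the token's public key actually uses, and the function names are hypothetical. It shows why a party that merely recorded an earlier presentation cannot answer a fresh challenge.

```python
import hashlib
import secrets

# Toy RSA key pair generated by the user; the public half (N, E) plays the
# role of the user-generated key certified inside the token. Not secure.
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private key, known only to the user

def digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def presenter_sign(nonce: bytes) -> int:
    """The legitimate presenter signs the verifier's nonce."""
    return pow(digest(nonce), D, N)

def verifier_check(nonce: bytes, response: int) -> bool:
    """The verifier checks the response against the token's public key."""
    return pow(response, E, N) == digest(nonce)

# First presentation: SP2 picks a fresh nonce and the user answers it.
nonce_1 = secrets.token_bytes(16)
response_1 = presenter_sign(nonce_1)
print(verifier_check(nonce_1, response_1))  # True

# Replay attempt: SP2 shows the recorded token and response to another
# verifier, which issues its own nonce. The old response does not match.
nonce_2 = secrets.token_bytes(16)
print(verifier_check(nonce_2, response_1))  # False
```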

Note that nothing prevents you from encrypting your long-lived identity claims under a key of your own (e.g., one derived from a quality password) and storing the encrypted blob remotely for on-demand retrieval.
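As a rough sketch of the key-derivation half of that idea, the snippet below stretches a password into a 256-bit key with PBKDF2 from Python's standard library. The iteration count, salt length, and sample password are assumptions; the encryption of the blob itself would be done with an authenticated cipher from a cryptographic library and is not shown.

```python
import hashlib
import secrets

def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a password into a 32-byte key using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )

# The salt is random but not secret; it is stored next to the encrypted blob
# so that the same key can be re-derived at retrieval time.
salt = secrets.token_bytes(16)
key = derive_key("a long passphrase of your own", salt)
print(len(key))  # 32
```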

This solution also shows that there is an affirmative answer to the often-asked question: is there a business case for minimal disclosure tokens? In environments where the degree of unlinkability between accounts in different domains must be preserved (whether for regulatory, competitive, or security reasons), the solution of figure 3 provides tremendous cost savings over that of figure 2, some directly measurable (cost of setting up a mutually trusted third party), others less tangible but measurable nonetheless (eliminating risks for SP2 and the business cost of establishing relationships between the IdP and the SPs and the user). (There are lots of other business cases for minimal disclosure tokens, in fact, and I plan to get into some of these in future blog posts.)

Note that nothing prevents SP1 and SP2 from establishing a common identifier in the minimal disclosure approach: SP1 would simply encode that into the issued token in the form of attribute information. (With selective disclosure, you could then be given the option to hide or disclose that to SP2.)

That concludes today’s contribution.

One important question of an entirely different nature that I did not get into today remains: if SP2 needs a “good credit status” statement, say, on you from SP1, but you have a bad credit status, what prevents you from asking a friend to help you out? I’ll address this transferability problem in my next blog post; as we’ll see, minimal disclosure tokens provide unique benefits in this regard as well.

June 27, 2007 - Posted by Stefan Brands | General

The Identity Corner