We don’t need Unique Patient Identifiers, we need ‘proof of healthcare identity’ credentials

In my last post, I maintained that individual identifiers, such as a username or user ID, as used in Informational Identity Management – legal, transactional and digital identity management – are fundamentally a kind of credential.

With the advent of “…a standard unique identifier for an individual…”, commonly referred to as a Unique Patient Identifier, or UPI, looking more possible than it has since the fall of 1998, let’s look at individual identifiers more closely and see what we might learn to help us understand the value of UPIs.

First I need to make sure you are on board with the fundamental credential value of user names and user IDs. If you are not, let’s do a couple of thought experiments to persuade you.

You call a busy restaurant – say Atelier Crenn in San Francisco – and make a reservation under the name ‘Mark Twain’. When you arrive, just the name gets you access to the privilege of being seated for dinner.

Ok, not very persuasive. Try this. You go to log in to a website and are prompted to type in your username and password. Both – in our imaginary site – are masked as you type them in. You have to remember both of them to gain access. Once in, you can go to your profile and change either or both of them to new values – some sites let you choose a new username – our imaginary one does. For each, there is some rule applied to validate your new choice. You may be prompted to try a different value to pass the rule.

From your perspective, what is the difference between the username and password? You keep the password secret, you might say, but not the username. You could keep them both secret and not change their nature. The domain does not keep your username secret, you say. They could, and they would work the same way. The username doesn’t change, you say, so that if you lose your password you can get a new one associated with the same username. You can lose your username too. Sites generally make you change your password if lost, but ‘recover’ a lost username. Again, not logically necessary – you could keep the same password, or they could make you pick a new username. They usually don’t, for reasons we will touch on later. But they could.

Wait, you might say, you have been talking about usernames. What about user IDs?! They are just alike, except that user IDs are generally chosen by the domain, and usernames by the user. But if the user chooses, the domain ensures the uniqueness of the name. Persuaded, mary_jones_23@bigdomain.com?

We could go on. But I think I’ve made my point.

As I described in that last post, resource domain owners want to be able to track the behavior of individuals accessing domain resources. They want to advertise to the individuals. They want to bill the individuals. They want to slice and dice data about the individuals by common demographic attributes such as age, zipcode, and so on.

To achieve this, the credentials have to be unique. If one credential, unique. If more than one, unique in the aggregate. But unique in the aggregate can be achieved by having any subset of them be unique. And, practically speaking, it is easier for both the domain and the user to have a set of one be unique – as I said in my last post, to imbue that sole credential with the ‘pointing-ness’ function, the deictic function, of the identity record.

Why?

In digital domains, a unique username or ID helps the domain owner manage the user record. Most digital user records are stored in relational database tables, where the uniqueness of a column can be enforced with a database constraint rule. Since it is unique, it works as the primary key of that table. That makes it easy to associate other tables, such as user demographics, to it with foreign keys. Which helps you normalize your records about the individual.
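To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample values are my own illustrative assumptions, not any particular domain's schema. It shows the database itself rejecting a duplicate user ID, and rejecting a demographics row that points at no user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless asked for per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id TEXT PRIMARY KEY            -- uniqueness enforced by the database
    );
    CREATE TABLE demographics (
        user_id    TEXT NOT NULL REFERENCES users(user_id),  -- foreign key
        first_name TEXT,
        last_name  TEXT
    );
""")

conn.execute("INSERT INTO users VALUES ('mary_jones_23')")
conn.execute("INSERT INTO demographics VALUES ('mary_jones_23', 'Mary', 'Jones')")


def try_insert(sql: str) -> str:
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# A second user with the same ID violates the primary key.
duplicate_result = try_insert("INSERT INTO users VALUES ('mary_jones_23')")
# A demographics row for an unknown user violates the foreign key.
orphan_result = try_insert("INSERT INTO demographics VALUES ('nobody', 'No', 'Body')")
print(duplicate_result, orphan_result)
```

The point of the sketch is only that the uniqueness rule lives in the table definition, so the domain does not have to police it in application code.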

But once you have done that, it can be problematic to change that value: you have to cascade the key change across all the ‘child’ tables. That necessary cascade is one of the reasons artificial keys won out in the natural vs. artificial keys debate in the relational community in the 80s and 90s. User IDs are artificial keys.

So user IDs, once created, are, in many domains, never changed. (In others there may be an internal-use-only unique ID for the identity record, and the username may be a discrete column with a unique constraint.)
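Here is a rough sketch of that second arrangement, again with sqlite3 and again with made-up table and column names (the 'activity' table is purely hypothetical). The internal ID is the key that child tables point at; the username is just a column with a unique constraint, so changing it touches one row and nothing cascades.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,       -- internal-use-only artificial key
        username TEXT NOT NULL UNIQUE       -- user-facing, unique, changeable
    );
    CREATE TABLE activity (
        user_id INTEGER NOT NULL REFERENCES users(id),
        item    TEXT
    );
    INSERT INTO users (id, username) VALUES (1, 'mary_jones_23');
    INSERT INTO activity VALUES (1, 'viewed statement');
""")

# Renaming the user updates a single row; the child table is untouched
# because it references the internal id, not the username.
conn.execute("UPDATE users SET username = 'mjones' WHERE id = 1")

row = conn.execute("""
    SELECT u.username, a.item
    FROM activity a JOIN users u ON u.id = a.user_id
""").fetchone()
print(row)
```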

Whether digitally managed or not, user IDs can be issued sequentially from some starting value in some range of potential values. The domain just needs to keep a counter of the last number issued, and add one to it. This is known to be problematic, of course, if the range of potential values is not large enough to meet the eventual number of users. Issuing sequential numbers as user IDs is now commonly understood to introduce an additional security risk: if a nefarious agent discovers one, they can infer many more. So user IDs are today more frequently generated by combining data values from a number of domains which when used together achieve a statistically high likelihood of being unique, such as UUIDs. There is nothing preventing the domain from having both usernames and user IDs.
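A small sketch of the difference, using only the Python standard library. The starting value of 1000 and the function names are assumptions for illustration. Note that I use uuid4 here, which is built from random bits; the version 1 UUID is the one that combines a timestamp with a node identifier, closer to the 'combining data values' description above.

```python
import itertools
import uuid

# Sequential issuance: keep a counter of the last number issued and add one.
_counter = itertools.count(start=1000)


def next_sequential_id() -> int:
    return next(_counter)


def next_uuid_id() -> str:
    # Random (version 4) UUID; knowing one tells you nothing about the next.
    return str(uuid.uuid4())


issued = [next_sequential_id() for _ in range(3)]

# Someone who has seen the last issued ID can guess the next one exactly.
guess = issued[-1] + 1
actual_next = next_sequential_id()

uuid_ids = [next_uuid_id() for _ in range(3)]
print(issued, guess == actual_next)
print(uuid_ids)
```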

Having only one credential be necessarily unique also makes it easier to manage a user’s having lost one of the other credentials. If they remember the unique credential, the domain can find their identity record, which frequently gives them access to both demographic data and communication keys such as phone numbers, email addresses, and so on, which can be used in the lost password process to re-proof the individual before issuing new credentials.

We have spoken to the true credential nature of unique identifiers. We’ve seen why domains issue unique identifiers.

But in order to really understand unique identifiers, we also need to know what they mean. All words, all symbols have meaning – it is why we use them. What does a unique identifier mean? (I first spoke about this in a presentation on Identity and Interoperability I did at an identity pre-conference at WEDI National 2016.)

Answering that will require a brief excursion through semantics.

Does the word ‘wolf’ mean “a member of the set of all wolves”? Or does it mean “a wild animal with sharp teeth the better to eat you with”?

Is meaning in the world? Or in the mind? By ‘reference’, or by ‘sense’?

Though the academic debate continues to rage, ok, simmer, it is pretty clear there is not a single answer. The answer is different for different kinds of words.

The category of proper names has been particularly problematic in this regard. We can’t get their meaning by ‘sense’ – I never met Albert Einstein. But I know he existed. And we can’t get it by ‘reference’ – Archibald Leach and Cary Grant? Same guy.

Our understanding of the meaning of names was clarified in the 1960s by the philosophers Saul Kripke and Hilary Putnam (preceded, as it turned out – science scandal, anyone? – by Ruth Barcan Marcus.)

Proper names, according to Kripke, are ‘rigid designators’ – they are terms that refer to the same person in all possible worlds. (‘Possible worlds’ is a philosophy term with thorny debate baggage of its own. For now think of it like this. In the actual world, Trump is president. But had a few votes gone differently in the Rust Belt, Hillary might have been president. The Trump-is-President world and the Hillary-is-President world are two ‘possible worlds’. But in both of those worlds, and in every other ‘possible’ world, 3 x 9 = 27, and e^x is its own derivative.)

The reference of a name is set when that name is first assigned to or associated with that individual – usually by their parents. After that it remains associated by the chain of transmission of its use – by communication using that name referring to that individual.

So, Kripke holds, names are not descriptions. They are closer to ‘indexicals’. Indexicals, or ‘deictic’ terms, get their meaning by reference to a location in time or space from the perspective of one or more of the people using them in communication: ‘this’, ‘there’, ‘you’, ‘me’.

‘Marion Mitchell Morrison’ was that person to those who knew him directly or indirectly by that name. He was also that (same) person to those who knew him as ‘John Wayne.’

Unique identifiers are like proper names. ‘mary_jones_23’ means that person of all the persons with email privileges at bigdomain.com.

The domain ‘knows’ that that person logging in has this set of privileges. That may be all they know about them. They may also know that that person is named Mary Jones, and lives at 123 Elm Street, Smalltown USA, so they can send her statements. And by knowing she is that person, they can track her behavior in their domain.

But remember our Archibald Leach – Cary Grant example, perhaps more frequently referred to since Frege as the Morning Star vs. Evening Star problem (both are Venus).

If the resource domain wants to be sure that an individual is not only unique in the domain, but unique among all Informational Identities, they have to take additional measures.

There are two approaches. They can verify that the applicant does not already have a domain record. Or they can do a ‘fix up’ on the back end and operationally and/or analytically identify matching accounts and merge records as needed.

In order to verify the applicant does not already have a record, they need to be able to identity match the applicant at proofing, which means storing necessary and sufficient information with the record to bind the individual to their Informational Identity graph. The most straightforward way is the use of credential identifiers such as passport numbers. Demographics can be used if credential identifiers are not practical.
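As a sketch of the first approach, assume every applicant presents a passport at proofing and the domain stores the number. The class names, the normalization rule and the sample values below are all hypothetical. Registration looks up the stored credential identifier first, and only creates a record if none is found.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class IdentityRecord:
    user_id: str
    passport_number: str  # stored proofing credential


def normalize(passport_number: str) -> str:
    # Compare on one canonical form so 'ab123456' and 'AB 123456' match.
    return "".join(passport_number.split()).upper()


class Registry:
    def __init__(self) -> None:
        self._by_passport: Dict[str, IdentityRecord] = {}

    def register(self, user_id: str, passport_number: str) -> Tuple[IdentityRecord, bool]:
        """Return (record, created). An existing record wins over a new one."""
        key = normalize(passport_number)
        existing = self._by_passport.get(key)
        if existing is not None:
            return existing, False
        record = IdentityRecord(user_id, key)
        self._by_passport[key] = record
        return record, True


registry = Registry()
first, created_first = registry.register("mary_jones_23", "AB 123456")
# Same person, same passport, trying to register under a second user ID.
second, created_second = registry.register("mjones_1980", "ab123456")
print(created_first, created_second, second.user_id)
```

Falling back to demographics would replace the exact lookup with attribute matching, which is where things get fuzzy.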

There is another way to solve the unique individual at registration challenge. Usernames and user IDs are not the only unique identifiers used in Informational Identity management.

The property of being unique among the individuals with domain privileges can be achieved in two ways. The value can be unique within the specific resource domain. Or it can be unique in a larger domain that is known to be a proper superset of the resource domain.

Biometric credentials also serve as unique identifiers. The domain within which they are unique is the domain of all humans, a proper superset of any domain identities. They are also rigid designators.

With biometrics, with usernames, with user IDs, like with names, the vital moment is the christening. You need to make sure you are naming the right baby. If you believe you are storing the Prime Minister’s retina scan credential, but you are actually storing Ethan Hunt’s, your security will self-destruct in five seconds. Good luck.

But once you have done that, you can check the new biometric against those in your existing records to see if there is already a record for that individual.

That is all easier said than done, of course. Widely available biometrics are fingerprints and facial recognition. But in the overwhelming majority of implementations those records are kept local to the user device and not transmitted to the back end where they might be persisted.

Let’s look at a typical digital identity record and break it down into its constituent parts.

Typical Digital Identity Record (OK, maybe a passport number is not typical. Trying to make a point here : )
    User ID (Unique)
    Password (Hashed)
    First Name
    Last Name
    Gender
    Email Address
    Street Address
    Phone Number
    Passport Number
    Privileges
        { READ_ONLY | READ_WRITE | ADMIN }

Typical Identity Record, Functionally Parsed (Some attrs may appear more than once)
    Indexical (the minimum set of stored credentials providing uniqueness – ‘points’ to the user in the world)
        User ID (unique)
    Stored Credentials (required for authentication of user. unique in the aggregate. at least one credential must be present for tracking domain-unique individual behavior, communication, etc.)
        User ID (unique)
        Password (Hashed)
    Demographics (used for personalizing apps and communications, analytics segmentation, etc.)
        Preferred Username
        First Name
        Last Name
        Gender
    Communication Keys (used to communicate with or at users)
        Email Address
        Street Address
        Phone Number
    Stored Proofing Credentials (used to verify new unique individuals at registration – assumes here all candidates must provide passport at proofing)
        Passport Number
    Authorizations (privileges granted to the user after authentication. necessary unless all domain resources auth’ed.)
        Privileges
            { READ_ONLY | READ_WRITE | ADMIN }
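For readers who think in code, the functional breakdown can be sketched as nested Python dataclasses. This is only one possible shape; the class names, field names and sample values are my own, and the password hash is a placeholder string rather than real hashing.

```python
from dataclasses import dataclass
from enum import Enum


class Privilege(Enum):
    READ_ONLY = "READ_ONLY"
    READ_WRITE = "READ_WRITE"
    ADMIN = "ADMIN"


@dataclass
class StoredCredentials:
    user_id: str          # unique; also serves as the indexical
    password_hash: str    # never the clear-text password


@dataclass
class Demographics:
    preferred_username: str
    first_name: str
    last_name: str
    gender: str


@dataclass
class CommunicationKeys:
    email_address: str
    street_address: str
    phone_number: str


@dataclass
class ProofingCredentials:
    passport_number: str


@dataclass
class IdentityRecord:
    credentials: StoredCredentials
    demographics: Demographics
    communication_keys: CommunicationKeys
    proofing: ProofingCredentials
    privilege: Privilege

    @property
    def indexical(self) -> str:
        # The same attribute appears in more than one functional group.
        return self.credentials.user_id


record = IdentityRecord(
    credentials=StoredCredentials("mary_jones_23", "<hash placeholder>"),
    demographics=Demographics("Mary J", "Mary", "Jones", "F"),
    communication_keys=CommunicationKeys(
        "mary_jones_23@bigdomain.com", "123 Elm Street, Smalltown USA", "555-0100"
    ),
    proofing=ProofingCredentials("AB123456"),
    privilege=Privilege.READ_ONLY,
)
print(record.indexical, record.privilege.value)
```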

Having laid this groundwork, let’s look at HIPAA UPIs.

A HIPAA UPI is (will be) issued by a government agency or their proxy. It will be unique within a domain intended to be a proper superset of all of the discrete domains of healthcare – providers, payers, clearinghouses, HIPAA business associates, health 2.0 app builders, health telemetry vendors, others TBD.

The only way to confidently do that is to have it be unique among all existing Informational Identities: each unique Informational Identity gets only one, and that one is unique. Even if it is unique across an even larger domain, such as all humans, or unique in a domain purpose-built for uniqueness, such as UUIDs, we would want a 1 to 1 association with a discrete Informational Identity.

National Provider Identifiers hit the shoals on this. An individual doctor could get an NPI as an individual provider. But if they were incorporated as an LLC for billing, they could get another one. But then it was not always clear procedurally which one was called for by payers. Which sowed confusion. There was a lack of clarity in defining the domain of uniqueness for NPIs.

The individual applying for the UPI will need to be strongly proofed. The quality of that proofing will need to be demonstrably very high. Consistent credentials will have to be required at proofing, and their values persisted by the National UPI Enumerator for use in disambiguating humans at UPI registration. Otherwise, as described in my last post, it devolves to demographic attribute matching, which is problematic.

And once the UPI exists, it can then in turn be made a necessary ‘secondary value’ credential used at identity proofing when registering an individual user. And in turn – turtles going up – if the UPI is persisted as part of the user’s identity record, a simple search of existing user accounts at proofing will ensure with reasonable certainty that the individual human is unique within the domain. The healthcare domain will not have to ‘step down’ to the underlying demographics to do matching. And even without additional demographics this will enable unique human user-based analytics. Along with demographics and communication keys it will support richer unique-human based operations and analytics.

So how do you validate at proofing that a UPI actually belongs to the applicant, that it was bound to the Legal component of the applicant’s Informational Identity when it was issued to them?

‘Something or Someone You Know’ credentials are of three kinds: stored values (frequently secret ones), verifiable private knowledge, or a human ‘referee’, usually able to authoritatively demonstrate their identity. A UPI is none of those.

‘Something or Some Way You Are’ credentials are biometrics and behaviors. A UPI is neither of those.

‘Something You Have’ credentials are documents, dongles, keyfobs, keycards…a UPI is, in and of itself, again, none of the above.

So what kind of credential is it? If it is just a number you get issued, like a Social Security Number, it is a public knowledge credential. It has as much probative value as data that is on the dark web. How do you validate a public knowledge credential? You have to have some way of looking up the number and its associated demographics. Then you have to do attribute-based matching against attributes provided by the applicant.
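That lookup-then-match step can be sketched in a few lines. Everything here is hypothetical: the in-memory 'enumerator' table, the UPI format, the choice of attributes and the exact-match rule are assumptions for illustration, not a description of any real service.

```python
from typing import Dict, Tuple

# Stand-in for a lookup service holding each UPI and its demographics.
ENUMERATOR_RECORDS: Dict[str, Dict[str, str]] = {
    "UPI-0001": {"first_name": "Mary", "last_name": "Jones", "zipcode": "12345"},
}


def normalize(value: str) -> str:
    # Ignore case and stray whitespace when comparing attributes.
    return " ".join(value.split()).casefold()


def validate_upi(
    upi: str,
    claimed: Dict[str, str],
    required: Tuple[str, ...] = ("first_name", "last_name", "zipcode"),
) -> bool:
    """Look up the number, then match every required attribute."""
    on_file = ENUMERATOR_RECORDS.get(upi)
    if on_file is None:
        return False
    return all(
        normalize(on_file[attr]) == normalize(claimed.get(attr, ""))
        for attr in required
    )


applicant_ok = validate_upi(
    "UPI-0001", {"first_name": "mary", "last_name": "JONES ", "zipcode": "12345"}
)
misspelled = validate_upi(
    "UPI-0001", {"first_name": "Marie", "last_name": "Jones", "zipcode": "12345"}
)
unknown_upi = validate_upi(
    "UPI-9999", {"first_name": "Mary", "last_name": "Jones", "zipcode": "12345"}
)
print(applicant_ok, misspelled, unknown_upi)
```

Notice what the check actually proves: that the applicant knows the number and the demographics on file. Anyone else who knows them passes too, and a legitimate applicant with a misspelled name fails.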

Clearly once you have the UPI stored in your domain, you can use it for matching records with other domains who have stored UPIs. But how do you validate it before you write it down?


It really needs to be a ‘something you have’ credential containing a biometric that can be positively verified. The UPI needs to be a ‘proof of healthcare identity’ document with the probative value of a driver’s license or passport.

If it can’t be readily validated, it will not serve. And we need it – until we have such an identifier from a common identity superset across all healthcare domains, we will be forced to continue to resort to attribute-based patient matching, which will continue to be problematic. Stay tuned.
