The use of biometrics as identifiers in public and private applications, and even for personal use, is ever increasing, bringing about challenges regarding storage and duplication. In addition, multiple biometrics for the same individual may be registered against different applications, with slightly different user profiles being recorded in each case. As the biometric sea grows wider and deeper, the situation becomes more profound, as Julian Ashbourn explains.

The biometric sea refers to a condition arising from the ever increasing use of biometrics as identifiers in public applications such as border control, licensing and identity documentation, aligned with an increasing usage in private applications, or even for personal use. The net result of this proliferation of usage is an equal proliferation of stored biometrics and associated records. Furthermore, this storage is not generally centralised but distributed, and often duplicated, according to application. In addition, multiple biometrics for the same individual may be registered against different applications, with slightly different user profiles being recorded in each case.

Numerous identity profiles
In simple terms, the greater the number of biometrics on record, the greater the likelihood of matching errors. This is especially the case where the reference biometric has been captured by different agencies, using slightly different techniques and, almost certainly, different standards of quality checking. With respect to large scale applications, applying the same matching criteria to such a disparate collection of biometrics (assuming they have been gathered from other applications or documentation) may produce some interesting, and potentially misleading, results.
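The scaling effect described above can be illustrated with a deliberately simplified model. The sketch below assumes that every comparison is independent and shares a single false match rate, which a real, mixed-quality collection is unlikely to satisfy. The function name and parameters are illustrative and are not drawn from any particular system.

```python
def prob_any_false_match(false_match_rate: float, gallery_size: int) -> float:
    """Chance of at least one false match when one probe is compared
    against `gallery_size` stored references.

    Simplifying assumptions (not from the article): comparisons are
    independent and each has the same false match rate.
    """
    if not 0.0 <= false_match_rate <= 1.0:
        raise ValueError("false_match_rate must lie between 0 and 1")
    if gallery_size < 0:
        raise ValueError("gallery_size must not be negative")
    # Probability that every single comparison avoids a false match,
    # subtracted from 1 to give the chance that at least one occurs.
    return 1.0 - (1.0 - false_match_rate) ** gallery_size
```

Under these assumptions, a false match rate of one in a million, searched against one million references, gives roughly a 63% chance of at least one false match. Varying capture quality across agencies would be expected to move the real figure, which is precisely the concern raised here.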
As the biometric sea grows wider and deeper, this situation becomes more profound. The answer, as many will suggest, lies in correlation with other, non-biometric data in order to ensure alignment with the correct profiles. However, similar issues exist with non-biometric data in that they have usually been collected by various agencies, each with its own idea of what information is important. Consequently, a given individual will typically have numerous identity profiles for different applications, both in the public and private sectors. As the data are administered, copied, subsumed into other applications and generally manipulated, the propensity for errors increases accordingly. 

Error inducing processes
Now we have the dual situation of an increasing and diverse biometric data pool, balanced against a similarly increasing and diverse pool of non-biometric data. Correlating between the two may work well enough in most cases, but may occasionally result in a failure to correlate, even though the data do exist, or a false correlation. The problem with the latter is that a false correlation may go unnoticed for a considerable period of time, in turn generating yet more errors as the data proliferate. When all this takes place within a single agency, or even synergistic agencies within the same country, it is complex enough. When data are shared across diverse agencies and across national borders, the situation becomes correspondingly more troublesome. It is an interesting picture. Now add the complexity of different languages and alphabets and one can begin to appreciate that accurate matching and correlation against large datasets is a complex business.
We may choose to normalise the data in some way, before subsuming them into a centralised database (or, more likely, a number of databases that consider themselves to be centralised). But how exactly will this normalisation process work and how will it handle errors? The very process of normalisation is likely to introduce errors of its own, unless a robust, close to real-time checking function can be incorporated and run against the original data. In any event, the normalised data will differ slightly from the host data. The question is whether such differences can materially affect the reliability of identity verification. If several agencies seek to create a centralised pool of normalised data, how will these normalisation processes (and therefore the resultant datasets) differ? If those centralised databases are subject to ongoing administration and manipulation, then there may be a danger that they drift increasingly apart.
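As a minimal sketch of how normalisation can introduce errors of its own, consider a hypothetical rule for name fields that strips diacritics, ignores case and collapses whitespace. The rule, and both function names, are assumptions made purely for illustration; any real agency would define its own. The second function is one simple form of the checking function mentioned above: it compares the normalised values against the originals and reports where distinct originals have been merged.

```python
import unicodedata
from collections import defaultdict


def normalise_name(raw: str) -> str:
    """Hypothetical normalisation rule: drop diacritics, ignore case,
    collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", raw)
    # Remove the combining marks left behind by decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def find_collisions(originals):
    """Check normalised values against the original data and report
    every normalised key that now stands for more than one original."""
    groups = defaultdict(set)
    for raw in originals:
        groups[normalise_name(raw)].add(raw)
    return {key: sorted(values) for key, values in groups.items() if len(values) > 1}
```

With this rule, the names Müller and Muller normalise to the same key, so two records that were distinct at source become indistinguishable in the normalised pool. A second agency applying a slightly different rule would produce a different set of collisions, which is one way in which supposedly centralised datasets may drift apart.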

Grouping of administrations
The situation described above suggests, perhaps, the need for standardised approaches to both data definition and data normalisation, even when applied to different database schemas. The question of feasibility immediately arises. Could we really come up with a set of standards and protocols which could be commonly adopted across diverse agencies? Technically, this would be entirely realistic, but of course, politics enters the picture at this stage and, doubtless, there will exist a plethora of different perspectives in this respect. However, it is likely that these different perspectives have more in common than they might at first suppose. In order to establish that fact, we need some clear proposals that can be aired at the appropriate forums. Such proposals might include the suggestion for a grouping of administrations, possibly by economic or geographic region, in order to systematically evaluate local schemas and identify opportunities for normalisation and eventual standardisation. If this were to prove successful at a regional scale, then the idea could be extrapolated internationally. With a little cooperation and coordination, such an activity would be entirely feasible. In this context, it would be especially valuable if we could also establish an over-arching, non-political vehicle with which technical proposals might be evaluated without prejudice and recommendations made accordingly. This requirement was previously articulated in the paper entitled ‘The Future for Border Management’ [1]. Actually, there may be much that we could achieve in this context, if we have the will to do so.

Identity management
If we consider the consequences of failing to establish such a standardised approach, we may quickly appreciate that the likelihood of errors becomes heightened. As time goes on, errors beget more errors and the biometric sea becomes ever more cloudy. This is an inevitable consequence of not having an adequate control and methodology in place at the right point. It is almost a natural, evolutionary process in fact, albeit one which doesn’t sit too comfortably with our ideals of identity management. It follows then that this idea of commonality of approach and associated standards is one that we might usefully explore. Indeed, we should go even further back and clarify our intentions around identity management altogether. For example, for what purposes does identity management become pertinent, and to what degree? Different applications will suggest different levels.
However, if the methodology is common, or as common as we can make it, then we can just take from it the degree required to satisfy the requirements of the application in question. In such a way, we may also enhance privacy, as we can simply use unique identifiers as proxies where appropriate, without having to reveal swathes of information which are not pertinent to the transaction at hand. This would keep the waters of our biometric sea a little clearer than they might otherwise be. Indeed, with an intelligent use of unique identifiers, in many cases, the host data can be left at their point of origin, with no need to duplicate them across several databases, thus promoting both efficiency and data quality. Even a biometric check may not be necessary in many cases. When it is deemed necessary, then the appropriate mechanism may be invoked in the most efficient manner, using the unique identifier as a key.
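One way to picture the use of a unique identifier as a proxy is sketched below. This is an assumption-laden illustration rather than a proposed design: the class, its methods and the attribute names are all hypothetical. The host agency keeps the profile at its point of origin and issues an opaque identifier; a relying application presents that identifier and receives only a yes or no answer about the one attribute pertinent to the transaction.

```python
import uuid


class HostAgency:
    """Hypothetical agency that holds profiles at their point of origin."""

    def __init__(self, name: str):
        self.name = name
        self._records = {}  # identifier -> profile; never copied elsewhere

    def enrol(self, profile: dict) -> str:
        """Store the profile locally and return an opaque identifier."""
        identifier = uuid.uuid4().hex
        self._records[identifier] = dict(profile)
        return identifier

    def confirm(self, identifier: str, attribute: str, expected) -> bool:
        """Answer yes or no for a single attribute, revealing nothing else."""
        record = self._records.get(identifier)
        return record is not None and record.get(attribute) == expected
```

A relying application would store only the identifier and call `confirm` when needed, for example to check a licence class, without ever holding a copy of the profile. Where a biometric check is deemed necessary, the same identifier could serve as the key with which the host agency is asked to perform it.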

Address deficiencies
In 1951, Rachel Carson’s book ‘The Sea Around Us’ was published to great acclaim by the Oxford University Press. Within its pages, Miss Carson beautifully describes the processes, evolutionary pathways and the resultant life blossoming within our oceans. Oceanographers appreciate the complexity of these processes, relationships and dependencies which, in turn, interact with the broader geosphere, atmosphere and biosphere. Our biometric sea, while undoubtedly less beautiful than its marine counterpart, is nonetheless complex, especially when one takes into account the myriad relationships and dependencies which, together, constitute the whole. Within this sea, we could leave things open to evolutionary chance, in which case it will develop and evolve at its own pace, creating its own collection of data life forms. This would indeed be quite interesting, but would be less likely to afford us the facility and efficiency we are looking for with respect to identity management. In order to realise our aspirations in that respect, we require both an in-depth knowledge of the biometric sea and its workings, and a methodology with which we can orchestrate it into the future. Currently, our understanding is not quite as deep as it might be, and we have rather too many methodologies, none of which is exactly right for our purpose. We need to address these deficiencies if we are to make intelligent progress.

Variables and inconsistencies
Much of the discussion so far has been aligned with biometric and related data processing in isolation. In practice, many of our transactions will involve human interaction. This reality opens another Pandora’s box of variables and inconsistencies. Chief among these is the understanding that, even if technical landscapes were identical, the variances introduced as a factor of human psychology are significant. Much of our understanding of biometric identity verification performance assumes a fairly constant capture situation. In practice, this is rarely the case. The combination of user psychology, environmental conditions and technical performance (often aligned with configuration) introduces variables which may or may not serve to confuse the biometric identity verification process. If such a confusion results in a failure to match, then we tend to notice it. If it results in a false match, we tend not to. Furthermore, the complex assemblage of variables which contrive to cause such a confusion is in itself variable, due to factors such as aging and the relative stability of biometric traits. The genetic mechanism ensures that no biological organism remains absolutely static throughout its lifetime. This includes the phenotypical characteristics we choose as biometrics. If we decide, as a result of this reality, to periodically re-register the biometric, then associated transactions and their stored history will become even more colourful, especially as the periodicity of re-registration varies among applications and administrations. Even in a match of one stored biometric against another (or a list of others) we have the temporal issue to contend with.
A mismatch across time constitutes another type of error, albeit one which is rarely acknowledged. Indeed, transactional irregularities such as those described flow like plankton into the biometric sea, muddying the waters and becoming a source of potential future confusion. Consequently, when our broader correlation involves transactional correlation, this may lead to unexpected results. The solution to this dilemma lies in the development of a deeper understanding of these variables, particularly those aligned with user psychology, which may, in turn, be translated into systems configuration parameters, providing a closer integration of theory with reality.

Conclusion
The objective of this short paper has been to focus our thinking upon the need for repeatable, common processes and attendant protocols, which together will serve to bring a correspondingly closer alignment across agencies with respect to identity verification and the use of biometrics. This, in turn, will serve to rationalise our assumptions and increase efficiency. A further objective has been to highlight the need for a deeper understanding of non-obvious transactional factors, an understanding which, again, may be shared across agencies. When we achieve a closer harmonisation of these things, the waters of our biometric sea will become clearer. The primary question, perhaps, is what will be the vehicle that drives this forward? If it is to be an existing body, then this will need to be communicated and understood. If it is to be a new, impartial body, then this needs to be proposed and approved. Whatever the arrangement, without a clear and defined vision, the management of our biometric sea will become muddled. If this is allowed to continue for too long, the process of subsequent clarification will become increasingly difficult. Such a situation would serve no one well. It is surely time, therefore, to consider such matters while looking to the future.