In this era of rapidly growing populations, trade, and technology, the ability to identify individuals accurately, confidently, and promptly is paramount to any organisation. Similarly, the advent of the internet and globalisation have generated unprecedented amounts of data, and analysing that data is equally important for organisations. In this article, Arnoud van Zuijlen and his team explore Unique Identity and Big Data, two practices that his firm distinguishes as service areas, in a shared context to provide a perspective on what a ‘Big Data for Unique Identity’ system looks like, and how it addresses some of the most relevant challenges that businesses face today.

A wide range of identity services exist to help answer the most fundamental question in a world of proliferating online identities – “Who are you?” – and to effectively establish ‘Unique Identity,’ whether with customers or with citizens.

figure 1. Factors of Unique Identity.

Unique Identity
Unique Identities can be established based on a number of identity factors: something a person has, something a person knows, something a person is, something a person does – or a combination of those four (see Figure 1). On top of this base set of identity factors there is a complex web of relationships and historical interactions that help to shape us as individuals. The Unique Identity Service Platform exploits all of these traits, connecting elements across the spectrum and rendering a person-centric view of the actors that interact with business systems. In this identity world, data can be drawn from a wide range of sources and encounters, such as biometric interactions, biographical data, identity documents, social media, and event history to aid the emergence of a current and holistic perspective of a person’s identity (see Figure 2).
   The concept of Emergent Identity drives our vision for a more robust and business-enabling identity ecosystem, the objective of which is not simply to amass personal data, but rather to securely build enriched profiles with only the relevant data that most accurately represents the true identity of an individual. The march-to-maturity of disruptive identity technologies, such as biometrics-on-the-move and automated document authentication, has greatly advanced progress towards this goal. One look at the identity marketplace confirms that the space is expanding more rapidly than ever, with many new and exciting identity technologies, services, and applications.
   Behind this, common to many disrupted industries, the data challenge looms. Today, a basic level of identity data is a fundamental requirement for any kind of interaction. Beyond that, whilst some prefer to minimise the usage and storage of identity data for privacy reasons, the majority of users enjoy the customised, streamlined services that sharing enhanced identity data can provide. We believe that today’s businesses must find ways to exploit the ‘Big Identity’ opportunity by hooking up the latest biometric and biographic identity information, effectively analysing the wealth of available data, and linking trusted information sources for improved business intelligence.
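To make the idea of a person-centric profile more concrete, the following is a minimal sketch in Python, not a description of the Unique Identity Service Platform itself. The class names (`Factor`, `IdentityEvent`, `IdentityProfile`), the `relevant` flag, and the sample sources are all hypothetical and chosen only for illustration. The sketch shows the four identity factors and a profile that keeps only relevant data rather than amassing everything.

```python
from dataclasses import dataclass, field
from enum import Enum


class Factor(Enum):
    """The four identity factors named in the text."""
    HAS = "something a person has"
    KNOWS = "something a person knows"
    IS = "something a person is"
    DOES = "something a person does"


@dataclass
class IdentityEvent:
    """One encounter or data point (hypothetical structure)."""
    source: str            # illustrative label, e.g. "ePassport"
    factor: Factor
    relevant: bool = True  # assumption: relevance is decided upstream


@dataclass
class IdentityProfile:
    """Person-centric view built up from events over time."""
    person_id: str
    events: list = field(default_factory=list)

    def add(self, event):
        # Keep only relevant data; irrelevant events are not stored.
        if event.relevant:
            self.events.append(event)

    def factors_covered(self):
        # The set of distinct identity factors backing this profile.
        return {event.factor for event in self.events}


profile = IdentityProfile("person-001")
profile.add(IdentityEvent("ePassport", Factor.HAS))
profile.add(IdentityEvent("face image", Factor.IS))
profile.add(IdentityEvent("unrelated web log", Factor.DOES, relevant=False))
```

In this sketch the profile ends up backed by two factors, and the irrelevant event is never stored. A real system would also need to address consent, security, and data retention, which are outside the scope of this illustration.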

figure 2. Emergent Identity.

Big Data
The concept of big data is typically characterised by three key attributes:
• data variety: the diversity of sources;
• data volume: the amount of data to be handled;
• data velocity: the speed at which the data must be processed.

In addition to these core ‘three Vs’, there are other characteristics that further complicate the picture, such as:
• data value: what is the value of the processed data?
• data veracity: is the data to be trusted?
• data motion: will data come in batches or streams?

These data-relevant variables and others like them are key considerations to keep in mind when scoping an appropriate big data solution. 
Discovery activities, steered by quantitative analyses, that reveal patterns, trends, and associations amongst the data are the key to deriving value from large datasets. However, to unlock the value hidden in the data, organisations must start treating data as a supply chain. A modern data supply chain begins when data is created, imported, or combined with other data. It then incrementally acquires value as the data moves forward through the links in the chain. Hence, a big data platform must be well-configured in order to effectively manage the variety, velocity and volume of data flowing through the chain.

figure 3. Big Data analytics methodology.

This process of expediting the data’s progress, or ‘data acceleration’, plays a major role in discovery by reducing data-related challenges such as movement and processing. For instance, tagging and filtering should be employed as data acceleration strategies to best direct the traffic of data, which improves data veracity by efficiently separating relevant information from noise. If we consider three meshing gears to symbolise Accenture’s big data analytics methodology, data acceleration would be what maintains the gears’ smooth and frictionless motion (see Figure 3).
   Of course, the chain ends with actionable insight, which is applied to yield business value, typically by optimising and influencing future decisions.
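The tagging and filtering strategies mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Accenture's implementation: the function names `tag` and `accelerate`, the record layout, and the rule that decides what counts as identity data are all hypothetical.

```python
def tag(record):
    """Attach a coarse tag based on the record's source (hypothetical rule)."""
    identity_sources = {"biometric", "document", "biographic"}
    tagged = dict(record)  # copy so the incoming record is left untouched
    tagged["tag"] = "identity" if record["source"] in identity_sources else "noise"
    return tagged


def accelerate(records):
    """Tag each record as it arrives and pass on only the relevant ones."""
    for record in records:
        tagged = tag(record)
        if tagged["tag"] == "identity":
            yield tagged


# Illustrative input; the sources and payloads are made up.
incoming = [
    {"source": "biometric", "payload": "face-template"},
    {"source": "ad-tracker", "payload": "banner-click"},
    {"source": "document", "payload": "passport-scan"},
]

relevant = list(accelerate(incoming))
```

Because `accelerate` is a generator, it works on a batch (a list) and on a stream (any iterator) alike, which matches the ‘data motion’ consideration raised earlier.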

figure 4. Today’s biometric technologies.

Identity data is big data
As the uptake of online services by citizens and consumers soars, and the demand for truly strong identity rockets across industries, the identity data in circulation exhibits more big data-like trends. In many scenarios, the variety, volume, and velocity of identity data either already meets or closely approaches conventional big data classifications. This converging relationship presents a fertile frontier for disruptive change in the Unique Identity business.

Variety
Identity data sources have become impressively diverse in recent years. Firstly, the range of identity sources has grown dramatically, with new sources coming onstream – the social media explosion has the potential to yield vast expanses of personally identifiable data, for example. Secondly, the ability of technologies to obtain identity data from previously unproductive data sources has improved: algorithms have become dramatically more adept at mining identifiable data from multimedia. Thirdly, the maturation of biometric technologies has meant that system designers have many more options than the traditional ‘fingerprint, face, or iris’; behavioural, 3D face, voice, ECG, and even eye-vein modalities are increasingly viable (see Figure 4). These trends are driving a huge increase in the variety of identity data that today’s and tomorrow’s systems must handle.

Volume
The days of ‘username and password’ are over. An increasingly varied identity data landscape trends towards a corresponding growth in identity data volumes. Beyond that linear trend, the sharp uptake of multimedia for identity purposes (whether overt or covert) causes identity data sizes to balloon. Overall, biometric usage is becoming remarkably commonplace as biometrics quickly expands to bold new territories such as mobile devices. While fingerprinting technology is quite mature, using it to unlock your mobile device is certainly a new application. The growing need for digital security is ushering in biometrics as a norm, and the trend will likely continue to spread to tablets as well as other platforms. Additionally, early adopters of biometrics contribute significantly to the existing volume of identity data. For example, government institutions such as the Department of Homeland Security and the Federal Bureau of Investigation each manage hundreds of terabytes of biometric data. One thing is certain: whether it is on new or established ground, the size of existing identity data will only continue to increase, and at a considerable rate.

Velocity
Along with the diversity and abundance of identity data, the need to use this data faster and more pervasively throughout business operations has also surged. Public sector and commercial entities alike are increasingly considering identity information essential to their core operation. Governments are more closely relating citizen identities to biometric data, and commercial businesses are leveraging unique identity benefits to improve customer experience at authentication. Recent research reports estimate that global biometrics market revenues will reach USD 20 billion by 2018. These investments and market size figures further support trends that indicate snowballing momentum for improved identity services, which portends rising rates of identity data production and processing.

Value creation
The aggregate body of identity data that currently exists may have the variety, volume, and velocity indicative of big data, but this wealth of information is largely meaningless as long as it remains unanalysed. Advancements in scalable data warehousing such as Hadoop, elastic cloud architectures, and passive biometric recognition such as iris-at-a-distance make a system where big data and unique identity complement each other ever more practical. Figure 5 below serves as a very high-level example of a solution that integrates unique identity and big data capabilities. This illustrative overview includes previously mentioned technologies (cloud computing and distributed data frameworks) that are emerging as key aspects of big data systems. However, it should be noted that these components are not necessary for architecting a functional solution.

figure 5. Sample solution overview.

Business scenarios
Border control: an illustrative example
From a unique identity perspective, border agency staff members need no longer be tasked with visually verifying each traveller’s identity. Instead, the traveller’s identity can be biometrically verified against the passport photo contained in an ePassport through facial recognition software, which is far more reliable than human judgment. Moreover, if the identity solution in place employs an emergent identity approach, the system can use previous identity events from the individual’s data profile – for example prior encounters at the border, criminal history, and tax delinquency – to make better informed decisions regarding entry.
   Another apparent business benefit is improved resistance to circumvention: big data analysis will reveal patterns of individuals attempting to defraud the system, which can identify and expose previously unaddressed system vulnerabilities, such as insider threats.
    Additionally, the platform can leverage data streams from various biometric solution elements across the system to easily calculate metrics such as false acceptance rate (FAR), false rejection rate (FRR), failure to enrol rate (FTE), and mean time to failure (MTTF).

Here are a few examples of why this is significant:
• Such metrics are input for continuous performance optimisation/improvement of the system.
• Metrics such as FAR, FRR and FTE can be used for guidance on future vendor selection and product procurement decisions for biometric algorithms and devices/models.
• MTTF patterns can predict when different biometric sensors are likely to fail, which allows for proactive instead of reactive agency protocols.
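As a rough illustration of how the metrics named above might be derived from logged data, the sketch below uses their commonly used definitions: FAR as the share of impostor attempts that were accepted, FRR as the share of genuine attempts that were rejected, FTE as the share of enrolment attempts that failed, and MTTF as the average operating time before a sensor fails. The function names, the record layout, and the sample figures are hypothetical, and the sketch assumes that ground truth (whether an attempt was genuine) is known, which is usually only the case in evaluation or audit settings.

```python
def rate(count, total):
    """Return count / total, or 0.0 when there were no attempts."""
    return count / total if total else 0.0


def biometric_metrics(attempts, enrolments, hours_to_failure):
    """Compute FAR, FRR, FTE and MTTF from simple logs (hypothetical layout).

    attempts:         list of {"genuine": bool, "accepted": bool}
    enrolments:       list of bool, True when enrolment succeeded
    hours_to_failure: operating hours each failed sensor lasted
    """
    impostor = [a for a in attempts if not a["genuine"]]
    genuine = [a for a in attempts if a["genuine"]]

    far = rate(sum(a["accepted"] for a in impostor), len(impostor))
    frr = rate(sum(not a["accepted"] for a in genuine), len(genuine))
    fte = rate(sum(not ok for ok in enrolments), len(enrolments))
    mttf = (sum(hours_to_failure) / len(hours_to_failure)
            if hours_to_failure else None)

    return {"FAR": far, "FRR": frr, "FTE": fte, "MTTF_hours": mttf}


# Made-up sample data, for illustration only.
attempts = (
    [{"genuine": True, "accepted": True}] * 3
    + [{"genuine": True, "accepted": False}]
    + [{"genuine": False, "accepted": False}] * 3
    + [{"genuine": False, "accepted": True}]
)
enrolments = [True, True, True, True, False]
hours_to_failure = [1000, 1400]

metrics = biometric_metrics(attempts, enrolments, hours_to_failure)
```

Computed per device model or per algorithm version, figures like these could feed the optimisation, procurement, and maintenance decisions listed above.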

This small number of useful applications from a handful of metrics represents only the tip of the iceberg. Indeed, pairing big data capabilities with identity solutions presents significant value and can be applied far beyond this specific example to a variety of other business scenarios. 

Banking
Like governments, more and more corporations are implementing identity solutions that feature biometric technologies. This means that many of the previously referenced big data advantages also pertain to commercial enterprises. This relevance will certainly continue to grow as these large-scale systems become more prevalent in the private sector. In both the public and the private sector, unique identity capabilities are becoming a vital element of operations as a means to enhance security and the user experience while reducing total cost of ownership (TCO).
For the financial services industry, extensible identity services are inherently linked with business success. To combat in-person and online identity fraud, major banks are now pioneering concepts such as finger vein ATM access and behavioural biometrics as a secondary layer of intuitive authentication. With millions of users, thousands of devices, and consistently heavy traffic, these systems are particularly ripe for a big data-centric identity solution.

Conclusion
In both public and private sectors, Accenture’s Unique Identity service and its powerful concept of Emergent Identity lead the charge on the frontier towards building advanced identity solutions. It is clear, however, that the role of big data is becoming increasingly relevant and compatible with our mission. The added benefit of big data to the platform stems from analysing the wealth of identity data interacting with the system, and utilising the insights gleaned to enhance operational efficiency. This marriage of unique identity with big data promises tremendous business value and sets the path to our vision of a more robust identity ecosystem.