After the international standardisation working group responsible for machine-readable travel documents (ISO/IEC JTC1/SC17/WG3) raised the question of whether magnification distortion of facial images affects the accuracy of facial recognition, a study was conducted to determine to what degree the algorithms used for face comparison are affected.

This study was described in a previous issue of Keesing’s Journal of Documents and Identity.[1] It involved photographing several hundred enrolees at various distances (0.5 m to 3 m) with an automatic bench. One of the working group’s findings discussed in this article is that magnification distortion of facial images, caused by taking pictures at too close a range, does negatively affect facial recognition. However, for distances over 0.5 m the effect is very limited.

figure 1.

What is magnification distortion?
Taking photographs of subjects at close range results in facial images affected by magnification distortion, as illustrated in Figure 1. Let the distance between the tip of the subject’s nose and the lens of a camera (a standard lens, without any telecentric property) be d, the distance between the eye plane of the subject and the lens, i.e. the camera-subject distance, be d + ∆d, and the height of a structure at nose level be h. Then the structure at nose level appears to have the same size as a structure of height h + ∆h at eye level. The magnification distortion is defined as
$$\frac{\Delta h}{h + \Delta h} = \frac{\Delta d}{d + \Delta d}.$$
If the distance between the nose plane and the eye plane ∆d is 5 cm, the magnification distortions are as in Table 1.
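As an illustration, the distortion values can be recomputed for the ten camera-subject distances used in the study. This is a minimal sketch, assuming that the distortion is ∆d / (d + ∆d) with ∆d = 5 cm; this form reproduces the 10% at 0.5 m and roughly 7.1% at 0.7 m quoted in the conclusions. The function and constant names are illustrative only.

```python
# Sketch: magnification distortion as a function of camera-subject distance.
# Assumption: distortion = delta_d / (d + delta_d), where d + delta_d is the
# camera-subject distance and delta_d is the nose-plane to eye-plane depth.

NOSE_TO_EYE_PLANE_M = 0.05  # delta_d = 5 cm, as stated in the text

# The ten camera-subject distances (d + delta_d) of the study, in metres.
DISTANCES_M = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0, 2.5, 3.0]


def magnification_distortion(camera_subject_distance_m,
                             delta_d_m=NOSE_TO_EYE_PLANE_M):
    """Return delta_d / (d + delta_d) as a fraction (0.10 means 10%)."""
    if camera_subject_distance_m <= delta_d_m:
        raise ValueError("camera-subject distance must exceed delta_d")
    return delta_d_m / camera_subject_distance_m


if __name__ == "__main__":
    for distance in DISTANCES_M:
        percent = 100 * magnification_distortion(distance)
        print(f"{distance:.1f} m -> {percent:.1f}%")
```

Under this assumption the distortion falls off inversely with distance, which is why the differences between the longer distances are small.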

Methodology from enrolment to score calculation
A bench capable of rapidly taking pictures of the same subject under strictly controlled conditions (see Box 1 and Figure 2) had to be created. The bench ensured that for all facial images the capture conditions were the same and in line with the provisions specified in Box 1, except for the camera-subject distance, which ranged from 0.5 m to 3 m. The ten distances were 0.5 m, 0.6 m, 0.7 m, 0.8 m, 0.9 m, 1.0 m, 1.5 m, 2.0 m, 2.5 m, and 3.0 m.

Databases
Using the bench, facial images of as many volunteers as possible were taken at various camera-subject distances and collected in local databases at the premises of members of the study group. The group consisted of KIS S.A.S. and Oberthur Technologies in France, Photo-Me International in the UK, Gemalto in the Czech Republic, Fotofix Schnellphotoautomaten in Germany and Nippon Auto-Photo in Japan. Each test subject participated in only one capture session. The captured facial images were cropped and resized in conformity with the ICAO Draft Technical Report ‘Portrait Quality’ (i.e. ‘ICAO cropped’) and stored in the JPEG file interchange format.

figure 2.

The local databases were encrypted and then sent to the Biometrics Evaluation Laboratory at the Fraunhofer Institute for Computer Graphics Research IGD. There they were processed using various state-of-the-art face recognition algorithms.

The local databases were merged into one consolidated database containing 20 ICAO-cropped facial images of each of the 435 test subjects, i.e. 8,700 images in total. The filenames of all facial images were pseudonymised in such a way that it was not apparent from the filenames which facial images belonged together and at which camera-subject distance they were captured. The consolidated facial image database was divided into a directory of reference facial images and a directory of probe facial images. For each test subject, the directory of reference images contained ten ICAO-cropped facial images from the first pass (one per camera-subject distance), and the directory of probe images contained ten ICAO-cropped facial images from the second pass (one per camera-subject distance). The consolidated facial image database was used for the official run of face comparisons.

The supplier of each participating face comparison algorithm provided Fraunhofer IGD with two software interfaces for its algorithm. The face comparison software was executed there on the facial image database.
Each participating face comparison algorithm compared the features of each reference facial image with the features of each probe facial image. This amounts to 4,350 × 4,350 = 18,922,500 comparisons per algorithm. For each comparison, the file name of the reference image, the file name of the probe image and the comparison score were recorded in a CSV file.
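The exhaustive comparison run can be sketched as follows. This is an illustration only, not the software used at Fraunhofer IGD: the `compare` callable, the function names and the CSV column names are hypothetical stand-ins for a vendor's comparison interface and the actual file layout.

```python
import csv
import itertools


def comparison_counts(n_subjects, n_distances):
    """Return (images per directory, total, mated, non-mated) comparisons."""
    per_directory = n_subjects * n_distances
    total = per_directory * per_directory
    # Each subject contributes n_distances references and n_distances probes.
    mated = n_subjects * n_distances * n_distances
    return per_directory, total, mated, total - mated


def iter_comparisons(reference_files, probe_files, compare):
    """Yield (reference, probe, score) for every reference/probe pair.

    `compare` is a hypothetical callable returning a similarity score.
    """
    for reference, probe in itertools.product(reference_files, probe_files):
        yield reference, probe, compare(reference, probe)


def write_scores(rows, csv_path):
    """Write one CSV row per comparison (column names are assumptions)."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["reference_file", "probe_file", "score"])
        writer.writerows(rows)
```

With 435 subjects and ten distances, `comparison_counts(435, 10)` gives 4,350 images per directory and 18,922,500 comparisons, of which 43,500 are mated.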

Data analysis
Methodology
The research hypothesis was that either the camera-subject distance of a facial image, or different magnification distortions of a facial image and of probe facial images compared with that facial image, had an effect on the usefulness of that facial image as a reference image. The usefulness of biometric samples for distinguishing between mated and non-mated samples is referred to as ‘utility’[3]. If the camera-subject distance of reference images had no effect on the utility of at least one of the participating state-of-the-art face comparison algorithms, the research hypothesis would have been refuted.

False Non-Match Rate (FNMR) at fixed False Match Rate (FMR)
For most of the participating algorithms, the highest non-mated similarity score was lower than the lowest mated similarity score, i.e. the distribution of mated scores was clearly separated from that of non-mated scores, allowing perfect classification by setting the decision threshold between the two distributions of scores. Regardless of the allowed value for FMR > 0%, no false non-match error was observed in 43,500 mated comparisons. According to the ‘Rule of 3’[4], if no error is observed in N independent observations, then with 95% confidence the actual error rate is less than or equal to 3/N. Thus, with 95% confidence, FNMR ≤ 3/43,500 ≈ 0.0069% for most of the participating algorithms.
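The bound can be checked numerically. The sketch below compares the ‘Rule of 3’ approximation with the exact zero-error bound obtained by solving (1 − p)^N = 0.05 for p; the function names are illustrative, and the calculation assumes independent comparisons, as the rule itself does.

```python
import math


def rule_of_three_bound(n_observations):
    """Approximate 95% upper bound on the error rate after zero errors."""
    if n_observations <= 0:
        raise ValueError("n_observations must be positive")
    return 3.0 / n_observations


def exact_zero_error_bound(n_observations, confidence=0.95):
    """Exact upper bound p solving (1 - p)**N = 1 - confidence."""
    if n_observations <= 0:
        raise ValueError("n_observations must be positive")
    # expm1 keeps precision when the bound is very small.
    return -math.expm1(math.log(1.0 - confidence) / n_observations)


if __name__ == "__main__":
    n = 43_500  # mated comparisons with no false non-match
    print(f"Rule of 3: {100 * rule_of_three_bound(n):.4f}%")
    print(f"Exact:     {100 * exact_zero_error_bound(n):.4f}%")
```

The rule of 3 is slightly conservative: the exact bound is marginally smaller, so the stated 0.0069% holds either way.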

Distance-related differences in score distributions
Distance-related differences in score distributions can be significant in real scenarios where mated similarity scores are lower because of other factors affecting recognition performance (such as aging, pose variation, and illumination).

A measure of how well the distributions of mated and non-mated comparison scores are separated is d’ (pronounced ‘d-prime’), defined as
$$d' = \frac{|\mu_m - \mu_n|}{\sqrt{\sigma_m^2 + \sigma_n^2}},$$

where:
• µm is the arithmetic mean of the mated comparison scores;
• µn is the arithmetic mean of the non-mated comparison scores;
• σm is the standard deviation of the mated comparison scores;
• σn is the standard deviation of the non-mated comparison scores[5].
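A minimal sketch of computing d’ from two lists of scores is given below. It assumes the form d’ = |µm − µn| / √(σm² + σn²), which is the form consistent with the interpretation of d’ as |µx| / σx later in this article; some of the literature places half the sum of the variances under the root instead. The use of population standard deviations is also an assumption.

```python
import math
import statistics


def d_prime(mated_scores, non_mated_scores):
    """Separation of mated and non-mated score distributions.

    Assumed form: |mu_m - mu_n| / sqrt(sigma_m**2 + sigma_n**2),
    using population standard deviations.
    """
    mu_m = statistics.fmean(mated_scores)
    mu_n = statistics.fmean(non_mated_scores)
    sigma_m = statistics.pstdev(mated_scores)
    sigma_n = statistics.pstdev(non_mated_scores)
    spread = math.sqrt(sigma_m ** 2 + sigma_n ** 2)
    if spread == 0.0:
        raise ValueError("both score sets are constant; d' is undefined")
    return abs(mu_m - mu_n) / spread


if __name__ == "__main__":
    # Toy scores, purely illustrative and not taken from the study.
    print(d_prime([0.9, 1.1], [0.1, 0.3]))
```

Applied per pair of reference and probe distances, the same function would yield one d’ value per cell of a distance-by-distance grid.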

figure 3.

Figure 3 shows the average d’ values over the three best participating commercial face comparison algorithms as a function of the camera-subject distance of the reference image and the camera-subject distance of the probe image. The individual values are represented as colours. The lowest value is mapped to dark blue and the highest value to dark red.

Practical interpretation of d’
Let Un be a random variable representing a non-mated comparison score. Its mean is µn and its standard deviation is σn. We assume that the probability distribution of Un is a normal distribution fn. Let Um be a random variable representing a mated comparison score. Its mean is µm and its standard deviation is σm. We assume that the probability distribution of Um is a normal distribution fm. Furthermore, we assume that Un and Um are independent.
Then, X = Um – Un is also a random variable with a normal distribution fx. Its mean is µx and its standard deviation is σx. Because Um and Un are independent, their variances add, so the mean and the standard deviation of X are:
$$\mu_x = \mu_m - \mu_n, \qquad \sigma_x = \sqrt{\sigma_m^2 + \sigma_n^2}.$$

Figure 4 shows examples of mated and non-mated score distributions and the distribution of differences of mated and non-mated scores.

In our study, d’ ∈ [12.7; 14.8]. So,
$$d' = \frac{|\mu_x|}{\sigma_x} > 12,$$
i.e. |µx| > 12 · σx.
Consider the least favourable case: let X be normally distributed with mean µx = 12 · σx and standard deviation σx, and let $X' = \frac{X - 12 \cdot \sigma_x}{\sigma_x}$. Then X’ is normally distributed with mean 0 and standard deviation 1, and the standard normal distribution gives $P(X' \le -12) \approx 1.8 \cdot 10^{-33}$, which is below $2.15 \cdot 10^{-32}$. Moreover,
$$P(X' \le -12) = P\left(\frac{X - 12 \cdot \sigma_x}{\sigma_x} \le -12\right) = P(X - 12 \cdot \sigma_x \le -12 \cdot \sigma_x) = P(X \le 0).$$
So, $P(U_m - U_n \le 0) < 2.15 \cdot 10^{-32}$.

figure 4.

If a reference facial image is compared with N facial images, an algorithm returns N scores S1 to SN. We sort these N values in decreasing order to get S’1 to S’N (S’1 is the highest score). A d’ value over 12 means that, for each non-mated comparison, the probability that its score exceeds the mated score, so that the highest score S’1 does not correspond to the right person, is below $2.15 \cdot 10^{-32}$.

If N is the world population of $7 \cdot 10^{9}$ people, the probability of having at least one individual wrongly classified due to magnification distortion is at most $7 \cdot 10^{9} \cdot 2.15 \cdot 10^{-32} = 1.51 \cdot 10^{-22}$. So, in practice it will hardly ever happen.
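The orders of magnitude above can be checked with the standard library. The sketch below evaluates the standard normal tail at –12 through the complementary error function and applies a simple union bound over N comparisons; it relies on the normality and independence assumptions made in this section, and the function names are illustrative. Evaluated this way, the tail probability is about 1.78 · 10⁻³³, which lies below the 2.15 · 10⁻³² figure used in the text.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def misranking_bound(n_comparisons, per_comparison_probability):
    """Union bound on the chance that at least one of N comparisons misranks."""
    return min(1.0, n_comparisons * per_comparison_probability)


# Tail probability for a separation of 12 standard deviations.
tail = standard_normal_cdf(-12.0)

# Bound for a gallery the size of the world population (7e9), using the
# per-comparison figure quoted in the text.
bound = misranking_bound(7e9, 2.15e-32)

if __name__ == "__main__":
    print(f"P(Z <= -12) = {tail:.3e}")
    print(f"Union bound for N = 7e9: {bound:.3e}")
```

Using `math.erfc` rather than `1 - erf` avoids the loss of precision that would otherwise round such a small tail to zero.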

Conclusions
For the participating face verification algorithms, camera-subject distance does not have a great influence on face verification performance. It is worth mentioning that the average d’ value is not at all sensitive to the reference distance above 0.7 m, i.e. below 7.1% magnification distortion. Even at 0.5 m, i.e. at 10% magnification distortion, d’ decreases by only 15%.

So the recommendation derived from the distortion study is to use enrolment systems that create reference passport photographs with a magnification distortion of less than 10%. For an effective automatic face comparison, a target of 7.1% would be ideal. The same recommendation applies to probe systems, for example at border control or for ‘Know Your Customer’ applications: an effective automatic face comparison requires a magnification distortion of less than 10%, i.e. a camera-subject distance above 0.5 m. A distortion of 7.1% would be ideal, which corresponds to a camera-subject distance over 0.7 m.

References
1 Berthe, B. (2017). Face comparison & magnification distortion: Can magnification distortion affect the accuracy of automatic facial recognition? Keesing Journal of Documents & Identity, Vol. 53, pp. 18-21.
2 Amos, B., Ludwiczuk, B. and Satyanarayanan, M. (2016). OpenFace: A general-purpose face recognition library with mobile applications. Technical Report CMU-CS-16-118, Carnegie Mellon University – School of Computer Science.
3 ISO/IEC 29794-1 (2016). Information technology – Biometric sample quality – Part 1: Framework. https://www.iso.org/standard/62782.html.
4 ISO/IEC 19795-1 (2006). Information technology – Biometric performance testing and reporting – Part 1: Principles and framework. https://www.iso.org/standard/41447.html.
5 Bolle, R.M., Pankanti, S. and Ratha, N.K. (2000). Evaluation techniques for biometrics-based authentication systems (FRR). In: Proceedings of the 15th International Conference on Pattern Recognition ICPR, Vol. 2, pp. 831-837.