Biometric Entropy, Epigenetics, and Iris Recognition

Prof. John Daugman
Cambridge University, UK

The key property of biometric data that provides resistance against False Matches is its entropy, which reflects both the number of discriminable states (or patterns) that may arise and how broadly the probability distribution is spread across them. Biometric modalities with higher entropy generate templates with greater uniqueness, for essentially the same reason that longer cryptographic keys are stronger. This is especially important for any biometric modality that aspires to be used for identification rather than mere 1-to-1 verification. For example, a weak biometric technology like face recognition may be stretched to perform well at a 1-to-1 False Match rate of 1 in 1,000 on mugshot or studio-quality frontal face images, but at that same performance level a biometric collision (a False Match among the cross-comparisons) becomes more likely than not once there are at least 38 face images in the database, because 38 images already yield 703 distinct pairings. Further key questions for any biometric technology are whether it generates templates that are roughly equidistant between different persons, and whether it has a universal Impostors Distribution. One determinant of these is epigenetics: whether genetic relatedness confers biometric similarity. Typically monozygotic twins (sharing 100% of their genes) have almost indistinguishable faces, but this can also be true for persons sharing just 50% of their genes, and it is not difficult to find "doppelganger" pairs among persons who are completely unrelated. A remarkable property of the IrisCode for iris recognition is that all different persons (indeed, all different eyes) are roughly equidistant from each other in dissimilarity. We generated 316,250 entire distributions of IrisCode impostor scores, each distribution obtained by comparing one iris against hundreds of thousands of others in a database including persons spanning 152 nationalities. Altogether 100 billion iris comparisons were done.
These 316,250 distributions had an extremely narrow distribution of means (interquartile range 0.453 to 0.456) and of standard deviations (interquartile range 0.020 to 0.022). The importance of having such a nearly universal Impostors Distribution is that: (i) a given dissimilarity score threshold can be translated immediately into a False Match probability and a confidence level; (ii) it allows straightforward extrapolation from the 1-to-1 False Match probability (given some score) to the net False Match probability after a 1-to-many identification search; (iii) the number of alternative comparisons that are made before a given best match is encountered can be factored into its interpretation; and finally, (iv) by contrast, if a given biometric modality cannot assume a universal Impostors Distribution, then any observed similarity score must be further qualified by whether the subject is the type of person who has many doppelgangers, or few. These and other issues will be discussed with particular focus on iris recognition. Its resistance to False Matches is now legendary, but it was initially dismissed as a claim in need of debunking, until independent tests at NIST confirmed, on separate databases enabling a trillion IrisCode cross-comparisons, nearly the same table of False Match probabilities that the lecturer had published years earlier, before those databases existed. The roles of these key properties will be illustrated with the Indian Government's UIDAI enrollment, now covering a billion citizens, which involves full cross-comparisons for de-duplication checks. Time permitting, the lecturer will also discuss a Markov generative model for the random texture in the iris, and image quality metrics.
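
The arithmetic behind the 38-image example, and behind the extrapolation from a 1-to-1 False Match probability to a net 1-to-many probability, can be sketched as follows. This is only an illustration and not the lecturer's own code: the function names are hypothetical, and the calculation assumes that all comparisons are statistically independent and share the same 1-to-1 False Match rate.

```python
import math


def collision_probability(n_templates: int, fmr: float) -> float:
    """Chance of at least one False Match among all pairwise
    cross-comparisons of n_templates templates.

    Assumption: every comparison is independent, with the same
    1-to-1 False Match rate fmr (0 <= fmr < 1).
    """
    n_pairs = n_templates * (n_templates - 1) // 2
    # 1 - (1 - fmr)**n_pairs, written with log1p/expm1 so that very
    # small rates do not lose precision.
    return -math.expm1(n_pairs * math.log1p(-fmr))


def smallest_database_with_likely_collision(fmr: float) -> int:
    """Smallest database size at which a collision somewhere in the
    cross-comparisons is more likely than not (probability > 0.5)."""
    n = 2
    while collision_probability(n, fmr) <= 0.5:
        n += 1
    return n


def net_false_match_probability(fmr: float, n_comparisons: int) -> float:
    """Net False Match probability after a 1-to-many search that makes
    n_comparisons independent comparisons at 1-to-1 rate fmr."""
    return -math.expm1(n_comparisons * math.log1p(-fmr))


if __name__ == "__main__":
    print(smallest_database_with_likely_collision(1e-3))
    print(net_false_match_probability(1e-3, 1000))
```

With a False Match rate of 1 in 1,000, the first function reproduces the figure of 38 images quoted above. The second shows why identification is so much more demanding than verification: the net False Match probability grows with every additional comparison made during the search.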


