
Large-scale biometric data collection, management and evaluation

Prof. Patrick J. Flynn
University of Notre Dame

Evaluations of biometric identification technology are increasingly prevalent as broad deployments are contemplated and executed by government and industry.

While the application scenario typically dictates the kind of experiment(s) used in the evaluation, designing and executing an appropriate data collection strategy involves many choices and constraints. The need for statistically reliable estimates of error rates often imposes large lower bounds on the amount of data to be collected, and the imbalance between the proportions of false matches and true matches is tremendous in any large-scale experimental data set. This session will provide some historical perspective on the data set sizes used in biometric ID experiments and discuss at some length a four-year biometric data collection effort underway at Notre Dame that has supported three government biometrics programs, generated terabytes of raw data, and consumed thousands of person-hours of effort from dozens of students. Special attention will be given to the barriers to efficient management, annotation, and postprocessing of large data sets.
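To make the two quantitative points above concrete, here is a minimal Python sketch. It is illustrative only and is not taken from the session: the function names, the 95% confidence level, and the example figures (1000 subjects, 4 samples each, a 0.1% target error rate) are all hypothetical, and the sample-size bound assumes independent trials, which pairwise biometric comparisons generally are not.

```python
from math import ceil, log


def min_trials_zero_errors(max_error_rate, confidence=0.95):
    """Smallest number of independent trials n such that observing zero
    errors supports 'error rate <= max_error_rate' at the given confidence.

    Solves (1 - max_error_rate) ** n <= 1 - confidence for n. At 95%
    confidence this is close to the familiar 'rule of three', n ~ 3 / p.
    """
    return ceil(log(1 - confidence) / log(1 - max_error_rate))


def comparison_counts(subjects, samples_per_subject):
    """Count same-person and different-person sample pairs when every
    sample in the data set is compared against every other sample."""
    total_samples = subjects * samples_per_subject
    all_pairs = total_samples * (total_samples - 1) // 2
    # Pairs of samples that come from the same subject.
    same_person = subjects * (samples_per_subject * (samples_per_subject - 1) // 2)
    # Every remaining pair involves two different subjects.
    different_person = all_pairs - same_person
    return same_person, different_person


if __name__ == "__main__":
    # Hypothetical target: support a 0.1% error rate at 95% confidence.
    print(min_trials_zero_errors(0.001))

    # Hypothetical collection: 1000 subjects, 4 samples each.
    same, different = comparison_counts(1000, 4)
    print(same, different, different // same)
```

Under these assumed figures, roughly three thousand error-free independent trials are needed to support a 0.1% error rate, and different-person pairs outnumber same-person pairs by more than a thousand to one.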
