Face recognition technology has advanced significantly in recent years, due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and therefore have limited utility in more advanced security, forensics, and military applications. These applications must handle imagery captured at lower resolution, longer ranges, and elevated viewpoints. To meet these critical needs, we collected and curated the first and second subsets of a large multi-modal biometric dataset designed for use in the research and development (R&D) of biometric recognition technologies under extremely challenging conditions. Thus far, the dataset includes more than 350,000 still images and over 1,300 hours of video footage of approximately 1,000 subjects. To collect this data, we used Nikon DSLR cameras, a variety of commercial surveillance cameras, specialized long-range R&D cameras, and Group 1 and Group 2 UAV platforms. The goal is to support the development of algorithms capable of accurately recognizing people at ranges up to 1,000 m and from high angles of elevation. These advances will include improvements to the state of the art in face recognition and will support new research in the area of whole-body recognition using methods based on gait and anthropometry. This paper describes the methods used to collect and curate the dataset, and the dataset's characteristics at the current stage.