Shoe tread impressions are one of the most common types of evidence left at crime scenes. However, the utility of such evidence is limited by the lack of databases of footwear impression patterns that cover the huge and growing number of distinct shoe models. We propose to address this gap by leveraging shoe tread photographs collected by online retailers. The core challenge is to predict the impression pattern from the shoe photograph, since ground-truth impressions or 3D shapes of tread patterns are not available. We develop a model that performs intrinsic image decomposition (predicting depth, normal, albedo, and lighting) from a single tread photo. Our approach, which we term ShoeRinsics, combines domain adaptation and re-rendering losses in order to leverage a mix of fully supervised synthetic data and unsupervised retail image data. To validate model performance, we also collected a set of paired shoe-sole images and corresponding prints and defined a benchmarking protocol to quantify the accuracy of predicted impressions. On this benchmark, ShoeRinsics outperforms existing methods for depth prediction and synthetic-to-real domain adaptation.
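To make the training objective concrete, the sketch below shows how a re-rendering loss lets unlabeled retail photos supervise the decomposition alongside fully labeled synthetic data: synthetic images carry ground-truth intrinsics, while real images are only constrained to reconstruct themselves when the predicted intrinsics are re-rendered. This is a minimal illustration, not the paper's exact formulation; the `decomposer` and `renderer` module names, the L1 losses, and the loss weighting are all assumptions.

```python
import torch
import torch.nn.functional as F

def training_losses(decomposer, renderer, syn_batch, real_img):
    """One hypothetical training step for a ShoeRinsics-style model.

    `decomposer` maps an RGB tread photo to (depth, normal, albedo, light);
    `renderer` composes those intrinsics back into an RGB image.
    """
    # Supervised branch: synthetic images come with ground-truth intrinsics.
    img_s, depth_gt, normal_gt, albedo_gt = syn_batch
    depth, normal, albedo, light = decomposer(img_s)
    sup_loss = (F.l1_loss(depth, depth_gt)
                + F.l1_loss(normal, normal_gt)
                + F.l1_loss(albedo, albedo_gt))

    # Unsupervised branch: real retail photos have no labels, so we only
    # require that the predicted intrinsics re-render back to the input.
    depth_r, normal_r, albedo_r, light_r = decomposer(real_img)
    rerendered = renderer(depth_r, normal_r, albedo_r, light_r)
    rerender_loss = F.l1_loss(rerendered, real_img)

    # The 0.5 weight on the re-rendering term is an illustrative guess.
    return sup_loss + 0.5 * rerender_loss
```

In the full method, a domain-adaptation component would additionally align the synthetic and real image domains so the decomposer transfers across them; that part is omitted from this sketch.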