Observing and recognizing materials is a fundamental part of our daily life. Under typical viewing conditions, we are capable of effortlessly identifying the objects that surround us and recognizing the materials they are made of. Nevertheless, understanding the underlying perceptual processes that take place to accurately discern the visual properties of an object is a long-standing problem. In this work, we perform a comprehensive and systematic analysis of how the interplay of geometry, illumination, and their spatial frequencies affects human performance on material recognition tasks. We carry out large-scale behavioral experiments where participants are asked to recognize different reference materials among a pool of candidate samples. In the different experiments, we carefully sample the information in the frequency domain of the stimuli. From our analysis, we find significant first-order interactions between the geometry and the illumination, of both the reference and the candidates. In addition, we observe that simple image statistics and higher-order image histograms do not correlate with human performance. Therefore, we perform a high-level comparison of highly non-linear statistics by training a deep neural network on material recognition tasks. Our results show that such models can accurately classify materials, which suggests that they are capable of defining a meaningful representation of material appearance from labeled proximal image data. Last, we find preliminary evidence that these highly non-linear models and humans may use similar high-level factors for material recognition tasks.