Deep image-based modeling has received a lot of attention in recent years, yet the parallel problem of sketch-based modeling has only been briefly studied, often as a potential application. In this work, for the first time, we identify the main differences between sketch and image inputs: (i) style variance, (ii) imprecise perspective, and (iii) sparsity. We discuss why each of these differences can pose a challenge and can even make a certain class of image-based methods inapplicable. We study alternative solutions to address each of these differences. By doing so, we derive a few important insights: (i) sparsity commonly results in an incorrect prediction of foreground versus background, (ii) diversity of human styles, if not taken into account, can lead to very poor generalization, and (iii) unless a dedicated sketching interface is used, one cannot expect sketches to match the perspective of a fixed viewpoint. Finally, we compare a set of representative deep single-image modeling solutions and show how their performance on sketch input can be improved by taking the identified critical differences into consideration.