Deep learning using Convolutional Neural Networks (CNNs) has been shown to significantly outperform many conventional vision algorithms. Despite efforts to increase CNN efficiency both algorithmically and with specialized hardware, deep learning remains difficult to deploy in resource-constrained environments. In this paper, we propose an end-to-end framework for optically computing CNNs in free space, much like a computational camera. Compared to existing free-space optics-based approaches, which are limited to processing single-channel (i.e., grayscale) inputs, we propose the first general approach, based on nanoscale metasurface optics, that can process RGB data directly from natural scenes. Our system achieves up to an order-of-magnitude energy saving and simplifies the sensor design, all while sacrificing little network accuracy.