Recent research efforts in optical computing have gravitated towards developing optical neural networks that aim to benefit from the processing speed and parallelism of optics/photonics in machine learning applications. Among these endeavors, Diffractive Deep Neural Networks (D2NNs) harness light-matter interaction over a series of trainable surfaces, designed using deep learning, to compute a desired statistical inference task as the light waves propagate from the input plane to the output field-of-view. Although earlier studies have demonstrated the generalization capability of diffractive optical networks to unseen data, achieving, e.g., >98% image classification accuracy for handwritten digits, these previous designs are in general sensitive to the spatial scaling, translation and rotation of the input objects. Here, we demonstrate a new training strategy for diffractive networks that introduces input object translation, rotation and/or scaling during the training phase as uniformly distributed random variables, building resilience in their blind inference performance against such object transformations. This training strategy successfully guides the evolution of the diffractive optical network design towards a solution that is scale-, shift- and rotation-invariant, which is especially important and useful for dynamic machine vision applications such as autonomous cars and in-vivo imaging of biomedical specimens.
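The training strategy described above, in which each input object is randomly shifted, rotated and/or scaled with uniformly distributed parameters, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`sample_transform`, `apply_transform`, `augment_batch`), the parameter ranges, and the nearest-neighbor resampling are all assumptions chosen for illustration, and the diffractive network itself is not modeled here.

```python
import numpy as np

def sample_transform(rng, max_shift=4.0, max_angle_deg=20.0, scale_range=(0.8, 1.2)):
    """Draw shift, rotation and scale from uniform distributions.

    The default ranges are illustrative assumptions, not values from the paper.
    """
    dx, dy = rng.uniform(-max_shift, max_shift, size=2)
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    return dx, dy, angle, scale

def apply_transform(image, dx, dy, angle, scale):
    """Scale and rotate a 2D image about its center, then shift it.

    Uses inverse mapping with nearest-neighbor sampling; pixels that map
    outside the source image are set to zero.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(float)

    # Undo the shift and move the origin to the image center.
    x0 = xs - cx - dx
    y0 = ys - cy - dy

    # Undo the rotation and the scaling to find each source coordinate.
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    x_src = (cos_a * x0 + sin_a * y0) / scale + cx
    y_src = (-sin_a * x0 + cos_a * y0) / scale + cy

    xi = np.rint(x_src).astype(int)
    yi = np.rint(y_src).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)

    out = np.zeros_like(image)
    out[valid] = image[yi[valid], xi[valid]]
    return out

def augment_batch(images, rng, **kwargs):
    """Apply an independently sampled random transform to every image."""
    return np.stack(
        [apply_transform(img, *sample_transform(rng, **kwargs)) for img in images]
    )
```

In a training loop, a call such as `augment_batch(batch, np.random.default_rng())` would be made on every mini-batch before the inputs are propagated through the diffractive model, so that the network sees a different random transformation of each object at every iteration. Setting a range to zero width (for example `max_shift=0.0`) disables that particular transformation.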