In traditional software programs, we take for granted how easy it is to debug code by tracing program logic from variables back to inputs, to apply unit tests and assertion statements that block erroneous behavior, and to compose programs together. But as the programs we write grow more complex, it becomes hard to apply traditional software to applications like computer vision or natural language processing. Although deep learning programs have demonstrated strong performance on these applications, they sacrifice many of the functionalities of traditional software programs. In this paper, we work towards combining the benefits of traditional and deep learning programs by jointly training a generative model that constrains neural network activations to "decode" back to inputs. Doing so enables practitioners to probe and track the information encoded in activations, apply assertion-like constraints on what information an activation encodes, and compose separate neural networks together in a plug-and-play fashion. In our experiments, we demonstrate applications of decodable representations to out-of-distribution detection, adversarial examples, calibration, and fairness, while matching standard neural networks in accuracy.
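As a rough illustration of what "jointly training" a decoder alongside a network could mean, the sketch below computes a combined objective for a single example: a classification loss plus a weighted penalty on how poorly the hidden activation decodes back to the input. Everything here is an assumption made for illustration and is not taken from the paper: the one-layer encoder, the linear decoder, the mean-squared-error reconstruction term (a stand-in for a generative model's likelihood), and the names `joint_loss`, `recon_weight`, and `example_params`.

```python
import math


def linear(weights, bias, x):
    """Affine map; `weights` is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def joint_loss(params, x, label, recon_weight=1.0):
    """Task loss plus a penalty when the activation fails to decode to x.

    Hypothetical objective: the paper's actual loss may differ.
    Returns (total, task_term, reconstruction_term).
    """
    h = relu(linear(params["enc_w"], params["enc_b"], x))   # activation
    logits = linear(params["cls_w"], params["cls_b"], h)    # task head
    x_hat = linear(params["dec_w"], params["dec_b"], h)     # decoder
    task = cross_entropy(logits, label)
    recon = mse(x_hat, x)
    return task + recon_weight * recon, task, recon


# Toy parameters (made up): identity encoder and decoder, uninformative head.
example_params = {
    "enc_w": [[1.0, 0.0], [0.0, 1.0]], "enc_b": [0.0, 0.0],
    "cls_w": [[0.0, 0.0], [0.0, 0.0]], "cls_b": [0.0, 0.0],
    "dec_w": [[1.0, 0.0], [0.0, 1.0]], "dec_b": [0.0, 0.0],
}
total, task, recon = joint_loss(example_params, [1.0, 2.0], label=0)
```

With these toy parameters the activation decodes to the input exactly, so the reconstruction term vanishes and only the task loss remains. In a real training loop, both terms would be minimized together by gradient descent, and the same reconstruction error could serve as an assertion-like check on what an activation encodes.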