Often in language and other areas of cognition, whether two components of an object are identical or not determines whether it is well formed. We call such constraints identity effects. When developing a system to learn well-formedness from examples, it is easy enough to build in an identity effect. But can identity effects be learned from the data without explicit guidance? We provide a framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference. We then show that a broad class of learning algorithms, including deep feedforward neural networks trained via gradient-based algorithms (such as stochastic gradient descent or the Adam method), satisfies our criteria, depending on the encoding of inputs. In some broader circumstances, we are able to provide adversarial examples that the network necessarily classifies incorrectly. Finally, we demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.