Recurrent neural networks are deep learning architectures that can be trained to classify long documents. However, in our recent work we found a critical problem with these networks: they can exploit the length differences between texts of different classes as a prominent classification feature. This produces models that are brittle in the face of concept drift, report misleading performance, and are trivially explainable regardless of text content. This paper illustrates the problem using synthetic and real-world data and provides a simple solution based on weight decay regularization.
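
As a minimal sketch of the mitigation named above (not the authors' exact setup): an LSTM text classifier where weight decay is applied through the optimizer. The model sizes, vocabulary size, and decay coefficient are illustrative assumptions, and the two batches of different lengths stand in for classes whose documents systematically differ in length.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """A small LSTM document classifier (hypothetical sizes for illustration)."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)      # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)      # hidden: (1, batch, hidden_dim)
        return self.fc(hidden.squeeze(0))         # (batch, num_classes)

model = LSTMClassifier()
# Weight decay (L2 regularization) is set on the optimizer; 1e-4 is an assumed
# placeholder, not a coefficient reported in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step per class, with deliberately different
# sequence lengths to mimic the length imbalance discussed in the abstract.
short_batch = torch.randint(1, 10000, (8, 50))    # class 0: shorter documents
long_batch = torch.randint(1, 10000, (8, 200))    # class 1: longer documents
for tokens, label in [(short_batch, 0), (long_batch, 1)]:
    labels = torch.full((tokens.size(0),), label, dtype=torch.long)
    optimizer.zero_grad()
    loss = criterion(model(tokens), labels)
    loss.backward()
    optimizer.step()
```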