With the development of deep neural language models, great progress has been made in information extraction in recent years. However, deep learning models often overfit to noisy data points, leading to poor performance. In this work, we examine the role of information entropy in the overfitting process and draw a key insight: overfitting is a process of growing overconfidence and decreasing entropy. Motivated by these properties, we propose TIER-A, a simple yet effective co-regularization joint-training framework, the Aggregation Joint-training Framework with Temperature Calibration and Information Entropy Regularization. Our framework consists of several neural models with identical structures. These models are jointly trained, and we mitigate overfitting by introducing temperature calibration and information entropy regularization. Extensive experiments on two widely used but noisy datasets, TACRED and CoNLL03, demonstrate the correctness of our assumption and the effectiveness of our framework.
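To make the idea concrete, the following is a minimal sketch of a joint-training loss combining temperature-scaled cross-entropy, an entropy regularizer that discourages overconfident (low-entropy) predictions, and an agreement term across identically structured peer models. The function name, hyperparameters, and exact loss form are illustrative assumptions, not the paper's precise formulation.

import torch
import torch.nn.functional as F

def joint_training_loss(logits_list, labels, temperature=2.0,
                        entropy_weight=0.1, agree_weight=1.0):
    """Illustrative co-regularization loss for a list of peer-model logits
    (each of shape [batch, num_classes]) and gold labels [batch]."""
    # Temperature-scaled class distributions for each peer model.
    probs = [F.softmax(logits / temperature, dim=-1) for logits in logits_list]

    # Supervised term: cross-entropy on temperature-scaled logits, averaged over peers.
    ce = sum(F.cross_entropy(logits / temperature, labels)
             for logits in logits_list) / len(logits_list)

    # Entropy regularizer: reward high prediction entropy (i.e. penalize
    # the entropy decrease associated with overconfident fitting of noise).
    entropy = sum(-(p * (p + 1e-12).log()).sum(dim=-1).mean()
                  for p in probs) / len(probs)

    # Co-regularization: pull each peer toward the ensemble-average distribution.
    mean_p = torch.stack(probs).mean(dim=0)
    agree = sum(F.kl_div((p + 1e-12).log(), mean_p, reduction="batchmean")
                for p in probs) / len(probs)

    return ce - entropy_weight * entropy + agree_weight * agree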