Recent advances in deep learning models for sequence classification have greatly improved their classification accuracy, especially when large training sets are available. However, several works have suggested that under some settings the predictions made by these models are poorly calibrated. In this work we study binary sequence classification problems and look at model calibration from a different perspective by asking the question: Are deep learning models capable of learning the underlying target class distribution? We focus on sparse sequence classification, that is, problems in which the target class is rare, and compare three deep learning sequence classification models. We develop an evaluation that measures how well a classifier learns the target class distribution. In addition, our evaluation disentangles good performance achieved by mere compression of the training sequences from performance achieved by proper model generalization. Our results suggest that in this binary setting the deep learning models are indeed able to learn the underlying class distribution in a non-trivial manner, i.e., by proper generalization beyond data compression.