Effective training of deep neural networks can be challenging, and there remain many open questions on how to best learn these models. Recently developed methods to improve neural network training examine teaching: providing learned information during the training process to improve downstream model performance. In this paper, we take steps towards extending the scope of teaching. We propose a flexible teaching framework using commentaries, learned meta-information helpful for training on a particular task. We present gradient-based methods to learn commentaries, leveraging recent work on implicit differentiation for scalability. We explore diverse applications of commentaries, from weighting training examples, to parameterising label-dependent data augmentation policies, to representing attention masks that highlight salient image regions. We find that commentaries can improve training speed and/or performance, and provide insights about the dataset and training process. We also observe that commentaries generalise: they can be reused when training new models to obtain performance benefits, suggesting a use-case where commentaries are stored with a dataset and leveraged in future for improved model training.
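One concrete instance of a commentary mentioned above is per-example training weights learned by gradient descent on a validation objective. The following is a minimal sketch of that idea, not the paper's actual method: it uses a one-parameter linear model, a single unrolled inner SGD step (rather than implicit differentiation), and toy data in which one training label is corrupted. All data, hyperparameters, and variable names here are illustrative assumptions.

```python
import numpy as np

# Toy commentary: per-example weights w, learned by differentiating
# the validation loss through one inner SGD step on the model theta.
# This is an illustrative sketch, not the paper's implementation.

x_tr = np.array([1.0, 2.0, 3.0])
y_tr = np.array([2.0, 4.0, 0.0])   # third label is corrupted (true y = 2x)
x_va = np.array([1.5, 2.5])
y_va = np.array([3.0, 5.0])        # clean validation data from y = 2x

alpha, beta = 0.01, 0.05           # inner / outer learning rates (assumed)
w = np.ones(3)                     # commentary: one weight per training example
theta = 0.0                        # linear model y_hat = theta * x

def val_loss(t):
    return np.sum((t * x_va - y_va) ** 2)

for _ in range(200):
    # Inner step: gradient of the weighted training loss w.r.t. theta.
    resid = theta * x_tr - y_tr
    grad_theta = np.sum(w * 2.0 * x_tr * resid)
    theta_new = theta - alpha * grad_theta

    # Outer step: chain rule through the inner update gives
    # d val_loss / d w_i = (d val_loss / d theta_new) * (-alpha * 2 x_i resid_i).
    g_val = np.sum(2.0 * x_va * (theta_new * x_va - y_va))
    grad_w = g_val * (-alpha * 2.0 * x_tr * resid)
    w = np.clip(w - beta * grad_w, 0.0, None)  # keep weights nonnegative
    theta = theta_new

# After meta-training, the corrupted example's weight should be pushed down,
# and theta should approach the clean-data solution of 2.
```

With the corrupted third example down-weighted, the weighted least-squares fit moves toward the clean slope of 2; the learned weights could then be reused when training a fresh model on the same data, mirroring the reuse scenario described in the abstract. Scaling this one-step unrolling to deep networks is what motivates the implicit-differentiation machinery referenced above.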