Temperature is a widely used hyperparameter in tasks involving neural networks, such as classification and metric learning, and its choice can have a direct impact on model performance. Most existing works select its value with hyperparameter optimization methods, which require several runs to find the optimal value. We propose to analyze the impact of temperature on classification tasks by describing a dataset through a set of statistics computed on its representations, from which a heuristic can be built to provide a default temperature value. We study the correlation between these extracted statistics and the observed optimal temperatures. This preliminary study, covering more than a hundred combinations of datasets and feature extractors, shows promising results towards the construction of a general heuristic for temperature.
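For readers unfamiliar with the hyperparameter, the sketch below illustrates one common way temperature enters a classifier: the logits are divided by the temperature before the softmax. The abstract does not specify the exact formulation used in the study, so this formulation, the function name `softmax_with_temperature`, and the example logits are assumptions made purely for illustration.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over the last axis of `logits / temperature`.

    Assumed formulation (not taken from the abstract): lower temperatures
    sharpen the output distribution, higher temperatures flatten it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = np.asarray(logits, dtype=float) / temperature
    # Subtract the row-wise maximum for numerical stability.
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical logits for a three-class problem.
logits = np.array([2.0, 1.0, 0.1])
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=5.0)
print("T=0.5:", sharp)
print("T=5.0:", flat)
```

Under this assumed formulation, the predicted class is unchanged by the temperature, but the confidence assigned to it is not, which is one reason the value can affect the training loss and hence model performance.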