This paper studies task-adaptive pre-trained model selection, an \emph{underexplored} problem of assessing pre-trained models so that models suitable for a target task can be selected from the model zoo without fine-tuning. A pilot work~\cite{nguyen_leep:_2020} addressed the problem for transferring supervised pre-trained models to classification tasks, but it cannot handle emerging unsupervised pre-trained models or regression tasks. In pursuit of a practical assessment method, we propose to estimate the maximum evidence (marginalized likelihood) of labels given features extracted by pre-trained models. The maximum evidence is \emph{less prone to over-fitting} than the likelihood, and its \emph{expensive computation can be dramatically reduced} by our carefully designed algorithm. The Logarithm of Maximum Evidence (LogME) can be used to assess pre-trained models for transfer learning: a pre-trained model with a high LogME value is likely to have good transfer performance. LogME is fast, accurate, and general, characterizing it as \emph{the first practical assessment method for transfer learning}. Compared to brute-force fine-tuning, LogME brings over $3000\times$ speedup in wall-clock time. It outperforms prior methods by a large margin in their setting and is applicable to new settings that prior methods cannot deal with. It is general enough to apply to diverse pre-trained models (supervised and unsupervised), downstream tasks (classification and regression), and modalities (vision and language). Code is at \url{https://github.com/thuml/LogME}.
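The evidence estimation described above can be illustrated with a short sketch. This is not the authors' released implementation: it assumes a Bayesian linear model $y \sim \mathcal{N}(Fw, \beta^{-1}I)$ with prior $w \sim \mathcal{N}(0, \alpha^{-1}I)$, tunes the precisions $\alpha, \beta$ by standard MacKay-style fixed-point iteration, and uses an SVD of the feature matrix to keep each evidence evaluation cheap; the function name `logme` and all numeric defaults are illustrative choices.

```python
import numpy as np

def logme(features, labels, max_iter=100, tol=1e-6):
    """Illustrative per-sample log marginal evidence of labels given features."""
    F = np.asarray(features, dtype=np.float64)   # n x d features from a pre-trained model
    y = np.asarray(labels, dtype=np.float64)     # n regression targets (or one class-indicator column)
    n, d = F.shape
    # SVD of F: after this, each (alpha, beta) update costs only O(min(n, d))
    u, s, _ = np.linalg.svd(F, full_matrices=False)
    sigma = s ** 2                               # eigenvalues of F^T F
    z = u.T @ y                                  # projections of y onto left singular vectors
    y2 = float(y @ y)
    alpha, beta = 1.0, 1.0                       # prior and noise precisions
    for _ in range(max_iter):
        t = alpha + beta * sigma
        gamma = float(np.sum(beta * sigma / t))               # effective number of parameters
        m2 = float(np.sum((beta * s * z / t) ** 2))           # ||posterior mean||^2
        res = float(np.sum((alpha * z / t) ** 2) + (y2 - z @ z))  # ||y - F m||^2
        alpha_new = gamma / max(m2, 1e-12)                    # fixed-point updates that
        beta_new = (n - gamma) / max(res, 1e-12)              # maximize the evidence
        converged = (abs(alpha_new - alpha) / alpha < tol
                     and abs(beta_new - beta) / beta < tol)
        alpha, beta = alpha_new, beta_new
        if converged:
            break
    # Evidence at the optimized precisions
    t = alpha + beta * sigma
    m2 = float(np.sum((beta * s * z / t) ** 2))
    res = float(np.sum((alpha * z / t) ** 2) + (y2 - z @ z))
    logdet = float(np.sum(np.log(t))) + max(d - n, 0) * np.log(alpha)  # log|alpha I + beta F^T F|
    evidence = (d / 2 * np.log(alpha) + n / 2 * np.log(beta)
                - n / 2 * np.log(2 * np.pi)
                - beta / 2 * res - alpha / 2 * m2 - logdet / 2)
    return evidence / n                          # normalize by sample count
```

A higher returned value suggests the features explain the labels well under the marginalized (rather than maximized) likelihood, which is what makes the score resistant to over-fitting: complexity is penalized through the $\log|\alpha I + \beta F^{\top}F|$ term rather than by held-out validation.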