This paper studies task-adaptive pre-trained model selection, an \emph{underexplored} problem of assessing pre-trained models so that models suitable for a target task can be selected from the model zoo without fine-tuning. A pilot work~\cite{nguyen_leep:_2020} addressed the problem for transferring supervised pre-trained models to classification tasks, but it cannot handle emerging unsupervised pre-trained models or regression tasks. In pursuit of a practical assessment method, we propose to estimate the maximum evidence (marginalized likelihood) of labels given features extracted by pre-trained models. The maximum evidence is \emph{less prone to over-fitting} than the likelihood, and its \emph{expensive computation can be dramatically reduced} by our carefully designed algorithm. The Logarithm of Maximum Evidence (LogME) can be used to assess pre-trained models for transfer learning: a pre-trained model with a high LogME value is likely to have good transfer performance. LogME is fast, accurate, and general, characterizing it as \emph{the first practical assessment method for transfer learning}. Compared to brute-force fine-tuning, LogME brings over $3000\times$ speedup in wall-clock time. It outperforms prior methods by a large margin in their setting and is applicable to new settings that prior methods cannot deal with. It is general enough to apply to diverse pre-trained models (supervised and unsupervised), downstream tasks (classification and regression), and modalities (vision and language). Code is at \url{https://github.com/thuml/LogME}.
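The evidence estimation described above can be illustrated with a short sketch. This is not the authors' released implementation: it assumes a Bayesian linear model $y \sim \mathcal{N}(Fw, \beta^{-1}I)$ with prior $w \sim \mathcal{N}(0, \alpha^{-1}I)$, tunes the precisions $\alpha, \beta$ by standard MacKay-style fixed-point iteration, and uses an SVD of the feature matrix to keep each evidence evaluation cheap; the function name `logme` and all numeric defaults are illustrative choices.

```python
import numpy as np

def logme(features, labels, max_iter=100, tol=1e-6):
    """Illustrative per-sample log marginal evidence of labels given features."""
    F = np.asarray(features, dtype=np.float64)   # n x d features from a pre-trained model
    y = np.asarray(labels, dtype=np.float64)     # n regression targets (or one class-indicator column)
    n, d = F.shape
    # SVD of F: after this, each (alpha, beta) update costs only O(min(n, d))
    u, s, _ = np.linalg.svd(F, full_matrices=False)
    sigma = s ** 2                               # eigenvalues of F^T F
    z = u.T @ y                                  # projections of y onto left singular vectors
    y2 = float(y @ y)
    alpha, beta = 1.0, 1.0                       # prior and noise precisions
    for _ in range(max_iter):
        t = alpha + beta * sigma
        gamma = float(np.sum(beta * sigma / t))               # effective number of parameters
        m2 = float(np.sum((beta * s * z / t) ** 2))           # ||posterior mean||^2
        res = float(np.sum((alpha * z / t) ** 2) + (y2 - z @ z))  # ||y - F m||^2
        alpha_new = gamma / max(m2, 1e-12)                    # fixed-point updates that
        beta_new = (n - gamma) / max(res, 1e-12)              # maximize the evidence
        converged = (abs(alpha_new - alpha) / alpha < tol
                     and abs(beta_new - beta) / beta < tol)
        alpha, beta = alpha_new, beta_new
        if converged:
            break
    # Evidence at the optimized precisions
    t = alpha + beta * sigma
    m2 = float(np.sum((beta * s * z / t) ** 2))
    res = float(np.sum((alpha * z / t) ** 2) + (y2 - z @ z))
    logdet = float(np.sum(np.log(t))) + max(d - n, 0) * np.log(alpha)  # log|alpha I + beta F^T F|
    evidence = (d / 2 * np.log(alpha) + n / 2 * np.log(beta)
                - n / 2 * np.log(2 * np.pi)
                - beta / 2 * res - alpha / 2 * m2 - logdet / 2)
    return evidence / n                          # normalize by sample count
```

A higher returned value suggests the features explain the labels well under the marginalized (rather than maximized) likelihood, which is what makes the score resistant to over-fitting: complexity is penalized through the $\log|\alpha I + \beta F^{\top}F|$ term rather than by held-out validation.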