Foundation models offer an exciting new paradigm for constructing models with out-of-the-box embeddings and a few labeled examples. However, it is not clear how to best apply foundation models without labeled data. A potential approach is to fuse foundation models with weak supervision frameworks, which use weak label sources -- pre-trained models, heuristics, crowd-workers -- to construct pseudolabels. The challenge is building a combination that best exploits the signal available in both foundation models and weak sources. We propose Liger, a combination that uses foundation model embeddings to improve two crucial elements of existing weak supervision techniques. First, we produce finer estimates of weak source quality by partitioning the embedding space and learning per-part source accuracies. Second, we improve source coverage by extending source votes in embedding space. Despite the black-box nature of foundation models, we prove results characterizing how our approach improves performance and show that lift scales with the smoothness of label distributions in embedding space. On six benchmark NLP and video tasks, Liger outperforms vanilla weak supervision by 14.1 points, weakly-supervised kNN and adapters by 11.8 points, and kNN and adapters supervised by traditional hand labels by 7.2 points.
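The two embedding-based elements described above can be illustrated with a minimal NumPy sketch. It is not the Liger implementation: the function names, the `radius` threshold, the fixed partition centers, and the toy data are all assumptions made for illustration, and agreement with the majority vote is used here only as a crude stand-in for the per-part source accuracy estimates.

```python
import numpy as np

ABSTAIN = 0  # votes are in {-1, 0, +1}; 0 means the source abstained


def extend_votes(votes, emb, radius):
    """Copy each source's nearest non-abstaining vote onto points where it
    abstained, provided that neighbor lies within `radius` in embedding space."""
    # Pairwise Euclidean distances between all embeddings, shape (n, n).
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    extended = votes.copy()
    for j in range(votes.shape[1]):
        voted = np.flatnonzero(votes[:, j] != ABSTAIN)
        if voted.size == 0:
            continue  # this source never votes, so there is nothing to extend
        for i in np.flatnonzero(votes[:, j] == ABSTAIN):
            nearest = voted[np.argmin(dists[i, voted])]
            if dists[i, nearest] <= radius:
                extended[i, j] = votes[nearest, j]
    return extended


def assign_parts(emb, centers):
    """Partition points by nearest center (centers are assumed to be given)."""
    d = np.linalg.norm(emb[:, None, :] - centers[None, :, :], axis=-1)
    return d.argmin(axis=1)


def per_part_agreement(votes, parts, n_parts):
    """Per-part, per-source rate of agreement with the majority vote.
    A crude proxy for source accuracy, NOT the estimator used in the paper."""
    majority = np.sign(votes.sum(axis=1))
    rates = np.full((n_parts, votes.shape[1]), np.nan)
    for p in range(n_parts):
        in_part = parts == p
        for j in range(votes.shape[1]):
            mask = in_part & (votes[:, j] != ABSTAIN) & (majority != 0)
            if mask.any():
                rates[p, j] = np.mean(votes[mask, j] == majority[mask])
    return rates


# Toy data (hypothetical): two tight clusters in a 2-D embedding space.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
# Source 0 abstains on points 1 and 3; source 2 always votes +1.
votes = np.array([[1, 1, 1], [0, 1, 1], [-1, -1, 1], [0, -1, 1]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

extended = extend_votes(votes, emb, radius=0.5)
parts = assign_parts(emb, centers)
rates = per_part_agreement(extended, parts, n_parts=2)
print(extended[:, 0])  # source 0 now votes on every point
print(rates)           # source 2 agrees in part 0 but not in part 1
```

In this toy setup, source 0 gains coverage on the two points where it abstained, and source 2 looks reliable in one part of the embedding space and unreliable in the other, which a single global accuracy estimate would average away.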