Large language models (LLMs) effectively generate fluent text when the target output follows natural language patterns. However, structured prediction tasks confine the output format to a limited ontology, causing even very large models to struggle since they were never trained with such restrictions in mind. The difficulty of using LLMs for direct prediction is exacerbated in few-shot learning scenarios, which commonly arise due to domain shift and resource limitations. We flip the problem on its head by leveraging the LLM as a tool for data augmentation rather than direct prediction. Our proposed Mixture of Soft Prompts (MSP) serves as a parameter-efficient procedure for generating data in a controlled manner. Denoising mechanisms are further applied to improve the quality of synthesized data. Automatic metrics show our method is capable of producing diverse and natural text, while preserving label semantics. Moreover, MSP achieves state-of-the-art results on three benchmarks when compared against strong baselines. Our method offers an alternate data-centric approach for applying LLMs to complex prediction tasks.
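To make the core idea concrete, the abstract's "mixture of soft prompts" can be pictured as combining attribute-specific learnable prompt embeddings before generation. The sketch below is a minimal illustration, not the paper's implementation: the attribute names, dimensions, and the uniform mixing weights are all hypothetical assumptions.

```python
import numpy as np

# Hypothetical sketch: one learnable soft prompt per attribute label,
# mixed into a single prompt that conditions data generation.
# PROMPT_LEN, EMBED_DIM, and the attribute names are illustrative only.
rng = np.random.default_rng(0)
PROMPT_LEN, EMBED_DIM = 4, 8

soft_prompts = {
    "intent:book_flight": rng.normal(size=(PROMPT_LEN, EMBED_DIM)),
    "slot:destination":   rng.normal(size=(PROMPT_LEN, EMBED_DIM)),
}

def mix_prompts(attributes, weights=None):
    """Combine the soft prompts for one example's attributes.

    Defaults to a uniform average; weights is a vector of
    len(attributes) mixing coefficients otherwise.
    """
    stacked = np.stack([soft_prompts[a] for a in attributes])
    if weights is None:
        weights = np.full(len(attributes), 1.0 / len(attributes))
    # Weighted sum over the attribute axis -> (PROMPT_LEN, EMBED_DIM)
    return np.tensordot(np.asarray(weights), stacked, axes=1)

mixed = mix_prompts(["intent:book_flight", "slot:destination"])
print(mixed.shape)  # (4, 8)
```

In a full system, the mixed prompt matrix would be prepended to the token embeddings of a seed example before the frozen LLM generates a synthetic training instance, keeping the tunable parameters limited to the prompts themselves.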