Despite exciting progress in large-scale language generation, the expressiveness of the learned representations is severely limited by the \textit{anisotropy} issue, whereby hidden representations are confined to a narrow cone in the vector space. To address this issue, we present ContraGen, a novel contrastive learning framework that improves representations with better uniformity and discrimination. We assess ContraGen on a wide range of downstream tasks in natural and programming languages. We show that ContraGen effectively enhances both the uniformity and the discrimination of the representations, yielding the desired improvements on various language understanding tasks where discriminative representations are crucial for attaining good performance. Specifically, we attain a $44\%$ relative improvement on Semantic Textual Similarity tasks and a $34\%$ relative improvement on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraGen also boosts source code generation capability, with a $9\%$ relative improvement in execution accuracy on the HumanEval benchmark.
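To make the two properties concrete: anisotropy can be probed by the mean pairwise cosine similarity of hidden representations (values near $1$ indicate the narrow-cone geometry), and a standard way to encourage uniform yet discriminative representations is an InfoNCE-style contrastive objective over positive pairs within a batch. The sketch below, in PyTorch, illustrates both; it is a generic illustration under these assumptions, not the exact ContraGen objective, and the function names \texttt{info\_nce\_loss} and \texttt{mean\_pairwise\_cosine} are ours.
\begin{verbatim}
import torch
import torch.nn.functional as F

def mean_pairwise_cosine(h):
    """Anisotropy probe: average cosine similarity between all
    distinct pairs of representations h of shape [n, dim].
    Values close to 1 indicate a narrow-cone (anisotropic) geometry."""
    z = F.normalize(h, dim=-1)
    sim = z @ z.t()                      # [n, n] cosine similarities
    n = z.size(0)
    return (sim.sum() - n) / (n * (n - 1))  # drop the diagonal of 1s

def info_nce_loss(h1, h2, temperature=0.05):
    """Generic InfoNCE-style contrastive loss (a sketch, not the
    ContraGen objective). h1, h2: [batch, dim] representations of two
    views of the same inputs (e.g., two dropout-perturbed forward
    passes). Matching rows are positives; other rows are negatives."""
    z1 = F.normalize(h1, dim=-1)
    z2 = F.normalize(h2, dim=-1)
    logits = z1 @ z2.t() / temperature   # [batch, batch] similarity logits
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
\end{verbatim}
Pulling positives together and pushing batch negatives apart spreads the representations over the hypersphere (uniformity) while keeping semantically related inputs close (discrimination), which is the intuition behind the improvements reported above.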