We apply a large multilingual language model (BLOOM-176B) to open-ended generation of Chinese song lyrics, and have human reviewers evaluate the resulting lyrics for coherence and creativity. We find that current computational metrics for evaluating large language model outputs (MAUVE) have limitations when applied to creative writing. We note that the human concept of creativity requires lyrics to be both comprehensible and distinctive, and that human reviewers rate certain types of machine-generated lyrics more highly than real lyrics by popular artists. Inspired by the inherently multimodal nature of album releases, we leverage a Chinese-language stable diffusion model to produce high-quality lyric-guided album art, demonstrating a creative approach for an artist seeking inspiration for an album or single. Finally, we introduce the MojimLyrics dataset, a Chinese-language dataset of popular song lyrics for future research.