Even with strong sequence models like Transformers, generating expressive piano performances with long-range musical structure remains challenging. Meanwhile, methods for composing well-structured melodies or lead sheets (melody + chords), i.e., simpler forms of music, have been more successful. Motivated by this observation, we devise a two-stage Transformer-based framework that Composes a lead sheet first, and then Embellishes it with accompaniment and expressive touches. This factorization also enables pretraining on non-piano data. Our objective and subjective experiments show that Compose & Embellish halves the gap in structureness between a current state-of-the-art model and real performances, and also improves other musical aspects such as richness and coherence.