Piano covers of pop music are widely enjoyed, yet the task of generating them automatically remains understudied. This is partly due to the lack of synchronized {pop, piano cover} data pairs, which makes it challenging to apply recent data-intensive deep learning methods. To leverage the power of a data-driven approach, we build a large set of paired and synchronized {pop, piano cover} data using an automated pipeline. In this paper, we present Pop2Piano, a Transformer network that generates piano covers given waveforms of pop music. To the best of our knowledge, this is the first model to generate a piano cover directly from pop audio without melody and chord extraction modules. We show that Pop2Piano, trained with our dataset, can generate plausible piano covers.