In music source separation, the number of sources may vary from piece to piece, and some sources may belong to the same instrument family, so they share timbral characteristics and are more strongly correlated. Both factors make the separation problem harder. This paper proposes a source separation method for multiple musical instruments sounding simultaneously and explores how much information beyond the audio stream can improve separation quality. We explore conditioning techniques at different levels of a primary source separation network and use two additional data modalities: the presence or absence of each instrument in the mixture, and the corresponding video stream.
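To make the idea of conditioning on instrument presence concrete, the following is a minimal sketch of one common mechanism, feature-wise linear modulation (FiLM). The abstract does not state which conditioning mechanism is used, so FiLM is an assumption here, and the function names, layer shapes, and toy weights are all hypothetical. A binary presence vector is mapped to a per-channel scale and shift, which are then applied to an intermediate feature map of the separation network.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Plain affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film(features: Matrix, presence: Vector,
         w_gamma: Matrix, b_gamma: Vector,
         w_beta: Matrix, b_beta: Vector) -> Matrix:
    """Scale and shift each feature channel using the presence vector.

    `features` has shape [channels][time]; `presence` holds one 0/1 entry
    per instrument (hypothetical layout, not taken from the paper).
    """
    gamma = linear(w_gamma, b_gamma, presence)  # per-channel scale
    beta = linear(w_beta, b_beta, presence)     # per-channel shift
    return [[g * v + b for v in channel]
            for channel, g, b in zip(features, gamma, beta)]


# Toy setup: 3 candidate instruments, 2 feature channels, 2 time steps.
presence = [1.0, 0.0, 1.0]            # instruments 0 and 2 are in the mixture
features = [[1.0, 2.0], [3.0, 4.0]]
w_gamma = [[0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]  # illustrative weights only
b_gamma = [0.0, 0.0]
w_beta = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
b_beta = [0.0, 0.0]

conditioned = film(features, presence, w_gamma, b_gamma, w_beta, b_beta)
```

In a real network the scale and shift weights would be learned, and the same modulation could be applied at an early, middle, or late layer, which is one way to read "different levels" of the primary network. A video-derived embedding could replace the presence vector as the conditioning input under the same scheme.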