As spatial audio surges in popularity, data-driven machine learning techniques that have proven successful in other domains are increasingly used to process head-related transfer function measurements. However, these techniques require large amounts of data, whereas existing datasets range from tens to the low hundreds of datapoints. It is therefore attractive to combine several of these datasets, even though they were measured under different conditions. In this paper, we first establish the common ground between a number of datasets, then investigate potential pitfalls of mixing datasets. We perform a simple experiment to test the relevance of the remaining differences between datasets when applying machine learning techniques. Finally, we pinpoint the most relevant differences.