Audio signals are often stored and transmitted in compressed formats. Among the many available audio compression schemes, MPEG-1 Audio Layer III (MP3) is one of the most popular and widely used. Because MP3 is lossy, it leaves characteristic traces in the compressed audio that can be used forensically to expose the history of an audio file. In this paper, we consider audio manipulation performed by temporally splicing compressed and uncompressed audio signals. We propose a method based on transformer networks to find the temporal location of the splices. Our method identifies which portions of an audio signal have undergone single or multiple compression at the frame level, the smallest temporal unit of MP3 compression. We tested our method on a dataset of 486,743 MP3 audio clips. Compared with existing methods, our method achieved higher performance and demonstrated robustness across different MP3 data.
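To make the frame-level formulation concrete, the following is a minimal sketch of how per-frame compression labels and splice positions could be represented. It is not the detection method of this paper: the function names `frame_labels` and `splice_frames`, the region format, and the rule of labeling each frame by the region covering its center sample are all assumptions made for illustration. The only external fact used is that an MPEG-1 Layer III frame holds 1152 samples per channel; frames are assumed to start at sample 0, which ignores encoder delay.

```python
from typing import List, Tuple

# An MPEG-1 Layer III frame carries 1152 samples per channel.
SAMPLES_PER_FRAME = 1152


def frame_labels(
    num_samples: int,
    regions: List[Tuple[int, int, int]],
    frame_len: int = SAMPLES_PER_FRAME,
) -> List[int]:
    """Return one label per full frame: the number of compressions it underwent.

    `regions` is a hypothetical ground-truth description: a list of
    (start_sample, end_sample, n_compressions) with end_sample exclusive.
    A frame takes the count of the first region covering its center sample,
    or 0 if no region covers it.
    """
    labels = []
    for f in range(num_samples // frame_len):
        center = f * frame_len + frame_len // 2
        count = 0
        for start, end, n in regions:
            if start <= center < end:
                count = n
                break
        labels.append(count)
    return labels


def splice_frames(labels: List[int]) -> List[int]:
    """Return the indices of frames whose label differs from the previous frame."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Six frames: singly compressed, then doubly compressed, then singly compressed.
example_regions = [(0, 2304, 1), (2304, 4608, 2), (4608, 6912, 1)]
example_labels = frame_labels(6 * SAMPLES_PER_FRAME, example_regions)
example_splices = splice_frames(example_labels)
```

In this representation, a frame-level detector would predict a sequence such as `example_labels`, and the splice locations follow directly from the positions where consecutive labels change.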