Blind Restoration of Real-World Audio by 1D Operational GANs

Objective: Although numerous studies on audio restoration have been proposed in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation and ignore other artifacts. Moreover, it is common practice to assume a noisy or reverberant environment with a limited number of fixed signal-to-distortion ratio (SDR) levels. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture, with varying types, severities, and durations. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics, to enhance the quality of the restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with a generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets, which are corrupted with a random blend of artifacts, each with a random severity, to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneering study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio whilst achieving an unprecedented level of performance for a wide range of SDR levels and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source code and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository.
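The results above are reported as SDR improvements in dB. As a minimal sketch of how such a figure can be computed, the snippet below assumes the basic energy-ratio definition, SDR = 10 log10(||s||^2 / ||s - s_hat||^2), which may differ from the exact evaluation protocol used in the study. The function names `sdr_db` and `sdr_improvement_db` and the toy sine signals are hypothetical and serve only as illustration.

```python
import math


def sdr_db(reference, estimate):
    """SDR in dB under the assumed definition 10*log10(||s||^2 / ||s - s_hat||^2)."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_power = sum(s * s for s in reference)
    if signal_power == 0.0:
        raise ValueError("reference signal has zero energy")
    distortion_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if distortion_power == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / distortion_power)


def sdr_improvement_db(reference, corrupted, restored):
    """Gain in SDR (dB) of the restored signal over the corrupted input."""
    return sdr_db(reference, restored) - sdr_db(reference, corrupted)


# Toy signals (illustrative only): a clean tone plus an interfering tone
# whose amplitude drops from 0.5 to 0.1 after a hypothetical restoration.
n = 1600
clean = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
interference = [math.sin(2 * math.pi * 50 * t / n) for t in range(n)]
corrupted = [c + 0.5 * i for c, i in zip(clean, interference)]
restored = [c + 0.1 * i for c, i in zip(clean, interference)]

improvement = sdr_improvement_db(clean, corrupted, restored)
print(f"SDR improvement: {improvement:.2f} dB")
```

In this toy case the interference amplitude shrinks by a factor of 5, so the improvement is 20 log10(5), about 13.98 dB. An average improvement such as those reported above would be the mean of this quantity over all test clips.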
