SpectroMap: Peak detection algorithm for audio fingerprinting

Audio fingerprinting is a technique for identifying and matching audio recordings based on their unique characteristics. It creates a condensed representation of an audio signal that can be compared quickly against other recordings. The fingerprinting process analyzes the audio signal to extract features such as spectral content, tempo, and rhythm. In this paper, we present SpectroMap, an open-source GitHub repository for audio fingerprinting written in Python. It is built around a peak search algorithm that extracts topological prominences from a spectrogram via time-frequency bands. We describe how the algorithm works and evaluate how effectively it handles input data through two experimental applications: one on a high-quality urban sound dataset and one on environmental audio recordings. Finally, we provide two Python scripts that reproduce the proposed case studies, to make our audio fingerprinting system easier to reproduce.
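To make the idea of extracting prominent peaks from a spectrogram more concrete, the following is a minimal sketch, not SpectroMap's actual implementation or API. It assumes that a "time band" can be approximated by a single spectrogram frame and that "topological prominence" can be approximated by the `prominence` criterion of `scipy.signal.find_peaks`. The function name `band_peaks`, the Hann window, and the parameters `nperseg` and `prominence_db` are hypothetical choices made for illustration only.

```python
import numpy as np
from scipy.signal import spectrogram, find_peaks


def band_peaks(signal, fs, nperseg=1024, prominence_db=10.0):
    """Return (time, frequency) pairs of prominent spectrogram peaks.

    Illustrative sketch only: each time frame is scanned along the
    frequency axis, and local maxima whose prominence is at least
    `prominence_db` decibels are kept.
    """
    # Power spectrogram: rows are frequency bins, columns are time frames.
    freqs, times, sxx = spectrogram(signal, fs=fs, window="hann", nperseg=nperseg)

    # Convert to decibels; the small offset avoids taking log10 of zero.
    sxx_db = 10.0 * np.log10(sxx + 1e-12)

    peaks = []
    for j, t in enumerate(times):
        # Indices of frequency bins that are prominent local maxima in frame j.
        idx, _ = find_peaks(sxx_db[:, j], prominence=prominence_db)
        peaks.extend((float(t), float(freqs[i])) for i in idx)
    return peaks
```

Calling `band_peaks(samples, fs)` on a mono signal yields a list of time-frequency points that could serve as the raw material for a fingerprint. A scan along the time axis within each frequency band would follow the same pattern, with `sxx_db[i, :]` in place of `sxx_db[:, j]`.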
