Assessment of voice signals has long relied on the assumption of periodicity, as this facilitates analysis. The near periodicity of normal voice signals makes short-time harmonic modeling an appealing choice for extracting vocal feature parameters. For dysphonic voice, however, a fixed harmonic structure can be too constrained because it strictly enforces periodicity in the model: slight variations in amplitude or frequency may cause the model to misrepresent the observed signal. To address these issues, this paper presents a time-varying harmonic model whose fundamental frequency and harmonic amplitudes are polynomial functions of time. The model decouples slow deviations of frequency and amplitude from fast irregular vocal fold vibratory behaviors such as subharmonics and diplophonia. The time-varying model is shown to track the frequency and amplitude modulations present in voice with severe tremor. This reduces the sensitivity of model-based harmonics-to-noise ratio measures to slow frequency and amplitude variations while maintaining their sensitivity to increases in turbulent noise and to the presence of irregular vibration. Other uses of the model include vocal tract filter estimation and estimation of the rates of frequency and intensity change. These use cases are demonstrated experimentally, along with the modeling accuracy.
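To make the model structure concrete, the following is a minimal sketch of the forward (synthesis) side of a harmonic model with polynomial fundamental frequency and polynomial harmonic amplitudes, together with a residual-based harmonics-to-noise ratio. It is an illustration under stated assumptions, not the paper's estimation procedure: the function names (`poly_eval`, `poly_integral`, `synthesize`, `hnr_db`), the cosine-only harmonics with zero initial phase, the example coefficients, and the HNR definition as harmonic energy over residual energy are all hypothetical choices made here.

```python
import math


def poly_eval(coeffs, t):
    """Evaluate c0 + c1*t + c2*t^2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def poly_integral(coeffs):
    """Coefficients of the antiderivative, with zero constant term."""
    return [0.0] + [c / (k + 1) for k, c in enumerate(coeffs)]


def synthesize(f0_coeffs, amp_coeffs, fs, duration):
    """Sum of harmonics whose f0 and amplitudes are polynomials in time.

    f0_coeffs:  coefficients of f0(t) in Hz, lowest order first (assumed form)
    amp_coeffs: one coefficient list per harmonic for its amplitude a_k(t)
    fs:         sampling rate in Hz
    duration:   frame length in seconds
    """
    # Integrating f0(t) gives the accumulated number of cycles at time t.
    cycle_coeffs = poly_integral(f0_coeffs)
    n_samples = int(round(fs * duration))
    signal = []
    for i in range(n_samples):
        t = i / fs
        phase = 2.0 * math.pi * poly_eval(cycle_coeffs, t)
        sample = 0.0
        for k, a_k in enumerate(amp_coeffs, start=1):
            # Harmonic k follows k times the fundamental phase.
            sample += poly_eval(a_k, t) * math.cos(k * phase)
        signal.append(sample)
    return signal


def hnr_db(observed, harmonic):
    """HNR in dB, treating observed minus harmonic as the noise (assumption)."""
    e_harm = sum(h * h for h in harmonic)
    e_noise = sum((x - h) ** 2 for x, h in zip(observed, harmonic))
    return 10.0 * math.log10(e_harm / e_noise)


if __name__ == "__main__":
    # Hypothetical 40 ms frame: f0 rises from 120 Hz at 50 Hz/s,
    # the first harmonic decays linearly, the second is constant.
    frame = synthesize([120.0, 50.0], [[1.0, -2.0], [0.5]], 8000, 0.04)
    noisy = [x + 0.01 for x in frame]
    print(len(frame), hnr_db(noisy, frame))
```

Because the slow frequency and amplitude drift is absorbed by the polynomial terms, it ends up in the harmonic component rather than in the residual, which is the intuition behind the reduced sensitivity of the HNR measure described above.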