We train an object detector, built from convolutional neural networks, to count interference fringes in elliptical antinode regions visible in frames of high-speed video recordings of transient oscillations in Caribbean steelpan drums illuminated by electronic speckle pattern interferometry (ESPI). The annotations provided by our model, "SPNet," are intended to contribute to the understanding of time-dependent behavior in such drums by tracking the development of sympathetic vibration modes. The system is trained on a dataset of crowdsourced human-annotated images obtained from the Zooniverse Steelpan Vibrations Project. Because the number of human-annotated images is relatively small, we also train on a large corpus of synthetic images whose visual properties have been matched to those of the real images using a Generative Adversarial Network to perform style transfer. Applying the model to predict annotations for thousands of unlabeled video frames, we track features and measure oscillations consistent with audio recordings of the same drum strikes. One surprising result is that the machine-annotated video frames reveal transitions between the first and second harmonics of drum notes that significantly precede the corresponding transitions in the audio recordings. As this paper primarily concerns the development of the model, deeper physical insights await its further application.
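To make the detection task concrete, the following minimal sketch (in PyTorch) shows one way a CNN regressor could map a grayscale ESPI frame to a fixed set of candidate antinode ellipses, each carrying a center, semi-axes, orientation, a fringe count, and an existence score. The class name `EllipseDetector`, the layer sizes, and the output parameterization are illustrative assumptions, not SPNet's actual architecture.

```python
# A minimal sketch of a fringe-counting ellipse detector.
# Assumptions (not from the paper): PyTorch, the layer sizes below,
# and a fixed-length candidate list parameterized as
# [cx, cy, a, b, angle, rings, exists] per ellipse.

import torch
import torch.nn as nn

class EllipseDetector(nn.Module):
    def __init__(self, num_candidates: int = 6, params_per_ellipse: int = 7):
        super().__init__()
        # Small convolutional feature extractor for grayscale frames.
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # One linear head regresses all candidate ellipse parameters at once.
        self.head = nn.Linear(64 * 4 * 4, num_candidates * params_per_ellipse)
        self.num_candidates = num_candidates
        self.params_per_ellipse = params_per_ellipse

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, H, W) grayscale ESPI frame
        z = self.features(x).flatten(1)
        out = self.head(z)
        # (batch, num_candidates, [cx, cy, a, b, angle, rings, exists])
        return out.view(-1, self.num_candidates, self.params_per_ellipse)

if __name__ == "__main__":
    model = EllipseDetector()
    frame = torch.randn(1, 1, 128, 128)  # stand-in for one video frame
    preds = model(frame)
    print(preds.shape)  # torch.Size([1, 6, 7])
```

In practice such a head would be trained with a regression loss on the human- and synthetic-annotated frames, with the existence score gating which candidates count as detections; that training loop is omitted here.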