Automatically detecting rail track and its fasteners from continuously collected railway images is important for maintenance, as it can significantly improve maintenance efficiency and better ensure system safety. Dominant computer vision-based detection models typically rely on convolutional neural networks that use local image features and cumbersome prior settings to generate candidate boxes. In this paper, we propose a method based on a deep convolutional transformer network to detect multi-class rail components, including the rail, clip, and bolt. We combine the strength of the convolutional structure in extracting latent features from raw images with the strength of transformers in selectively determining valuable latent features, to achieve efficient and accurate rail component detection. Our proposed method simplifies the detection pipeline by eliminating the need for prior settings, such as anchor boxes, aspect ratios, and default coordinates, and for post-processing, such as the threshold for non-maximum suppression; it also allows users to trade off the quality and complexity of the detector with limited training data. Results of a comprehensive computational study show that our proposed method outperforms a set of existing state-of-the-art approaches by large margins.