Rail detection, essential for railroad anomaly detection, aims to identify the railroad region in video frames. Although various studies on rail detection exist, neither an open benchmark nor a high-speed network is available in the community, making algorithm comparison and development difficult. Inspired by progress in lane detection, we propose a rail database and a row-based rail detection method. Specifically, we make three contributions: (i) We present a real-world railway dataset, Rail-DB, with 7432 pairs of images and annotations. The images are collected under varied lighting conditions, road structures, and views. The rails are labeled with polylines, and the images are categorized into nine scenes. Rail-DB is expected to facilitate the improvement of rail detection algorithms. (ii) We present an efficient row-based rail detection method, Rail-Net, consisting of a lightweight convolutional backbone and an anchor classifier. We formulate rail detection as a row-based selection problem, which reduces the computational cost compared with segmentation-based alternatives. (iii) We evaluate Rail-Net on Rail-DB with extensive experiments, including cross-scene settings and network backbones ranging from ResNet to Vision Transformers. Our method achieves promising performance in both speed and accuracy. Notably, a lightweight version achieves 92.77% accuracy at 312 frames per second. Rail-Net outperforms the traditional method by 50.65% and the segmentation method by 5.86%. The database and code are available at: https://github.com/Sampson-Lee/Rail-Detection.
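
To make the row-based selection idea concrete, the following is a minimal sketch of how the output of such a classifier might be decoded into rail positions. It is not taken from the Rail-Net code. The function name `decode_rows`, the tensor layout `(num_cells + 1, num_rows, num_rails)`, and the convention that the last class means "no rail in this row" are all assumptions made for illustration.

```python
import numpy as np


def decode_rows(logits, image_width):
    """Turn row-wise classification scores into x-coordinates.

    Assumed (hypothetical) layout: logits has shape
    (num_cells + 1, num_rows, num_rails). For every row anchor and rail,
    the classifier scores each horizontal grid cell; the extra last class
    is assumed to mean "no rail visible in this row".
    """
    num_cells = logits.shape[0] - 1
    # One selection per (row, rail) rather than one decision per pixel.
    choice = logits.argmax(axis=0)
    cell_width = image_width / num_cells
    # Use the centre of the selected cell as the rail's x-position.
    xs = (choice + 0.5) * cell_width
    # Mark rows where the "no rail" class won as missing.
    return np.where(choice == num_cells, np.nan, xs)


# Toy example: 4 grid cells, 3 row anchors, 2 rails, image width 8 pixels.
demo = np.full((5, 3, 2), -1.0)
demo[3, 0, 0] = 1.0  # row 0, rail 0: cell 3 has the highest score
demo[4, 1, 0] = 1.0  # row 1, rail 0: the "no rail" class wins
positions = decode_rows(demo, image_width=8)
```

In this sketch the network makes `num_rows * num_rails` selections per image, whereas a segmentation model classifies every pixel. That difference is the intuition behind the reduced computational cost described above.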