The adoption of technology in cricket has increased significantly in recent years. This trend has led to duplicated effort across similar computer vision-based research projects. Our research addresses one such problem by segmenting ball deliveries in a cricket broadcast using the deep learning models MobileNet and YOLO, thus enabling researchers to use our output as a dataset for their own work. The output can also be used by cricket coaches and players to analyze the deliveries bowled during a match. This paper presents an approach to segment and extract only those video shots in which a ball is being delivered. A video shot is a continuous series of frames that together make up one scene of the video. Object detection models are applied to reach a high level of accuracy in correctly extracting video shots. We present a proof of concept for building large datasets of ball-delivery video shots, which paves the way for further processing of those shots to extract semantics. As a sample of the usefulness of the proposed dataset, the ball is also tracked in these video shots using a separate RetinaNet model. The position on the cricket pitch where the ball lands is extracted by tracking the ball along the y-axis, and the video shot is then classified as a full-pitched, good-length or short-pitched delivery.
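The final step of the abstract, locating the landing point from the ball's y-coordinate and labelling the delivery, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the tracker yields one ball-centre y-value per frame in image coordinates, that the bounce shows up as the first reversal in vertical direction, and that the crease positions are known in the same pixel coordinates. The function names, the crease parameters and the cutoff fractions (0.2 and 0.4) are hypothetical placeholders, and the linear pixel fraction ignores camera perspective.

```python
from typing import List, Optional


def find_bounce_index(ys: List[float]) -> Optional[int]:
    """Return the index of the frame where the ball's vertical motion first
    reverses direction, or None if no reversal is found.

    Assumption: the bounce is the first turning point of the y-coordinate.
    """
    prev_sign = 0
    for i in range(1, len(ys)):
        delta = ys[i] - ys[i - 1]
        sign = (delta > 0) - (delta < 0)  # +1, -1, or 0 for no movement
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                return i - 1  # the turning point is the previous frame
            prev_sign = sign
    return None


def classify_length(
    bounce_y: float,
    batsman_crease_y: float,
    bowler_crease_y: float,
    full_cutoff: float = 0.2,   # hypothetical threshold
    short_cutoff: float = 0.4,  # hypothetical threshold
) -> str:
    """Label a delivery by how far from the batsman's crease the ball lands,
    expressed as a fraction of the pitch length in pixels."""
    span = bowler_crease_y - batsman_crease_y
    if span == 0:
        raise ValueError("crease positions must differ")
    fraction = (bounce_y - batsman_crease_y) / span
    if fraction < full_cutoff:
        return "full-pitched"
    if fraction < short_cutoff:
        return "good-length"
    return "short-pitched"


# Example with made-up tracker output (pixels).
track = [100.0, 120.0, 140.0, 150.0, 140.0, 130.0]
bounce = find_bounce_index(track)
if bounce is not None:
    label = classify_length(track[bounce], batsman_crease_y=200.0, bowler_crease_y=0.0)
    print(bounce, label)
```

In practice the cutoffs would need to be calibrated against the real pitch geometry, and a perspective correction would be needed before pixel distances could be read as distances along the pitch.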