The Use of Video Captioning for Fostering Physical Activity

Video captioning is considered one of the most challenging problems in computer vision. It combines several deep learning models to perform object detection, action detection, and localization over a sequence of image frames, and it must account for the order of actions in a video to generate a meaningful description of the overall event. A reliable, accurate, real-time video captioning method has many applications; this paper focuses on one: fostering and facilitating physical activity. In broad terms, the work can be regarded as assistive technology. Physical inactivity appears to be increasingly widespread in many nations for many reasons, the most important being the convenience that technology has brought to workplaces. The resulting sedentary lifestyle is becoming a significant public health issue, so it is essential to incorporate more physical movement into daily life. Tracking one's daily physical activities provides a baseline for comparison with activities performed on subsequent days. With this in mind, the paper proposes a video captioning framework that describes the activities in a video and estimates a person's daily physical activity level. Such a framework could help people track their daily movements and thereby reduce the health risks of an inactive lifestyle. The work is still in its infancy, and only the initial steps of the application are outlined here. Based on our preliminary research, we believe the project has great merit.
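
To make the idea of estimating a daily activity level from captions more concrete, the following is a minimal sketch and not the authors' method. It assumes that some captioning model has already produced one caption per video segment, and it sums the duration of segments whose captions mention an active movement. The `CaptionedSegment` type, the `ACTIVE_KEYWORDS` list, and the keyword-matching rule are all hypothetical choices made for illustration; the abstract does not specify how the activity level is computed.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

# Hypothetical keyword stems; the paper does not define activity categories.
ACTIVE_KEYWORDS = ("walk", "run", "jog", "climb", "cycl", "stretch")


@dataclass
class CaptionedSegment:
    """One video segment plus the caption a captioning model produced for it."""
    day: date
    caption: str
    duration_s: float  # segment length in seconds


def is_active(caption: str) -> bool:
    """Crude substring check: does the caption mention an active movement?"""
    text = caption.lower()
    return any(keyword in text for keyword in ACTIVE_KEYWORDS)


def daily_active_minutes(segments):
    """Sum, per day, the minutes of segments whose captions look active.

    Days that contain only sedentary captions are reported with 0.0 minutes.
    """
    totals = defaultdict(float)
    for seg in segments:
        minutes = seg.duration_s / 60.0 if is_active(seg.caption) else 0.0
        totals[seg.day] += minutes
    return dict(totals)


if __name__ == "__main__":
    example = [
        CaptionedSegment(date(2021, 1, 1), "A person is walking up the stairs", 120),
        CaptionedSegment(date(2021, 1, 1), "A person is sitting at a desk", 600),
        CaptionedSegment(date(2021, 1, 2), "A man sits on a sofa", 300),
    ]
    for day, minutes in sorted(daily_active_minutes(example).items()):
        print(day.isoformat(), f"{minutes:.1f} active minutes")
```

Substring matching is deliberately simplistic and will misfire on unrelated words that contain a keyword stem. A real system would more plausibly map captions to activity classes with a trained classifier, but per-day totals of this kind are what would allow one day to be compared with the next.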
