Yoga is a globally acclaimed and widely recommended practice for healthy living. Maintaining correct posture while performing a Yogasana is of utmost importance. In this work, we employ transfer learning from human pose estimation models to extract 136 key-points spread over the whole body, and use them to train a Random Forest classifier which is used to estimate the Yogasanas. The results are evaluated on an extensive in-house yoga video database of 51 subjects recorded from 4 different camera angles. We propose a three-step scheme for evaluating the generalizability of a Yoga classifier by testing it on 1) unseen frames, 2) unseen subjects, and 3) unseen camera angles. We argue that for most applications, validation accuracies on unseen subjects and unseen camera angles are the most important. We empirically analyze, over three public datasets, the advantage of transfer learning and the possibilities of target leakage. We further demonstrate that classification accuracies depend critically on the cross-validation method employed and can often be misleading. To promote further research, we have made the key-points dataset and code publicly available.
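The three-step evaluation scheme can be illustrated with a minimal sketch of how the train/test splits might be built. This is not the authors' released code: the `Frame` record, the helper names, the number of frames per video, and the 20% test fraction are all assumptions made for illustration. Only the 51 subjects and 4 camera angles come from the abstract. The sketch shows why a random frame-level split lets the same subject appear on both sides (one possible source of target leakage), while subject-level and camera-level splits do not.

```python
import random
from collections import namedtuple

# Hypothetical record layout: one entry per video frame.
Frame = namedtuple("Frame", ["subject", "camera", "index"])


def make_frames(n_subjects=51, n_cameras=4, frames_per_video=5):
    """Build a toy frame index; frames_per_video is an assumed value."""
    return [
        Frame(s, c, i)
        for s in range(n_subjects)
        for c in range(n_cameras)
        for i in range(frames_per_video)
    ]


def split_unseen_frames(frames, test_fraction=0.2, seed=0):
    """Step 1: random frame-level split. Subjects and cameras are shared."""
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_unseen_subjects(frames, test_fraction=0.2, seed=0):
    """Step 2: hold out whole subjects, so no test subject is seen in training."""
    rng = random.Random(seed)
    subjects = sorted({f.subject for f in frames})
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    held_out = set(subjects[:n_test])
    train = [f for f in frames if f.subject not in held_out]
    test = [f for f in frames if f.subject in held_out]
    return train, test


def split_unseen_cameras(frames, held_out_camera):
    """Step 3: hold out one camera angle entirely."""
    train = [f for f in frames if f.camera != held_out_camera]
    test = [f for f in frames if f.camera == held_out_camera]
    return train, test


if __name__ == "__main__":
    frames = make_frames()
    for name, (train, test) in {
        "unseen frames": split_unseen_frames(frames),
        "unseen subjects": split_unseen_subjects(frames),
        "unseen cameras": split_unseen_cameras(frames, held_out_camera=3),
    }.items():
        shared = {f.subject for f in train} & {f.subject for f in test}
        print(f"{name}: train={len(train)} test={len(test)} "
              f"shared subjects={len(shared)}")
```

A classifier trained and scored on each of these splits would give three separate accuracy figures. Under the abstract's argument, the figures from the second and third splits are the ones that best reflect performance on new users and new camera setups.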