Human pose estimation (HPE) is a key building block for developing AI-based context-aware systems inside the operating room (OR). However, the 24/7 use of images from cameras mounted on the OR ceiling can raise privacy concerns, even when those images are depth images captured by RGB-D sensors. Being able to rely solely on low-resolution, privacy-preserving images would address these concerns and help scale the computer-assisted approaches that depend on such data to a larger number of ORs. In this paper, we introduce the problem of HPE on low-resolution depth images and propose an end-to-end solution that integrates a multi-scale super-resolution network with a 2D human pose estimation network. By exploiting intermediate feature maps generated at different super-resolution scales, our approach achieves body pose results on low-resolution images (of size 64x48) that are on par with those of an approach trained and tested on full-resolution images (of size 640x480).
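To make the resolution gap concrete, the following minimal sketch reduces a synthetic 640x480 depth map to 64x48, a factor of 10 per axis and therefore 100 times fewer pixels. The block-averaging scheme, the helper name `downsample_depth`, and the synthetic depth values are assumptions made for illustration; the abstract does not state how the low-resolution images are produced.

```python
def downsample_depth(depth, factor):
    """Block-average a 2D depth map (a list of rows) by an integer factor.

    Hypothetical helper: block averaging is an assumed downsampling scheme,
    not necessarily the one used to build the low-resolution inputs.
    """
    height, width = len(depth), len(depth[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by factor")
    out = []
    for by in range(0, height, factor):
        row = []
        for bx in range(0, width, factor):
            # Sum every pixel in the factor x factor block, then average.
            block_sum = 0.0
            for y in range(by, by + factor):
                block_sum += sum(depth[y][bx:bx + factor])
            row.append(block_sum / (factor * factor))
        out.append(row)
    return out


# Synthetic full-resolution depth map: 480 rows by 640 columns.
full = [[float(x + y) for x in range(640)] for y in range(480)]
low = downsample_depth(full, 10)

print(len(low[0]), len(low))  # prints: 64 48
```

Each low-resolution pixel here summarizes a 10x10 block of the original image. That lost fine detail is what makes such images more privacy-preserving, and what a super-resolution stage has to compensate for before pose estimation.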