March 17, 2023

Toward Super-Resolution for Appearance-Based Gaze Estimation

Galen O'Shea, Majid Komeili

Gaze tracking is a valuable tool with applications in many fields, including medicine, psychology, virtual reality, marketing, and safety. It is therefore essential to have gaze tracking software that is both cost-efficient and high-performing. Accurately predicting gaze remains difficult, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze tracking. We show that not all SR models preserve the gaze direction. We propose a two-step framework based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze prediction. Self-supervised learning aims to learn from unlabeled data in order to reduce the amount of labeled data required for downstream tasks. We propose a novel architecture called SuperVision, which fuses an SR backbone network with a ResNet18 (with some skip connections). The proposed SuperVision method uses 5x less labeled data yet outperforms, by 15%, the state-of-the-art GazeTR method, which uses 100% of the training data.
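The two-step idea in the abstract (super-resolve first, then estimate gaze) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `upscale_nearest` is a trivial nearest-neighbour stand-in for a learned SR model such as SwinIR, `two_step_gaze` is a hypothetical wrapper, and the pitch/yaw-to-vector convention is one commonly used in gaze datasets. The angular error function shows one plausible way to check whether an SR step preserves gaze direction, by comparing gaze estimated before and after super-resolution; the abstract does not state which metric the authors use.

```python
import math
from typing import Callable, List, Tuple

Image = List[List[float]]      # grayscale image as rows of pixel values
Gaze = Tuple[float, float]     # (pitch, yaw) in radians


def upscale_nearest(img: Image, scale: int) -> Image:
    """Stand-in for step 1. A real pipeline would call a learned SR model here."""
    return [
        [px for px in row for _ in range(scale)]   # repeat each pixel horizontally
        for row in img
        for _ in range(scale)                      # repeat each row vertically
    ]


def two_step_gaze(
    img: Image,
    super_resolve: Callable[[Image], Image],
    estimate_gaze: Callable[[Image], Gaze],
) -> Gaze:
    """Hypothetical wrapper: step 1 restores the image, step 2 predicts gaze."""
    return estimate_gaze(super_resolve(img))


def gaze_to_vector(pitch: float, yaw: float) -> Tuple[float, float, float]:
    """Convert (pitch, yaw) to a unit 3D direction (assumed convention)."""
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )


def angular_error_deg(a: Gaze, b: Gaze) -> float:
    """Angle in degrees between two gaze directions."""
    va, vb = gaze_to_vector(*a), gaze_to_vector(*b)
    dot = sum(p * q for p, q in zip(va, vb))
    dot = max(-1.0, min(1.0, dot))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))
```

In use, `estimate_gaze` would be a trained gaze network and `super_resolve` a trained SR model; a large `angular_error_deg` between the gaze predicted on a reference image and on its super-resolved counterpart would suggest that the SR model has altered the apparent gaze direction.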
