ECCV22 最新54篇论文分方向整理｜包含Transformer、图像处理、人脸等（附下载）

2022 年 7 月 29 日 极市平台

↑ 点击蓝字关注极市平台

本文首发极市平台公众号，转载请获得授权并标明出处。

极市导读

最近一周，ECCV2022陆续放出了更多和GAN，Transformers相关的论文，为了让大家及时获取和学习到计算机视觉前沿技术，极市对最新一批论文进行整理（新增54篇），包括分研究方向的论文及代码汇总。 >>极市七夕粉丝福利活动：搞科研的日子是364天，但七夕只有一天！

ECCV 2022 论文分方向整理目前在极市社区持续更新中，已累计更新了108篇，项目地址：https://github.com/extreme-assistant/ECCV2022-Paper-Code-Interpretation

以下是本周更新的 ECCV 2022 论文，包含图像处理，Transformers，人脸，弱监督学习，模型训练泛化等方向。

点击 阅读原文 即可打包下载。

检测

2D目标检测

[1] Multimodal Object Detection via Probabilistic Ensembling (基于概率集成的多模态目标检测) ( Oral)

paper：https://arxiv.org/abs/2104.02904
code：https://github.com/Jamie725/RGBT-detection

3D目标检测

[1] Densely Constrained Depth Estimator for Monocular 3D Object Detection (用于单目 3D 目标检测的密集约束深度估计器)

paper：https://arxiv.org/abs/2207.10047
code：https://github.com/bravegroup/dcd

人物交互检测

[1] Discovering Human-Object Interaction Concepts via Self-Compositional Learning (通过自组合学习发现人-物交互概念)

paper：https://arxiv.org/abs/2203.14272
code：https://github.com/zhihou7/scl; https://github.com/zhihou7/HOI-CL

分割

实例分割

[1] In Defense of Online Models for Video Instance Segmentation (为视频实例分割的在线模型辩护) ( Oral)

paper：https://arxiv.org/abs/2207.10661
code：https://github.com/wjf5203/vnext

图像处理

超分辨率

[1] Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution (学习高效图像超分辨率的串并行查找表)

paper：https://arxiv.org/abs/2207.12987
code：https://github.com/zhjy2016/splut

[2] Efficient Meta-Tuning for Content-aware Neural Video Delivery (内容感知神经视频交付的高效元调整)

paper：https://arxiv.org/abs/2207.09691
code：https://github.com/neural-video-delivery/emt-pytorch-eccv2022

图像复原/图像增强/图像重建

[1] Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression (无监督夜间图像增强：当层分解遇到光效抑制时)

paper：https://arxiv.org/abs/2207.10564
code：https://github.com/jinyeying/night-enhancement

[2] Bringing Rolling Shutter Images Alive with Dual Reversed Distortion(通过双重反转失真使滚动快门图像重现) ( Oral)

paper：https://arxiv.org/abs/2203.06451
code：https://github.com/zzh-tech/dual-reversed-rs

[3] Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression (无监督夜间图像增强：当层分解遇到光效抑制时)

paper：https://arxiv.org/abs/2207.10564
code：https://github.com/jinyeying/night-enhancement

图像去阴影/去反射

[1] Deep Portrait Delighting (深度人像去光)

paper：https://arxiv.org/abs/2203.12088)

图像去噪

[1] Perceiving and Modeling Density is All You Need for Image Dehazing (感知和建模密度是图像去雾所需的全部) ( Oral)

paper：https://arxiv.org/abs/2111.09733
code：https://github.com/Owen718/Perceiving-and-Modeling-Density-is-All-You-Need-for-Image-Dehazing

[2] Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance (来自模糊的动画：具有运动引导的多模态模糊分解)

paper：https://arxiv.org/abs/2207.10123
code：https://github.com/zzh-tech/Animation-from-Blur

视频处理

视频修复

[1] Error Compensation Framework for Flow-Guided Video Inpainting (流引导视频修复的误差补偿框架)

paper：https://arxiv.org/abs/2207.10391

视频去模糊

[1] Event-guided Deblurring of Unknown Exposure Time Videos (未知曝光时间视频的事件引导去模糊) ( Oral)

paper：https://arxiv.org/abs/2112.06988

[2] Efficient Video Deblurring Guided by Motion Magnitude (由运动幅度引导的高效视频去模糊)

paper：https://arxiv.org/abs/2207.13374
code：https://github.com/sollynoay/mmp-rnn

图像&视频检索/视频理解

行为识别/行为识别/动作识别/检测/分割

[1] GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality (GaitEdge：超越普通的端到端步态识别，提高实用性)

paper：https://arxiv.org/abs/2203.03972
code：https://github.com/shiqiyu/opengait

[2] Collaborating Domain-shared and Target-specific Feature Clustering for Cross-domain 3D Action Recognition (用于跨域 3D 动作识别的协作域共享和特定于目标的特征聚类)

paper：https://arxiv.org/abs/2207.09767
code：https://github.com/canbaoburen/CoDT

行人重识别/检测

[1] PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification(PASS：用于人员重新识别的部分感知自我监督预训练)

paper：https://arxiv.org/abs/2203.03931
code：https://github.com/casia-iva-lab/pass-reid

图像/视频检索

[1] Feature Representation Learning for Unsupervised Cross-domain Image Retrieval (无监督跨域图像检索的特征表示学习)

paper：https://arxiv.org/abs/2207.09721
code：https://github.com/conghuihu/ucdir

[2] LocVTP: Video-Text Pre-training for Temporal Localization (LocVTP：时间定位的视频文本预训练)

paper：https://arxiv.org/abs/2207.10362
code：https://github.com/mengcaopku/locvtp

估计

视觉定位/位姿估计

[1] 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal (通过手部去遮挡和移除的 3D 交互手部姿势估计)

paper：https://arxiv.org/abs/2207.11061
code：https://github.com/menghao666/hdr

[2] Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration (基于隐式空间校准的 Transformer 的弱监督目标定位)

paper：https://arxiv.org/abs/2207.10447
code：https://github.com/164140757/scm

人脸

[1] Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation (通过场景消歧实现种族无偏肤色估计)

paper：https://arxiv.org/abs/2205.03962
code：https://trust.is.tue.mpg.de/

[2] MoFaNeRF: Morphable Facial Neural Radiance Field (MoFaNeRF：可变形面部神经辐射场)

paper：https://arxiv.org/abs/2112.02308
code：https://github.com/zhuhao-nju/mofanerf

三维视觉

三维重建

[1] DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras (DiffuStereo：使用稀疏相机通过基于扩散的立体进行高质量人体重建)

paper：https://arxiv.org/abs/2207.08000

场景重建/视图合成/新视角合成

[1] Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields (Sem2NeRF：将单视图语义掩码转换为神经辐射场)

paper：https://arxiv.org/abs/2203.10821
code：https://github.com/donydchen/sem2nerf

目标跟踪

[1] Tracking Every Thing in the Wild (追踪野外的每一件事)

paper：https://arxiv.org/abs/2207.12978

文本检测/识别/理解

[1] Contextual Text Block Detection towards Scene Text Understanding (面向场景文本理解的上下文文本块检测)

paper：https://arxiv.org/abs/2207.12955

[2] PromptDet: Towards Open-vocabulary Detection using Uncurated Images (PromptDet：使用未经处理的图像进行开放词汇检测)

paper：https://arxiv.org/abs/2203.16513
code：https://github.com/fcjian/PromptDet

[3] End-to-End Video Text Spotting with Transformer (使用 Transformer 的端到端视频文本定位) ( Oral)

paper：https://arxiv.org/abs/2203.10539
code：https://github.com/weijiawu/transdetr

GAN/生成式/对抗式

[1] Learning Energy-Based Models With Adversarial Training (通过对抗训练学习基于能量的模型)

paper：https://arxiv.org/abs/2012.06568
code：https://github.com/xuwangyin/AT-EBMs

[2] Adaptive Image Transformations for Transfer-based Adversarial Attack (基于传输的对抗性攻击的自适应图像转换)

paper：https://arxiv.org/abs/2111.13844

[3] Generative Multiplane Images: Making a 2D GAN 3D-Aware (生成多平面图像：让一个2D GAN变得3D感知)

paper：https://arxiv.org/abs/2207.10642
code：https://github.com/apple/ml-gmpi

图像生成/图像合成

[1] PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation (PixelFolder：用于图像生成的高效渐进式像素合成网络)

paper：https://arxiv.org/abs/2204.00833
code：https://github.com/blinghe/pixelfolder

视觉预测

[1] D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights (D2-TPred：交通灯下轨迹预测的不连续依赖)

paper：https://arxiv.org/abs/2207.10398
code：https://github.com/vtp-tl/d2-tpred

神经网络结构设计

DNN

[1] Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips (使用 Bit Flips 对神经网络进行难以察觉的特洛伊木马攻击)

paper：https://arxiv.org/abs/2207.13417
code：https://github.com/jiawangbai/hpt

Transformer

[1] Improving Vision Transformers by Revisiting High-frequency Components (通过重新审视高频组件来改进视觉变压器)

paper：https://arxiv.org/abs/2204.00993
code：https://github.com/jiawangbai/HAT

[2] Transformer with Implicit Edges for Particle-based Physics Simulation (用于基于粒子的物理模拟的隐式边缘变压器)

paper：https://arxiv.org/abs/2207.10860
code：https://github.com/ftbabi/tie_eccv2022

[3] ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer (ScalableViT：重新思考 Vision Transformer 面向上下文的泛化)

paper：https://arxiv.org/abs/2203.10790
code：https://github.com/yangr116/scalablevit

[4] Visual Prompt Tuning (视觉提示调整)

paper：https://arxiv.org/abs/2203.12119
code：https://github.com/KMnP/vpt

图像特征提取与匹配

[1] Unsupervised Deep Multi-Shape Matching (无监督深度多形状匹配)

paper：https://arxiv.org/abs/2207.09610

视觉表征学习

[1] Object-Compositional Neural Implicit Surfaces (对象组合神经隐式曲面)

paper：https://arxiv.org/abs/2207.09686
code：https://github.com/qianyiwu/objsdf

模型训练/泛化

长尾分布

[1] Long-tailed Instance Segmentation using Gumbel Optimized Loss (使用 Gumbel 优化损失的长尾实例分割)

paper：https://arxiv.org/abs/2207.10936
code：https://github.com/kostas1515/gol)

[2] Identifying Hard Noise in Long-Tailed Sample Distribution (识别长尾样本分布中的硬噪声) ( Oral)

paper：https://arxiv.org/abs/2207.13378
code：https://github.com/yxymessi/h2e-framework

模型压缩

知识蒸馏

[1] Prune Your Model Before Distill It (在蒸馏之前修剪你的模型)

paper：https://arxiv.org/abs/2109.14960
code：https://github.com/ososos888/prune-then-distill

[2] Efficient One Pass Self-distillation with Zipf's Label Smoothing (使用 Zipf 的标签平滑实现高效的单程自蒸馏)

paper：https://arxiv.org/abs/2207.12980
code：https://github.com/megvii-research/zipfls

半监督学习/弱监督学习/无监督学习/自监督学习

[1] Acknowledging the Unknown for Multi-label Learning with Single Positive Labels (用单个正标签承认未知的多标签学习)

paper：https://arxiv.org/abs/2203.16219
code：https://github.com/correr-zhou/spml-acktheunknown

[2] W2N:Switching From Weak Supervision to Noisy Supervision for Object Detection (W2N：目标检测从弱监督切换到嘈杂监督)

paper：https://arxiv.org/abs/2207.12104
code：https://github.com/1170300714/w2n_wsod

[3] CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation (CA-SSL：用于检测和分割的与类别无关的半监督学习)

paper：https://arxiv.org/abs/2112.04966
code：https://github.com/dvlab-research/Entity

多模态学习/跨模态

视觉-语言

[1] Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting (语言问题：用于场景文本检测和识别的弱监督视觉语言预训练方法) ( Oral)

paper：https://arxiv.org/abs/2203.03911

小样本学习/零样本学习

[1] Worst Case Matters for Few-Shot Recognition (最坏情况对少数镜头识别很重要)

paper：https://arxiv.org/abs/2203.06574
code：https://github.com/heekhero/ACSR

持续学习

[1] Balancing Stability and Plasticity through Advanced Null Space in Continual Learning (通过持续学习中的高级零空间平衡稳定性和可塑性) ( Oral)

paper：https://arxiv.org/abs/2207.12061

[2] Online Continual Learning with Contrastive Vision Transformer (使用对比视觉转换器进行在线持续学习)

paper：https://arxiv.org/abs/2207.13516

模仿学习

[1] Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction (通过残差动作预测解决视觉模仿学习中的模仿问题)

paper：https://arxiv.org/abs/2207.09705

公众号后台回复“ECCV2022”获取论文分类资源下载～

‍

△点击卡片关注极市平台，获取最新CV干货

‍

极市干货

算法项目： CV工业项目落地实战｜目标检测算法上新！（年均分成5万）

实操教程：Pytorch - 弹性训练原理分析｜《CUDA C 编程指南》导读

极视角动态：极视角作为重点项目入选「2022青岛十大资本青睐企业」榜单！｜极视角发布EQP激励计划，招募优质算法团队展开多维度生态合作！｜极市AI校园大使招募

“

点击阅读原文进入CV社区

收获更多技术干货

登录查看更多

相关内容

代码

关注 10

代码（Code）是专知网的一个重要知识资料文档板块，旨在整理收录论文源代码、复现代码，经典工程代码等，便于用户查阅下载使用。

近期必读的5篇顶会ICCV 2021【语义分割】相关论文和代码

专知会员服务

43+阅读 · 2021年8月20日

近期必读的6篇顶会CVPR 2021【零样本学习（ZSL）】相关论文和代码

专知会员服务

32+阅读 · 2021年6月30日

近期必读的5篇顶会CVPR 2021【图像/视频描述生成】相关论文和代码

专知会员服务

48+阅读 · 2021年4月25日

近期必读的5篇顶会CVPR 2021【视频理解】相关论文和代码

专知会员服务

38+阅读 · 2021年3月31日

近期必读的5篇顶会CVPR 2021【行为识别】相关论文和代码

专知会员服务

60+阅读 · 2021年3月17日

近期必读的5篇顶会CVPR 2021【视觉目标检测】相关论文和代码

专知会员服务

57+阅读 · 2021年3月9日

近期必读的六篇计算机视觉顶会ECCV 2020【目标检测】相关论文

专知会员服务

59+阅读 · 2020年7月7日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

ECCV22 最新54篇论文分方向整理｜包含目标检测、图像分割、监督学习等（附下载）

极市平台

1+阅读 · 2022年7月20日

CVPR'22 最新132篇论文分方向整理｜包含目标检测、图像处理、医学影像等28个方向

极市平台

1+阅读 · 2022年3月25日

CVPR'22 最新106篇论文分方向整理｜包含目标检测、动作识别、图像处理等32个方向

极市平台

0+阅读 · 2022年3月18日

CVPR 2022 最新 65 篇论文分方向整理｜包含目标检测、动作识别、人群计数等方向（附打包下载）

极市平台

0+阅读 · 2022年3月10日

CVPR2019| 05-22更新11篇论文及代码合集（含6篇oral，语义分割/车道线检测/视觉导航等）

极市平台

16+阅读 · 2019年5月22日

CVPR2019| 05-07更新14篇论文及代码合集（1篇oral，含目标检测/视频分割/目标跟踪等）

极市平台

22+阅读 · 2019年5月7日

CVPR2019| 04-30更新23篇论文及代码合集（3篇oral，含3D目标检测/语义分割/动作识别等）

极市平台

29+阅读 · 2019年4月30日

CVPR2019| 04-24更新12篇论文及代码（位姿估计/自动驾驶/GAN/图像生成等）

极市平台

11+阅读 · 2019年4月24日

CVPR2019| 04-17更新17篇论文及代码（目标检测、语义分割、损失函数、姿态估计等）

极市平台

24+阅读 · 2019年4月17日

CVPR2019论文抢鲜看！20篇涵盖目标检测、跨模态、视频处理、语义分割和姿态估计等方向论文

专知

30+阅读 · 2019年3月3日

数据驱动的人体图像语义分割研究

国家自然科学基金

4+阅读 · 2014年12月31日

医学图像范例先验构造与虚拟多模态成像方法研究

国家自然科学基金

2+阅读 · 2014年12月31日

非受控场景下融合结构与纹理信息的人脸人耳多模态识别

国家自然科学基金

0+阅读 · 2014年12月31日

迁移学习在图像分类中的应用研究

国家自然科学基金

8+阅读 · 2013年12月31日

云环境中基于三维世界模型的图像表示与压缩

国家自然科学基金

2+阅读 · 2013年12月31日

基于多尺度各向异性方向导数核的图象角点检测和分类理论与方法

国家自然科学基金

0+阅读 · 2012年12月31日

基于特征约束的三维光流模型的扩散张量图像配准研究

国家自然科学基金

0+阅读 · 2012年12月31日

视频三维立体化技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

数据驱动的彩色图像颜色空间建模与去噪

国家自然科学基金

1+阅读 · 2012年12月31日

基于双语文档反馈的跨语言信息检索研究

国家自然科学基金

0+阅读 · 2008年12月31日

DALLE-URBAN: Capturing the urban design expertise of large text to image transformers

Arxiv

0+阅读 · 2022年10月3日

BiT: Robustly Binarized Multi-distilled Transformer

Arxiv

0+阅读 · 2022年10月2日

TransCenter: Transformers with Dense Representations for Multiple-Object Tracking

Arxiv

0+阅读 · 2022年9月30日

Husformer: A Multi-Modal Transformer for Multi-Modal Human State Recognition

Arxiv

0+阅读 · 2022年9月30日

Transformer Tracking

Arxiv

17+阅读 · 2021年3月29日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Arxiv

15+阅读 · 2020年2月28日

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

Arxiv

19+阅读 · 2020年2月15日

Domain Representation for Knowledge Graph Embedding

Arxiv

14+阅读 · 2019年9月11日

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Arxiv

11+阅读 · 2018年7月16日

A Robust Real-Time Automatic License Plate Recognition based on the YOLO Detector

Arxiv

13+阅读 · 2018年3月1日

VIP会员