Daily Paper Digest: Computer Vision (November 19 Update)

Author: Hsu
Reposted from: 深度学习这件小事

Computer Vision (November 19 Update)

[1] Simple but Effective: CLIP Embeddings for Embodied AI
Authors | Apoorv Khandelwal, Luca Weihs, Roozbeh Mottaghi, Aniruddha Kembhavi
Link | https://arxiv.org/abs/2111.09888

[2] PyTorchVideo: A Deep Learning Library for Video Understanding
Authors | Haoqi Fan, Tullie Murrell, Heng Wang, et al.
Link | https://arxiv.org/abs/2111.09887
Comments | Technical report

[3] SimMIM: A Simple Framework for Masked Image Modeling
Authors | Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu
Link | https://arxiv.org/abs/2111.09886

[4] Swin Transformer V2: Scaling Up Capacity and Resolution
Authors | Ze Liu, Han Hu, Yutong Lin, et al.
Link | https://arxiv.org/abs/2111.09883

[5] Restormer: Efficient Transformer for High-Resolution Image Restoration
Authors | Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang
Link | https://arxiv.org/abs/2111.09881

[6] One-Shot Generative Domain Adaptation
Authors | Ceyuan Yang, Yujun Shen, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Zhirong Wu, Bolei Zhou
Link | https://arxiv.org/abs/2111.09876
Comments | Technical Report

[7] Postdisaster image-based damage detection and repair cost estimation of reinforced concrete buildings using dual convolutional neural networks
Authors | Xiao Pan, T.Y. Yang
Link | https://arxiv.org/abs/2111.09862

[8] Edge-preserving Domain Adaptation for semantic segmentation of Medical Images
Authors | Thong Vo, Naimul Khan
Link | https://arxiv.org/abs/2111.09847

[9] TransMix: Attend to Mix for Vision Transformers
Authors | Jie-Neng Chen, Shuyang Sun, Ju He, Philip Torr, Alan Yuille, Song Bai
Link | https://arxiv.org/abs/2111.09833
Project link | https://github.com/Beckschen/TransMix

[10] LiDAR Cluster First and Camera Inference Later: A New Perspective Towards Autonomous Driving
Authors | Jiyang Chen, Simon Yu, Rohan Tabish, Ayoosh Bansal, Shengzhong Liu, Tarek Abdelzaher, Lui Sha
Link | https://arxiv.org/abs/2111.09799

[11] Boosting Supervised Learning Performance with Co-training
Authors | Xinnan Du, William Zhang, Jose M. Alvarez
Link | https://arxiv.org/abs/2111.09797
Comments | 2021 IEEE Intelligent Vehicles Symposium

[12] Wiggling Weights to Improve the Robustness of Classifiers
Authors | Sadaf Gulshad, Ivan Sosnovik, Arnold Smeulders
Link | https://arxiv.org/abs/2111.09779
Comments | arXiv admin note: text overlap with arXiv:2103.11372, arXiv:2107.09391

[13] The Way to my Heart is through Contrastive Learning: Remote Photoplethysmography from Unlabelled Video
Authors | John Gideon, Simon Stent
Link | https://arxiv.org/abs/2111.09748
Project link | https://github.com/ToyotaResearchInstitute/RemotePPG

[14] Interactive segmentation using U-Net with weight map and dynamic user interactions
Authors | Ragavie Pirabaharan, Naimul Khan
Link | https://arxiv.org/abs/2111.09740

[15] ClipCap: CLIP Prefix for Image Captioning
Authors | Ron Mokady, Amir Hertz, Amit H. Bermano
Link | https://arxiv.org/abs/2111.09734

[16] Perceiving and Modeling Density is All You Need for Image Dehazing
Authors | Tian Ye, Mingchao Jiang, Yunchen Zhang, Liang Chen, Erkang Chen, Pen Chen, Zhiyong Lu
Link | https://arxiv.org/abs/2111.09733

[17] SUB-Depth: Self-distillation and Uncertainty Boosting Self-supervised Monocular Depth Estimation
Authors | Hang Zhou, Sarah Taylor, David Greenwood
Link | https://arxiv.org/abs/2111.09692

[18] Evaluating Transformers for Lightweight Action Recognition
Authors | Raivo Koot, Markus Hennerbichler, Haiping Lu
Link | https://arxiv.org/abs/2111.09641
Comments | pre-print

[19] Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy
Authors | Thibault Castells, Seul-Ki Yeom
Link | https://arxiv.org/abs/2111.09635
Comments | 10 pages, 4 figures, 4 tables, under review in CVPR2022

[20] IMFNet: Interpretable Multimodal Fusion for Point Cloud Registration
Authors | Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao
Link | https://arxiv.org/abs/2111.09624
Comments | Technical report

[21] SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking
Authors | Ziqi Pang, Zhichao Li, Naiyan Wang
Link | https://arxiv.org/abs/2111.09621

[22] Robust Person Re-identification with Multi-Modal Joint Defence
Authors | Yunpeng Gong, Lifei Chen
Link | https://arxiv.org/abs/2111.09571

[23] Adaptive Shrink-Mask for Text Detection
Authors | Chuang Yang, Mulin Chen, Yuan Yuan, Qi Wang, Xuelong Li
Link | https://arxiv.org/abs/2111.09560

[24] Deep neural networks-based denoising models for CT imaging and their efficacy
Authors | Prabhat KC, Rongping Zeng, M. Mehdi Farhangi, Kyle J. Myers
Link | https://arxiv.org/abs/2111.09539
Comments | 13 pages, 9 figures, SPIE proceeding

[25] Learning Modified Indicator Functions for Surface Reconstruction
Authors | Dong Xiao, Siyou Lin, Zuoqiang Shi, Bin Wang
Link | https://arxiv.org/abs/2111.09526
Comments | Accepted by Computers & Graphics from SMI 2021

[26] RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation
Authors | Yantao Lu, Xuetao Hao, Shiqi Sun, Weiheng Chai, Muchenxuan Tong, Senem Velipasalar
Link | https://arxiv.org/abs/2111.09515

[27] Blind VQA on 360° Video via Progressively Learning from Pixels, Frames and Video
Authors | Li Yang, Mai Xu, Shengxi Li, Yichen Guo, Zulin Wang
Link | https://arxiv.org/abs/2111.09503
Comments | Under review

[28] Dynamically pruning segformer for efficient semantic segmentation
Authors | Haoli Bai, Hongda Mao, Dinesh Nair
Link | https://arxiv.org/abs/2111.09499

[29] Developing a Machine Learning Algorithm-Based Classification Models for the Detection of High-Energy Gamma Particles
Authors | Emmanuel Dadzie, Kelvin Kwakye
Link | https://arxiv.org/abs/2111.09496

[30] Reference-based Magnetic Resonance Image Reconstruction Using Texture Transformer
Authors | Pengfei Guo, Vishal M. Patel
Link | https://arxiv.org/abs/2111.09492

[31] 3D Lip Event Detection via Interframe Motion Divergence at Multiple Temporal Resolutions
Authors | Jie Zhang, Robert B. Fisher
Link | https://arxiv.org/abs/2111.09485

[32] Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes
Authors | Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong
Link | https://arxiv.org/abs/2111.09452

[33] Efficient deep learning models for land cover image classification
Authors | Ioannis Papoutsis, Nikolaos-Ioannis Bountos, Angelos Zavras, Dimitrios Michail, Christos Tryfonopoulos
Link | https://arxiv.org/abs/2111.09451
Comments | This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

[34] See Eye to Eye: A Lidar-Agnostic 3D Detection Framework for Unsupervised Multi-Target Domain Adaptation
Authors | Darren Tsai, Julie Stephany Berrio, Mao Shan, Stewart Worrall, Eduardo Nebot
Link | https://arxiv.org/abs/2111.09450

[35] Rethinking Drone-Based Search and Rescue with Aerial Person Detection
Authors | Pasi Pyrrö, Hassan Naseri, Alexander Jung
Link | https://arxiv.org/abs/2111.09406
Comments | 10 pages, 5 figures, 3 tables, 1 algorithm

[36] Fine-Grained Vehicle Classification in Urban Traffic Scenes using Deep Learning
Authors | Syeda Aneeba Najeeb, Rana Hammad Raza, Adeel Yusuf, Zamra Sultan
Link | https://arxiv.org/abs/2111.09403

[37] DeepCurrents: Learning Implicit Representations of Shapes with Boundaries
Authors | David Palmer, Dmitriy Smirnov, Stephanie Wang, Albert Chern, Justin Solomon
Link | https://arxiv.org/abs/2111.09383

[38] MPF6D: Masked Pyramid Fusion 6D Pose Estimation
Authors | Nuno Pereira, Luís A. Alexandre
Link | https://arxiv.org/abs/2111.09378

[39] Temporally Consistent Online Depth Estimation in Dynamic Scenes
Authors | Zhaoshuo Li, Wei Ye, Dilin Wang, Francis X. Creighton, Russell H. Taylor, Ganesh Venkatesh, Mathias Unberath
Link | https://arxiv.org/abs/2111.09337

[40] Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
Authors | Christopher Hoang, Sungryull Sohn, Jongwook Choi, Wilka Carvalho, Honglak Lee
Link | https://arxiv.org/abs/2111.09858
Project link | https://2016choang.github.io/sfl
Comments | NeurIPS 2021.

[41] Exploring the Limits of Epistemic Uncertainty Quantification in Low-Shot Settings
Authors | Matias Valdenegro-Toro
Link | https://arxiv.org/abs/2111.09808
Comments | 7 pages, 3 figures, with supplementary material. LatinX in AI Research Workshop @ NeurIPS 2021

[42] Unsupervised Online Learning for Robotic Interestingness with Visual Memory
Authors | Chen Wang, Yuheng Qiu, Wenshan Wang, Yafei Hu, Seungchan Kim, Sebastian Scherer
Link | https://arxiv.org/abs/2111.09793
Comments | Accepted to The IEEE Transactions on Robotics (T-RO). arXiv admin note: substantial text overlap with arXiv:2005.08829

[43] A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image Restoration
Authors | Théo Bodrito, Alexandre Zouaoui, Jocelyn Chanussot, Julien Mairal
Link | https://arxiv.org/abs/2111.09708

[44] Visual design intuition: Predicting dynamic properties of beams from raw cross-section images
Authors | Philippe M. Wyder, Hod Lipson
Link | https://arxiv.org/abs/2111.09701
Comments | Accepted for publication in Journal Of The Royal Society Interface

[45] Casting graph isomorphism as a point set registration problem using a simplex embedding and sampling
Authors | Yigit Oktar
Link | https://arxiv.org/abs/2111.09696

[46] Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
Authors | Tassadaq Hussain, Mandar Gogate, Kia Dashtipour, Amir Hussain
Link | https://arxiv.org/abs/2111.09642
Comments | 6 pages, 4 figures

[47] Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction
Authors | George Yiasemis, Clara I. Sánchez, Jan-Jakob Sonke, Jonas Teuwen
Link | https://arxiv.org/abs/2111.09639

[48] Improving Transferability of Representations via Augmentation-Aware Self-Supervision
Authors | Hankook Lee, Kibok Lee, Kimin Lee, Honglak Lee, Jinwoo Shin
Link | https://arxiv.org/abs/2111.09613
Comments | Accepted to NeurIPS 2021

[49] Lidar with Velocity: Motion Distortion Correction of Point Clouds from Oscillating Scanning Lidars
Authors | Wen Yang, Zheng Gong, Baifu Huang, Xiaoping Hong
Link | https://arxiv.org/abs/2111.09497

[50] Self-Attending Task Generative Adversarial Network for Realistic Satellite Image Creation
Authors | Nathan Toner, Justin Fletcher
Link | https://arxiv.org/abs/2111.09463
Comments | to be published in IEEE Aerospace 2022

[51] Large-scale Building Height Retrieval from Single SAR Imagery based on Bounding Box Regression Networks
Authors | Yao Sun, Lichao Mou, Yuanyuan Wang, Sina Montazeri, Xiao Xiang Zhu
Link | https://arxiv.org/abs/2111.09460

[52] Low Precision Decentralized Distributed Training with Heterogeneous Data
Authors | Sai Aparna Aketi, Sangamesh Kodge, Kaushik Roy
Link | https://arxiv.org/abs/2111.09389
