【超全】CVPR 2018 收录论文所有标题列表

2018 年 5 月 27 日 新智元





  新智元推荐  

本文来源于公众号CVer和专知的整理


【新智元导读】计算机视觉最具影响力的学术会议之一的 IEEE CVPR 将于 2018 年 6 月 18 日 - 22 日在美国盐湖城召开举行。据 CVPR 官网显示,今年大会有超过 3300 篇论文投稿,其中录取 979 篇;相比去年 783 篇论文,今年增长了近 25%。本文将介绍 CVPR 2018 所有录用论文的标题, 包括每篇论文属于 oral, spotlight 还是 poster 的情况。



本文将介绍 CVPR 2018 所有录用论文的标题, 包括每篇论文属于 oral, spotlight 还是 poster 的情况。大家可以根据论文的标题去 google/baidu,即可以找到相关 pdf/github/homepage 链接。



Amusi 已经将 CVPR 2018 所有论文清单上传到 daily-paper-computer-vision 上,大家直接点击文末的 “阅读全文”,即可访问 daily-paper-computer-vision,下载 cvpr2018-paper-list.csv。


link: 

https://github.com/amusi/daily-paper-computer-vision/blob/master/2018/cvpr2018-paper-list.csv


CVPR 2018概览


CVPR 是 IEEE Conference on Computer Vision and Pattern Recognition 的缩写,即 IEEE 国际计算机视觉与模式识别会议。该会议是由 IEEE 举办的计算机视觉和模式识别领域的顶级会议。


会议的主要内容是计算机视觉与模式识别技术。CVPR 是世界顶级的计算机视觉会议(三大顶会之一,另外两个是 ICCV 和 ECCV)。本会议每年都会有固定的研讨主题,而每一年都会有公司赞助该会议并获得在会场展示的机会。


CVPR 有着较为严苛的录用标准,会议整体的录取率通常不超过 30%,而口头报告的论文比例更是不高于 5%。而会议的组织方是一个循环的志愿群体,通常在某次会议召开的三年之前通过遴选产生。CVPR 的审稿一般是双盲的,也就是说会议的审稿与投稿方均不知道对方的信息。通常某一篇论文需要由三位审稿者进行审读。最后再由会议的领域主席 (area chair) 决定论文是否可被接收。


CVPR 2018


上面简单介绍了 CVPR ,其重要性不言而喻。而本文的重点,也是各位童鞋关注的焦点就在于 CVPR 2018。我们先看一组数据:979/3303 ~= 29.6%,该数据是指 CVPR 2018 论文的收录比。


之前在知乎和各个新闻平台上都看到了 CVPR 2018 list,但都是一组纯序号,既没有属性也没有论文标题。机(wu)智(nai)的童鞋也只能去 arXiv 上 follow 最新的 paper,如果能遇见带有 CVPR 2018 标志的 paper,相信内心还有点小激动呢。


Amusi 在对知识的不断追求中,发现了 CVPR 2018 所有收录论文的名单,既包含了序号,也包含了属性(oral、spotlight 或 poster)以及最最最重要的论文标题!


有了论文标题,真的就可以为所欲为~


打开 cvpr2018-paper-list.csv,按下 crtl + F,输入要查找的内容,如 Object Detection,然后你就可以看到一篇篇关于 Object Detection 的论文啦!




然后将需要阅读的论文标题复制到 google/baidu 搜索框中,比如《An Analysis of Scale Invariance in Object Detection - SNIP》



打开最上面的链接,一般就可以成功跳转至 arXiv 的论文下载界面




授人以鱼,不如授人以鱼。上述只是 Amusi 常用小技巧,真的关公面前舞大刀了,大家可以自由发挥~


温馨提示:CVPR 2018 大会将于 2018 年 6 月 18~22 日于美国犹他州的盐湖城(Salt Lake City)举办。



link: http://cvpr2018.thecvf.com/





CVPR 2018论文列表


CVPR 2018 Accepted Papers


Single-Shot Refinement Neural Network for Object Detection

Video  Captioning via Hierarchical Reinforcement Learning

DensePose:  Multi-Person Dense Human Pose Estimation In The Wild

DensePose:  Multi-Person Dense Human Pose Estimation In The Wild

Frustum  PointNets for 3D Object Detection from RGB-D Data

Tips and  Tricks for Visual Question Answering: Learnings from the 2017 Challenge

Rethinking  the Faster R-CNN Architecture for Temporal Action Localization

Shape from  Shading through Shape Evolution

Shape from  Shading through Shape Evolution

A  High-Quality Denoising Dataset for Smartphone Cameras

Improving  Color Reproduction Accuracy in the Camera Imaging Pipeline

End-to-End  Dense Video Captioning with Masked Transformer

End-to-End  Dense Video Captioning with Masked Transformer

pOSE: Pseudo  Object Space Error for Initialization-Free Bundle Adjustment

Learning to  Segment Every Thing

Density-aware  Single Image De-raining using a Multi-stream Dense Network

Densely  Connected Pyramid Dehazing Network

Embodied  Question Answering

TieNet:  Text-Image Embedding Network for Common Thorax Disease Classification and  Reporting in Chest X-rays

TieNet:  Text-Image Embedding Network for Common Thorax Disease Classification and  Reporting in Chest X-rays

Towards  Open-Set Identity Preserving Face Synthesis

Baseline  Desensitizing In Translation Averaging

Learning  from the Deep: A Revised Underwater Image Formation Model

Context  Encoding for Semantic Segmentation

Context  Encoding for Semantic Segmentation

Deep Texture  Manifold for Ground Terrain Recognition

DS*: Tighter  Lifting-Free Convex Relaxations for Quadratic Matching Problems

Sparse,  Smart Contours to Represent and Edit Images

Every Smile  is Unique: Landmark-guided Diverse Smile Generation

Generative  Non-Rigid Shape Completion with Graph Convolutional Autoencoders

Learning a  Discriminative Prior for Blind Image Deblurring

Attentional  ShapeContextNet for Point Cloud Recognition

Learning  Superpixels with Segmentation-Aware Affinity Loss

Real-World  Repetition Estimation by Div, Grad and Curl

Real-World  Repetition Estimation by Div, Grad and Curl

Recurrent  Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for  Small Organ Segmentation

MegaDepth:  Learning Single-View Depth Prediction from Internet Photos

Learning  Intrinsic Image Decomposition from Watching the World

Learning  Intrinsic Image Decomposition from Watching the World

Don't Just  Assume; Look and Answer: Overcoming Priors for Visual Question Answering

Human-centric  Indoor Scene Synthesis Using Stochastic Grammar

Learning by  Asking Questions

Instance  Embedding Transfer to Unsupervised Video Object Segmentation

Detect-and-Track:  Efficient Pose Estimation in Videos

Self-Supervised  Adversarial Hashing Networks for Cross-Modal Retrieval

Guided  Proofreading of Automatic Segmentations for Connectomics

Augmented  Skeleton Space Transfer for Depth-based Hand Pose Estimation

Augmented  Skeleton Space Transfer for Depth-based Hand Pose Estimation

Context-aware  Synthesis for Video Frame Interpolation

2D/3D Pose  Estimation and Action Recognition using Multitask Deep Learning

NAG: Network  for Adversary Generation

LiteFlowNet:  A Lightweight Convolutional Neural Network for Optical Flow Estimation

LiteFlowNet:  A Lightweight Convolutional Neural Network for Optical Flow Estimation

Avatar-Net:  Multi-scale Zero-shot Style Transfer by Feature Decoration

Multi-view  Harmonized Bilinear Network for 3D Object Recognition

Multi-view  Harmonized Bilinear Network for 3D Object Recognition

Tangent  Convolutions for Dense Prediction in 3D

Tangent  Convolutions for Dense Prediction in 3D

Semi-parametric  Image Synthesis

Semi-parametric  Image Synthesis

Interactive  Image Segmentation with Latent Diversity

3D Hand Pose  Estimation: From Current Achievements to Future Goals

3D Hand Pose  Estimation: From Current Achievements to Future Goals

W2F: A  Weakly-Supervised to Fully-Supervised Framework for Object Detection

BlockDrop:  Dynamic Inference Paths in Residual Networks

BlockDrop:  Dynamic Inference Paths in Residual Networks

MapNet:  Geometry-Aware Learning of Maps for Camera Localization

MapNet:  Geometry-Aware Learning of Maps for Camera Localization

BPGrad:  Towards Global Optimality in Deep Learning via Branch and Pruning

Salient  Object Detection Driven by Fixation Prediction

3D Object  Detection with Latent Support Surfaces

Practical  Block-wise Neural Network Architecture Generation

Practical  Block-wise Neural Network Architecture Generation

Glimpse  Clouds: Human Activity Recognition from Unstructured Feature Points

Are You  Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning

Are You  Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning

Visual  Grounding via Accumulated Attention

Supervision-by-Registration:  An Unsupervised Approach to Improve the Precision of Facial Landmark  Detectors

ISTA-Net:  Interpretable Optimization-Inspired Deep Network for Image Compressive  Sensing

Perturbative  Neural Networks: Rethinking Convolution in CNNs

Nonlinear 3D  Face Morphable Model

Nonlinear 3D  Face Morphable Model

Neural Baby  Talk


Neural Baby  Talk


Towards Pose  Invariant Face Recognition in the Wild

MoNet: Deep  Motion Exploitation for Video Object Segmentation

Exploring  Disentangled Feature Representation Beyond Face Identification

Towards  Effective Low-bitwidth Convolutional Neural Networks

Parallel  Attention: A Unified Framework for Visual Object Discovery through Dialogs  and Queries

Learning  Facial Action Units from Web Images with Scalable Weakly Supervised  Clustering

Few-Shot  Image Recognition by Predicting Parameters from Activations

Few-Shot  Image Recognition by Predicting Parameters from Activations

Single-Shot  Object Detection with Enriched Semantics

Unifying  Identification and Context Learning for Person Recognition

Separating  Self-Expression and Visual Content in Hashtag Supervision

Multi-Cue  Correlation Filters for Robust Visual Tracking

Beyond  Trade-off: Accelerate FCN-based Face Detection with Higher Accuracy

On the  Robustness of Semantic Segmentation Models to Adversarial Attacks

PWC-Net:  CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume

PWC-Net:  CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume

Illuminant  Spectra-based Source Separation Using Flash Photography

Illuminant  Spectra-based Source Separation Using Flash Photography

Tracking  Multiple Objects Outside the Line of Sight using Speckle Imaging

Tracking  Multiple Objects Outside the Line of Sight using Speckle Imaging

Improved  Human Pose Estimation through Adversarial Data Augmentation

Generative  Adversarial Learning Towards Fast Weakly Supervised Detection

Audio to  Body Dynamics

Audio to  Body Dynamics

The  Unreasonable Effectiveness of Deep Features as a Perceptual Metric

Frame-Recurrent  Video Super-Resolution

Deep Mutual  Learning

Real-world  Anomaly Detection in Surveillance Videos

Soccer on  Your Tabletop

Diversity  Regularized Spatiotemporal Attention for Video-based Person Re-identification

HashGAN:  Deep Learning to Hash with Pair Conditional Wasserstein GAN

Excitation  Backprop for RNNs

Dynamic-Structured  Semantic Propagation Network

Super SloMo:  High Quality Estimation of Multiple Intermediate Frames for Video  Interpolation

Super SloMo:  High Quality Estimation of Multiple Intermediate Frames for Video  Interpolation

SPLATNet:  Sparse Lattice Networks for Point Cloud Processing

SPLATNet:  Sparse Lattice Networks for Point Cloud Processing

Video  Representation Learning Using Discriminative Pooling

Attend and  Interact: Higher-Order Object Interactions for Video Understanding

Human Pose  Estimation with Parsing Induced Learner

4D Human  Body Correspondences from Panoramic Depth Maps

Recognizing  Human Actions as Evolution of Pose Estimation Maps

GraphBit:  Bitwise Interaction Mining via Deep Reinforcement Learning

Deep  Adversarial Metric Learning

Deep  Adversarial Metric Learning

Revisiting  Video Saliency: A Large-scale Benchmark and a New Model

Graph-Cut  RANSAC


Five-point  Fundamental Matrix Estimation for Uncalibrated Cameras

Hashing as  Tie-Aware Learning to Rank

Optimizing  Local Feature Descriptors for Nearest Neighbor Matching

Total  Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies

Total  Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies

Consensus  Maximization for Semantic Region Correspondences

Consensus  Maximization for Semantic Region Correspondences

ST-GAN:  Spatial Transformer Generative Adversarial Networks for Image Compositing

Motion-Guided  Cascaded Refinement Network for Video Object Segmentation

Zigzag  Learning for Weakly Supervised Object Detection

Look,  Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with  Generative Models

Look,  Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with  Generative Models

VITON: An  Image-based Virtual Try-on Network

VITON: An  Image-based Virtual Try-on Network

Cross-Domain  Self-supervised Multi-task Feature Learning Using Synthetic Game Imagery

LayoutNet:  Reconstructing the 3D Room Layout from a Single RGB Image

Thoracic  Disease Identification and Localization with Limited Supervision

Stochastic  Downsampling for Cost-Adjustable Inference and Improved Regularization in  Convolutional Networks

Learning  Pixel-level Semantic Affinity with Image-level Supervision for Weakly  Supervised Semantic Segmentation

Deep  End-to-End Time-of-Flight Imaging

Fast and  Accurate Online Video Object Segmentation via Tracking Parts

Fast and  Accurate Online Video Object Segmentation via Tracking Parts

Min-Entropy  Latent Model for Weakly Supervised Object Detection

Future Frame  Prediction for Anomaly Detection  A New Baseline

Face Aging  with Identity-Preserved Conditional Generative Adversarial Networks

Learning to  Compare: Relation Network for Few-Shot Learning

Deep Layer  Aggregation

Deep Layer  Aggregation

Style  Aggregated Network for Facial Landmark Detection

M3:  Multimodal Memory Modelling for Video Captioning

M3:  Multimodal Memory Modelling for Video Captioning

Classification  Driven Dynamic Image Enhancement

Generative  Image Inpainting with Contextual Attention

Iterative  Visual Reasoning Beyond Convolutions

Iterative  Visual Reasoning Beyond Convolutions

Dual  Attention Matching Network for Context-Aware Feature Sequence based Person  Re-Identification

Textbook  Question Answering under Teacher Guidance with Memory Networks

Textbook  Question Answering under Teacher Guidance with Memory Networks

Multi-Level  Factorisation Net for Person Re-Identification

Functional  Map of the World

Functional  Map of the World

A Two-Step  Disentanglement Method

Towards  Faster Training of Global Covariance Pooling Networks by Iterative Matrix  Square Root Normalization

Can  Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

Left-Right  Comparative Recurrent Model for Stereo Matching

Left-Right  Comparative Recurrent Model for Stereo Matching

Analytic  Expressions for Probabilistic Moments of PL-DNN with Gaussian Input

Analytic  Expressions for Probabilistic Moments of PL-DNN with Gaussian Input

Zero-Shot  Sketch-Image Hashing

Zero-Shot  Sketch-Image Hashing

Interpretable  Convolutional Neural Networks

Interpretable  Convolutional Neural Networks

Reconstructing  Thin Structures of Manifold Surfaces by Integrating Spatial Curves

Enhancing  the Spatial Resolution of Stereo Images using a Parallax Prior

Anticipating  Traffic Accidents with Adaptive Loss and Large-scale Incident DB

Generating  Synthetic X-ray Images of a Person from the Surface Geometry

Generating  Synthetic X-ray Images of a Person from the Surface Geometry

Attentive  Fashion Grammar Network for Fashion Landmark Detection and Clothing Category  Classification

Unsupervised  CCA


Discovering  Point Lights with Intensity Distance Fields

Universal  Denoising Networks : A Novel CNN-based Network Architecture for Image  Denoising

Easy  Identification from Better Constraints: Multi-Shot Person Re-Identification  from Reference Constraints

Recurrent  Pixel Embedding for Instance Grouping

Recurrent  Pixel Embedding for Instance Grouping

Recurrent  Scene Parsing with Perspective Understanding in the Loop

Learning to  Hash by Discrepancy Minimization

Fast  End-to-End Trainable Guided Filter

Disentangling  Structure and Aesthetics for Content-aware Image Completion

An Analysis  of Scale Invariance in Object Detection - SNIP

An Analysis  of Scale Invariance in Object Detection - SNIP

CSGNet:  Neural Shape Parser for Constructive Solid Geometry

Finding Tiny  Faces in the Wild with Generative Adversarial Network

Finding Tiny  Faces in the Wild with Generative Adversarial Network

SSNet: Scale  Selection Network for Online 3D Action Prediction

SSNet: Scale  Selection Network for Online 3D Action Prediction

Integrated  facial landmark localization and super-resolution of real-world very low  resolution faces in arbitrary poses with GANs

Integrated  facial landmark localization and super-resolution of real-world very low  resolution faces in arbitrary poses with GANs

The Best of  Both Worlds: Combining CNNs and Geometric Constraints for Hierarchical Motion  Segmentation

In-Place  Activated BatchNorm for Memory-Optimized Training of DNNs

Wing Loss  for Robust Facial Landmark Localisation with Convolutional Neural Networks

Deep  Cross-media Knowledge Transfer

Deep  Cross-media Knowledge Transfer

Coupled  End-to-end Transfer Learning with Generalized Fisher Information

Knowledge  Aided Consistency for Weakly Supervised Phrase Grounding

Viewpoint-aware  Attentive Multi-view Inference for Vehicle Re-identification

MatNet:  Modular Attention Network for Referring Expression Comprehension

CBMV: A  Coalesced Bidirectional Matching Volume for Disparity Estimation

NISP:  Pruning Networks using Neuron Importance Score Propagation

NISP:  Pruning Networks using Neuron Importance Score Propagation

Who Let The  Dogs Out? Modeling Dog Behavior From Visual Data

Efficient  Video Object Segmentation via Network Modulation

Learning  Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision

Feedback-prop:  Convolutional Neural Network Inference under Partial Evidence

A Memory  Network Approach for Story-based Temporal Summarization of 360?Videos

Improving  Occlusion and Hard Negative Handling for Single-Stage Object Detectors

UV-GAN:  Adversarial Facial UV Map Completion for Pose-invariant Face Recognition

Learning a  Toolchain for Image Restoration

Learning a  Toolchain for Image Restoration

Learning to  Act Properly: Predicting and Explaining Affordances from Images

Learning a  Discriminative Feature Network for Semantic Segmentation

Optimizing  Video Object Detection via a Scale-Time Lattice

ShuffleNet:  An Extremely Efficient Convolutional Neural Network for Mobile Devices

Cascaded  Pyramid Network for Multi-Person Pose Estimation

Seeing  Temporal Modulation of Lights from Standard Cameras

Point-wise  Convolutional Neural Networks

Fine-grained  Video Captioning for Sports Narrative

Fine-grained  Video Captioning for Sports Narrative

Dense 3D  Regression for Hand Pose Estimation

Missing  Slice Recovery for Tensors Using a Low-rank Model in Embedded Space

Learning  Convolutional Networks for Content-weighted Image Compression

Learning  Attentions: Residual Attentional Siamese Network for High Performance Online  Visual Tracking

Deep  Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age  Estimation

First-Person  Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations

Hand  PointNet: 3D Hand Pose Estimation using Point Sets

Hand  PointNet: 3D Hand Pose Estimation using Point Sets

Recovering  Realistic Texture in Image Super-resolution by Spatial Feature Modulation

Cube Padding  for Weakly-Supervised Saliency Prediction in 360$^{\circ}$ Videos

A Face to  Face Neural Conversation Model

SurfConv:  Bridging 3D and 2D Convolution for RGBD Images

Dynamic  Video Segmentation Network

Multiple  Granularity Group Interaction Prediction

Visual  Question Reasoning on General Dependency Tree

Visual  Question Reasoning on General Dependency Tree

From  Lifestyle VLOGs to Everyday Interactions

COCO-Stuff:  Thing and Stuff Classes in Context

GANerated  Hands for Real-Time 3D Hand Tracking from Monocular RGB

GANerated  Hands for Real-Time 3D Hand Tracking from Monocular RGB

Non-local  Neural Networks

Zero-shot  Recognition via Semantic Embeddings and Knowledge Graphs

Taskonomy:  Disentangling Task Transfer Learning

Taskonomy:  Disentangling Task Transfer Learning

Embodied  Real-World Active Perception

Embodied  Real-World Active Perception

SfSNet :  Learning Shape, Reflectance and Illuminance of Faces `in the wild'

SfSNet :  Learning Shape, Reflectance and Illuminance of Faces `in the wild'

End-to-end  Recovery of Human Shape and Pose

Factoring  Shape, Pose, and Layout from the 2D Image of a 3D Scene

Multi-view  Consistency as Supervisory Signal for Learning Shape and Pose Prediction

A Fast  Resection-Intersection Method for the Known Rotation Problem

Image  Generation from Scene Graphs

What Makes a  Video a Video: Analyzing Temporal Information in Video Understanding Models  and Datasets

What Makes a  Video a Video: Analyzing Temporal Information in Video Understanding Models  and Datasets

PointFusion:  Deep Sensor Fusion for 3D Bounding Box Estimation

High-Resolution  Image Synthesis and Semantic Manipulation with Conditional GANs

High-Resolution  Image Synthesis and Semantic Manipulation with Conditional GANs

Social GAN:  Socially Acceptable Trajectories with Generative Adversarial Networks

Quantization  and Training of Neural Networks for Efficient Integer-Arithmetic-Only  Inference

Quantization  and Training of Neural Networks for Efficient Integer-Arithmetic-Only  Inference

Finding  It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional  Video"

Finding  It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional  Video"

Unsupervised  Cross-dataset Person Re-identification by Transfer Learning of  Spatio-temporal Patterns

Kernelized  Subspace Pooling for Deep Local Descriptors

Video Rain  Removal By Multiscale Convolutional Sparse Coding

Learning  from Millions of 3D Scans for Large-scale 3D Face Recognition

Referring  Relationships

Improving  Object Localization with Fitness NMS and Bounded IoU Loss

Unsupervised  Feature Learning via Non-Parametric Instance-level Discrimination

Unsupervised  Feature Learning via Non-Parametric Instance-level Discrimination

CVM-Net:  Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization

CVM-Net:  Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization

Visual  Question Generation as Dual Task of Visual Question Answering

Visual  Question Generation as Dual Task of Visual Question Answering

Revisiting  Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised  Semantic Segmentation

Revisiting  Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised  Semantic Segmentation

Learning  Dual Convolutional Neural Networks for Low-Level Vision

Deep Video  Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit  Motion Compensation

MegDet: A  Large Mini-Batch Object Detector

MegDet: A  Large Mini-Batch Object Detector

AttnGAN:  Fine-Grained Text to Image Generation with Attentional Generative Adversarial  Networks

TOM-Net:  Learning Transparent Object Matting from a Single Image

TOM-Net:  Learning Transparent Object Matting from a Single Image

End-to-End  Deep Kronecker-Product Matching for Person Re-identification

Semantic  Visual Localization

Joint Cuts  and Matching of Partitions in One Graph

Benchmarking  6DOF Outdoor Visual Localization in Changing Conditions

Benchmarking  6DOF Outdoor Visual Localization in Changing Conditions

Crowd  Counting via Adversarial Cross-Scale Consistency Pursuit

Deep  Group-shuffling Random Walk for Person Re-identification

Learning to  Detect Features in Texture Images

Learning to  Detect Features in Texture Images

Transferable  Joint Attribute-Identity Deep Learning for Unsupervised Person  Re-Identification

CarFusion:  Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of  Vehicles

Context-aware  Deep Feature Compression for High-speed Visual Tracking

Deep  Material-aware Cross-spectral Stereo Matching

Deep Extreme  Cut: From Extreme Points to Object Segmentation

Label  Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images

Label  Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images

Harmonious  Attention Network for Person Re-Identication

Unsupervised  Deep Generative Adversarial Hashing Network

Unsupervised  Deep Generative Adversarial Hashing Network

Pseudo-Mask  Augmented Object Detection

LSTM  stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH)

LSTM  stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH)

Adversarial  Complementary Learning for Weakly Supervised Object Localization

Unsupervised  Discovery of Object Landmarks as Structural Representations

Unsupervised  Discovery of Object Landmarks as Structural Representations

DeLS-3D:  Deep Localization and Segmentation with a 3D Semantic Map

Monocular  Relative Depth Perception with Web Stereo Data Supervision

Image-Image  Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for  Person Re-identification

Objects as  context for detecting their semantic parts

Camera Style  Adaptation for Person Re-identification

Conditional  Generative Adversarial Network for Structured Domain Adaptation

Rotation-sensitive  Regression for Oriented Scene Text Detection

Residual  Parameter Transfer for Deep Domain Adaptation

SGPN:  Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation

SGPN:  Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation

Weakly  Supervised Instance Segmentation using Class Peak Response

Weakly  Supervised Instance Segmentation using Class Peak Response

Robust  Facial Landmark Detection via a Fully-Convolutional Local-Global Context  Network

Rotation  Averaging and Strong Duality

Rotation  Averaging and Strong Duality

PackNet:  Adding Multiple Tasks to a Single Network by Iterative Pruning

Im2Flow:  Motion Hallucination from Static Images for Action Recognition

Im2Flow:  Motion Hallucination from Static Images for Action Recognition

Feature  Quantization for Defending Against Distortion of Images

End-to-end  weakly-supervised semantic alignment

PointGrid: A  Deep Network for 3D Shape Understanding

PointGrid: A  Deep Network for 3D Shape Understanding

Imagine it  for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy  Texts

A Minimalist  Approach to Type-Agnostic Detection of Quadrics in Point Clouds

A Benchmark  for Articulated Human Pose Estimation and Tracking

Boosting  Self-Supervised Learning via Knowledge Transfer

PPFNet:  Global Context Aware Local Features for Robust 3D Point Matching

PPFNet:  Global Context Aware Local Features for Robust 3D Point Matching

Vision-and-Language  Navigation: Interpreting visually-grounded navigation instructions in real  environments

Vision-and-Language  Navigation: Interpreting visually-grounded navigation instructions in real  environments

Fast Video  Object Segmentation by Reference-Guided Mask Propagation

Fast Video  Object Segmentation by Reference-Guided Mask Propagation

Super-Resolving  Very Low-Resolution Face Images with Supplementary Attributes

Video Person  Re-identification with Competitive Snippet-similarity Aggregation and  Co-attentive Snippet Embedding

One-shot  Action Localization by Sequence Matching Network

Efficient  Subpixel Refinement with Symbolic Linear Predictors

Distort-and-Recover:  Color Enhancement using Deep Reinforcement Learning

Group  Consistent Similarity Learning via Deep CRFs for Person Re-Identification

Group  Consistent Similarity Learning via Deep CRFs for Person Re-Identification

Single Image  Reflection Separation with Perceptual Losses

AVA: A Video  Dataset of Spatio-temporally Localized Atomic Visual Actions

AVA: A Video  Dataset of Spatio-temporally Localized Atomic Visual Actions

Recognize  Actions by Disentangling Components of Dynamics

Zoom and  Learn: Generalizing Deep Stereo Matching to Novel Domains

Attention-aware  Compositional Network for Person Re-Identification

HATS:  Histograms of Averaged Time Surfaces for Robust Event-based Object  Classification

Mask-guided  Contrastive Attention Model for Person Re-Identification

Pose-Guided  Photorealistic Face Rotation

Pose-Guided  Photorealistic Face Rotation

Automatic 3D  Indoor Scene Modeling from Single Panorama

Automatic 3D  Indoor Scene Modeling from Single Panorama

SobolevFusion:  3D Reconstruction of Scenes Undergoing Free Non-rigid Motion

SobolevFusion:  3D Reconstruction of Scenes Undergoing Free Non-rigid Motion

A  Biresolution Spectral framework for Product Quantization

Dynamic  Zoom-in Network for Fast Object Detection in Large Images

On the  Importance of Label Quality for Semantic Segmentation

EPINET: A  Fully-Convolutional Neural Network for Light Field Depth Estimation by Using  Epipolar Geometry

A  Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross  Neighborhood Re-Ranking

Erase or  Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos

Scalable and  Effective Deep CCA via Soft Decorrelation

High-order  tensor regularization with application to attribute ranking

3D-RCNN:  Instance-level 3D Scene Understanding via Render-and-Compare

3D-RCNN:  Instance-level 3D Scene Understanding via Render-and-Compare

FoldingNet:  Interpretable Unsupervised Learning on 3D Point Clouds

FoldingNet:  Interpretable Unsupervised Learning on 3D Point Clouds

Defocus Blur  Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network

Decorrelated  Batch Normalization

Unsupervised  Textual Grounding: Linking Words to Image Concepts

Unsupervised  Textual Grounding: Linking Words to Image Concepts

Scale-recurrent  Network for Deep Image Deblurring

Low-Shot  Recognition with Imprinted Weights

Bottom-Up  and Top-Down Attention for Image Captioning and Visual Question Answering

Bottom-Up  and Top-Down Attention for Image Captioning and Visual Question Answering

Cross-Domain  Weakly-Supervised Object Detection through Progressive Domain Adaptation

Facelet-Bank  for Fast Portrait Manipulation

Duplex  Generative Adversarial Network for Unsupervised Domain Adaptation

Quantization  of Fully Convolutional Networks for Accurate Biomedical Image Segmentation

Real-Time  Rotation-Invariant Face Detection with Progressive Calibration Networks

Structure  Preserving Video Prediction

Tagging Like  Humans: Diverse and Distinct Image Annotation

Learning to  Sketch with Shortcut Cycle Consistency

GroupCap:  Group-based Image Captioning with Structured Relevance and Diversity  Constraints

Dynamic  Scene Deblurring Using Spatially Variant Recurrent Neural Networks

Dynamic  Scene Deblurring Using Spatially Variant Recurrent Neural Networks

Hyperparameter  Optimization for Tracking with Continuous Deep Q-Learning

Deep  Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective

Deep  Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective

NeuralNetwork-Viterbi:  A Framework for Weakly Supervised Video Learning

NeuralNetwork-Viterbi:  A Framework for Weakly Supervised Video Learning

Detecting  and Recognizing Human-Object Interactions

Detecting  and Recognizing Human-Object Interactions

Augmenting  Crowd-Sourced 3D Reconstructions using Semantic Detections

Visual  Relationship Learning with a Factorization-based Prior

Re-weighted  Adversarial Adaptation Network for Unsupervised Domain Adaptation

Flow Guided  Recurrent Neural Encoder for Video Salient Object Detection

Disentangling  3D Pose in A Dendritic CNN for Unconstrained 2D Face Alignment

Progressive  Attention Guided Recurrent Network for Salient Object Detection

Answer with  Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering

Answer with  Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering

Unsupervised  Learning of Depth and Egomotion from Monocular Video Using 3D Geometric  Constraints

Repulsion  Loss: Detecting Pedestrians in a Crowd

PU-Net:  Point Cloud Upsampling Network

Video Object  Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF

Video Object  Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF

PiCANet:  Learning Pixel-wise Contextual Attention for Saliency Detection

Gated Fusion  Network for Single Image Dehazing

Interleaved  Structured Sparse Convolutional Neural Networks

Interleaved  Structured Sparse Convolutional Neural Networks

Where and  Why Are They Looking? Jointly Inferring Human Attention and Intentions in  Complex Tasks

End-to-end  Flow Correlation Tracking with Spatial-temporal Attention

Left/Right  Asymmetric Layer Skippable Networks

Context  Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation

Context  Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation

VITAL:  VIsual Tracking via Adversarial Learning

VITAL:  VIsual Tracking via Adversarial Learning

RotationNet:  Joint Object Categorization and Pose Estimation Using Multiviews from  Unsupervised Viewpoints

Action Sets:  Weakly Supervised Action Segmentation without Ordering Constraints

Action Sets:  Weakly Supervised Action Segmentation without Ordering Constraints

Squeeze-and-Excitation  Networks

Squeeze-and-Excitation  Networks

Edit  Probability for Scene Text Recognition

Bidirectional  Attentive Fusion with Context Gating for Dense Video Captioning

Bidirectional  Attentive Fusion with Context Gating for Dense Video Captioning

Exploit the  Unknown Gradually:~ One-Shot Video-Based Person Re-Identification by Stepwise  Learning

Learning to  Localize Sound Source in Visual Scenes

Dynamic  Few-Shot Visual Learning without Forgetting

Weakly-Supervised  Semantic Segmentation by Iteratively Mining Common Object Features

SINT++:  Robust Visual Tracking via Adversarial Hard Positive Generation

Real-Time  Monocular Depth Estimation using Synthetic Data with Domain Adaptation via  Image Style Transfer

Fast and  Accurate Single Image Super-Resolution via Information Distillation Network

Low-Latency  Video Semantic Segmentation

Low-Latency  Video Semantic Segmentation

Domain  Adaptive Faster R-CNN for Object Detection in the Wild

DoubleFusion:  Real-time Capture of Human Performance with Inner Body Shape from a Single  Depth Sensor

DoubleFusion:  Real-time Capture of Human Performance with Inner Body Shape from a Single  Depth Sensor

Lean  Multiclass Crowdsourcing

Lean  Multiclass Crowdsourcing

Tell Me  Where To Look: Guided Attention Inference Network

Tell Me  Where To Look: Guided Attention Inference Network

Residual  Dense Network for Image Super-Resolution

Residual  Dense Network for Image Super-Resolution

Look at  Boundary: A Boundary-Aware Face Alignment Algorithm

Imagination-IQA:  No-reference Image Quality Assessment via Adversarial Learning

Memory  Matching Networks for One-Shot Image Recognition

3D Human  Pose Estimation in the Wild by Adversarial Learning

Unsupervised  Training for 3D Morphable Model Regression

Unsupervised  Training for 3D Morphable Model Regression

Scalable  Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective

IQA: Visual  Question Answering in Interactive Environments

Learning  Spatial-Temporal Regularized Correlation Filters for Visual Tracking

Low-shot  Learning from Imaginary Data

Low-shot  Learning from Imaginary Data

Deep  Regression Forests for Age Estimation

Partial  Transfer Learning with Selective Adversarial Networks

Partial  Transfer Learning with Selective Adversarial Networks

A  Bi-directional Message Passing Model for Salient Object Detection

Transductive  Unbiased Embedding for Zero-Shot Learning

Scale-Transferrable  Object Detection

Crowd  Counting with Deep Negative Correlation Learning

Deep Cauchy  Hashing for Hamming Space Retrieval

Demo2Vec:  Reasoning Object Affordances from Online Videos

GVCNN:  Group-View Convolutional Neural Networks for 3D Shape Recognition

An  End-to-End TextSpotter with Explicit Alignment and Attention

Stereoscopic  Neural Style Transfer

Bootstrapping  the Performance of Webly Supervised Semantic Segmentation

Learning  Markov Clustering Networks for Scene Text Detection

Collaborative  and Adversarial Network for Unsupervised domain adaptation

Collaborative  and Adversarial Network for Unsupervised domain adaptation

Reflection  Removal for Large-Scale 3D Point Clouds

Pose  Transferrable Person Re-Identification

Learning to  Adapt Structured Output Space for Semantic Segmentation

Learning to  Adapt Structured Output Space for Semantic Segmentation

Efficient  Diverse Ensemble for Discriminative Co-Tracking

Learning a  Single Convolutional Super-Resolution Network for Multiple Degradations

Probabilistic  Plant Modeling via Multi-View Image-to-Image Translation

Learning to  Parse Wireframes in Images of Man-Made Environments

A  Variational U-Net for Conditional Appearance and Shape Generation

A  Variational U-Net for Conditional Appearance and Shape Generation

Learning to  Find Good Correspondences

Learning to  Find Good Correspondences

Actor and  Action Video Segmentation from a Sentence

Actor and  Action Video Segmentation from a Sentence

Towards a  Mathematical Understanding of the Difficulty in Learning with Feedforward  Neural Networks

Weakly-supervised  Deep Convolutional Neural Network Learning for Facial Action Unit Intensity  Estimation

Maximum  Classifier Discrepancy for Unsupervised Domain Adaptation

Maximum  Classifier Discrepancy for Unsupervised Domain Adaptation


由于微信字数限制,没有全部显示,详细 list 请查看 Amusi 整理的

https://github.com/amusi/daily-paper-computer-vision




【加入社群】


新智元 AI 技术 + 产业社群招募中,欢迎对 AI 技术 + 产业落地感兴趣的同学,加小助手微信号: aiera2015_3  入群;通过审核后我们将邀请进群,加入社群后务必修改群备注(姓名 - 公司 - 职位;专业群审核较严,敬请谅解)。


登录查看更多
6

相关内容

CVPR是IEEE Conference on Computer Vision and Pattern Recognition的缩写,即IEEE国际计算机视觉与模式识别会议。该会议是由IEEE举办的计算机视觉和模式识别领域的顶级会议。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等
【快讯】KDD2020论文出炉,216篇上榜, 你的paper中了吗?
专知会员服务
50+阅读 · 2020年5月16日
专知会员服务
60+阅读 · 2020年3月19日
100+篇《自监督学习(Self-Supervised Learning)》论文最新合集
专知会员服务
164+阅读 · 2020年3月18日
专知会员服务
109+阅读 · 2020年3月12日
【快讯】CVPR2020结果出炉,1470篇上榜, 你的paper中了吗?
AAAI2020接受论文列表,1591篇论文目录全集
专知会员服务
98+阅读 · 2020年1月12日
必读的10篇 CVPR 2019【生成对抗网络】相关论文和代码
专知会员服务
31+阅读 · 2020年1月10日
资源|Blockchain区块链中文资源阅读列表
专知会员服务
43+阅读 · 2019年11月20日
六篇 CIKM 2019 必读的【图神经网络(GNN)】长文论文
专知会员服务
37+阅读 · 2019年11月3日
CVPR 2019超全论文合集新鲜出炉!| 资源帖
重磅资料! | CVPR 2019 全部论文合集(1.37G)
AI研习社
3+阅读 · 2019年6月11日
资料 | CVPR 2019 Oral 论文精选
AI科技评论
8+阅读 · 2019年6月6日
KDD 2019放榜,接收率低至14%,你的论文中了吗?
机器之心
7+阅读 · 2019年4月30日
CVPR 2019收录论文ID公开,你上榜了吗?
AI100
3+阅读 · 2019年2月26日
CVPR 2018 最酷的十篇论文
AI研习社
6+阅读 · 2019年2月13日
计算机视觉领域顶会CVPR 2018 接受论文列表
2018计算机视觉及机器学习重要会议汇总
极市平台
15+阅读 · 2018年1月12日
A Survey on Bayesian Deep Learning
Arxiv
63+阅读 · 2020年7月2日
Deep learning for cardiac image segmentation: A review
Arxiv
21+阅读 · 2019年11月9日
A Sketch-Based System for Semantic Parsing
Arxiv
4+阅读 · 2019年9月12日
Local Relation Networks for Image Recognition
Arxiv
4+阅读 · 2019年4月25日
Arxiv
4+阅读 · 2018年11月6日
Arxiv
9+阅读 · 2018年5月7日
Arxiv
22+阅读 · 2018年2月14日
VIP会员
相关VIP内容
【快讯】KDD2020论文出炉,216篇上榜, 你的paper中了吗?
专知会员服务
50+阅读 · 2020年5月16日
专知会员服务
60+阅读 · 2020年3月19日
100+篇《自监督学习(Self-Supervised Learning)》论文最新合集
专知会员服务
164+阅读 · 2020年3月18日
专知会员服务
109+阅读 · 2020年3月12日
【快讯】CVPR2020结果出炉,1470篇上榜, 你的paper中了吗?
AAAI2020接受论文列表,1591篇论文目录全集
专知会员服务
98+阅读 · 2020年1月12日
必读的10篇 CVPR 2019【生成对抗网络】相关论文和代码
专知会员服务
31+阅读 · 2020年1月10日
资源|Blockchain区块链中文资源阅读列表
专知会员服务
43+阅读 · 2019年11月20日
六篇 CIKM 2019 必读的【图神经网络(GNN)】长文论文
专知会员服务
37+阅读 · 2019年11月3日
相关资讯
CVPR 2019超全论文合集新鲜出炉!| 资源帖
重磅资料! | CVPR 2019 全部论文合集(1.37G)
AI研习社
3+阅读 · 2019年6月11日
资料 | CVPR 2019 Oral 论文精选
AI科技评论
8+阅读 · 2019年6月6日
KDD 2019放榜,接收率低至14%,你的论文中了吗?
机器之心
7+阅读 · 2019年4月30日
CVPR 2019收录论文ID公开,你上榜了吗?
AI100
3+阅读 · 2019年2月26日
CVPR 2018 最酷的十篇论文
AI研习社
6+阅读 · 2019年2月13日
计算机视觉领域顶会CVPR 2018 接受论文列表
2018计算机视觉及机器学习重要会议汇总
极市平台
15+阅读 · 2018年1月12日
相关论文
A Survey on Bayesian Deep Learning
Arxiv
63+阅读 · 2020年7月2日
Deep learning for cardiac image segmentation: A review
Arxiv
21+阅读 · 2019年11月9日
A Sketch-Based System for Semantic Parsing
Arxiv
4+阅读 · 2019年9月12日
Local Relation Networks for Image Recognition
Arxiv
4+阅读 · 2019年4月25日
Arxiv
4+阅读 · 2018年11月6日
Arxiv
9+阅读 · 2018年5月7日
Arxiv
22+阅读 · 2018年2月14日
Top
微信扫码咨询专知VIP会员