
Paper link: https://arxiv.org/pdf/2009.14794.pdf

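Performer uses an efficient (linear) generalized attention framework that supports a broad class of attention mechanisms based on different similarity measures (kernels). The framework is implemented with Google's new algorithm FAVOR+ (Fast Attention Via Positive Orthogonal Random Features), which provides scalable, low-variance, unbiased estimates of any attention mechanism that can be expressed through a random feature map decomposition, regular softmax attention in particular. The method keeps linear space and time complexity while retaining strong accuracy guarantees, and it can also be applied to standalone softmax operations. In addition, it interoperates with other techniques such as reversible layers.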
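The softmax case can be illustrated with a small sketch. It relies on the identity exp(q·k) = E[exp(w·q − |q|²/2) · exp(w·k − |k|²/2)] for w drawn from a standard Gaussian, which turns the attention matrix into a product of two positive feature matrices, so the keys and values can be summarized once and reused for every query. This is not the authors' implementation: it assumes plain i.i.d. Gaussian features (the orthogonalization step in FAVOR+ is omitted), uses pure Python for readability, and the names `positive_features`, `linear_attention`, and `exact_softmax_attention` are hypothetical.

```python
import math
import random


def positive_features(x, omegas):
    """phi(x)_i = exp(w_i . x - |x|^2 / 2) / sqrt(m); every entry is positive."""
    half_sq_norm = 0.5 * sum(v * v for v in x)
    m = len(omegas)
    return [
        math.exp(sum(w_j * x_j for w_j, x_j in zip(w, x)) - half_sq_norm) / math.sqrt(m)
        for w in omegas
    ]


def exact_softmax_attention(Q, K, V):
    """Reference softmax attention; cost grows with len(Q) * len(K)."""
    d = len(Q[0])
    out = []
    for q in Q:
        weights = [math.exp(sum(a * b for a, b in zip(q, k)) / math.sqrt(d)) for k in K]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V, num_features=4000, seed=0):
    """Random-feature approximation; cost grows linearly with sequence length."""
    rng = random.Random(seed)
    d = len(Q[0])
    dv = len(V[0])
    scale = d ** -0.25  # so that (scale*q) . (scale*k) = q . k / sqrt(d)
    # Assumption: i.i.d. Gaussian projections, not the orthogonal ones of FAVOR+.
    omegas = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(num_features)]
    q_feats = [positive_features([scale * v for v in q], omegas) for q in Q]
    k_feats = [positive_features([scale * v for v in k], omegas) for k in K]
    # Summaries over all keys, computed once:
    #   kv[i][c] = sum_j phi(k_j)[i] * V[j][c],   k_sum[i] = sum_j phi(k_j)[i]
    kv = [[sum(kf[i] * v[c] for kf, v in zip(k_feats, V)) for c in range(dv)]
          for i in range(num_features)]
    k_sum = [sum(kf[i] for kf in k_feats) for i in range(num_features)]
    out = []
    for qf in q_feats:
        denom = sum(a * b for a, b in zip(qf, k_sum))  # row normalizer
        out.append([sum(qf[i] * kv[i][c] for i in range(num_features)) / denom
                    for c in range(dv)])
    return out


# Small illustrative inputs (made up for this sketch).
data_rng = random.Random(1)
Q = [[data_rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(6)]
K = [[data_rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(6)]
V = [[data_rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(6)]

exact = exact_softmax_attention(Q, K, V)
approx = linear_attention(Q, K, V)
max_err = max(abs(a - e) for ra, re in zip(approx, exact) for a, e in zip(ra, re))
print("max abs error:", max_err)
```

Because `kv` and `k_sum` do not depend on the query, the L × L attention matrix is never formed. The approximation error shrinks as `num_features` grows and increases with the norms of the queries and keys.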

The researchers say they believe this work offers a new way of thinking about attention, Transformer architectures, and kernel methods.

Code: https://github.com/google-research/google-research/tree/master/performer

After the paper was released, Yannic Kilcher, who runs a well-known deep learning channel on YouTube, published a walkthrough of it.



