LibRec Picks: EfficientNet and XLNet Papers and Code Implementations

July 9, 2019, LibRec智能推荐

LibRec Picks

LibRec智能推荐, Issue 36 (through 2019.6.21), with 6 new featured items.


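Most of all, the tenderness of that bowed head, like a water lily too shy to bear the cool breeze; a word of farewell, and in that farewell a sweet sorrow. --- Xu Zhimo, "Sayonara" (《沙扬娜拉》)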


Note: Updates have been slow recently because of some unexpected program errors, which we are working hard to resolve.


1

[Paper and Code] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Paper: https://arxiv.org/abs/1905.11946

Code 1: https://github.com/lukemelas/EfficientNet-PyTorch/

Code 2: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet

Abstract:

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet. 

To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets.
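The compound coefficient mentioned in the abstract can be illustrated with a minimal sketch. The constants below (1.2, 1.1, 1.15) are the depth, width, and resolution bases reported in the EfficientNet paper, not in this digest, and the function names and the base resolution of 224 are assumptions made for illustration. The released models round these values, so this sketch does not reproduce their exact configurations.

```python
# Bases reported in the EfficientNet paper for the B0 baseline
# (an assumption relative to this digest, which does not list them).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Scale depth, width and resolution together with one coefficient phi."""
    depth = base_depth * ALPHA ** phi              # more layers
    width = base_width * BETA ** phi               # more channels
    resolution = base_resolution * GAMMA ** phi    # larger input images
    return depth, width, resolution


def flops_factor(phi):
    """Approximate FLOPS growth: proportional to depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        d, w, r = compound_scale(phi)
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
              f"resolution {r:.0f}, FLOPS x{flops_factor(phi):.2f}")
```

With these bases, the product of depth, squared width, and squared resolution is close to 2, so each unit increase of the coefficient roughly doubles the compute budget.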

2

[Code] XLNet implemented in PyTorch

Paper: https://arxiv.org/pdf/1906.08237.pdf

Code: https://github.com/graykode/xlnet-Pytorch/




3

[Course] StarAI: Deep Reinforcement Learning Course, with six weeks of course content. Link: https://www.starai.io/course/


4

[Tutorial] A visual tutorial on NumPy and data representation. Link: https://jalammar.github.io/visual-numpy/




Recent Hot Papers




1. MLFriend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data

Lei Xu, Shubhra Kanti Karmaker Santu, Kalyan Veeramachaneni

https://arxiv.org/abs/1906.12348v1


Most automation in machine learning focuses on model selection and hyper parameter tuning, and many overlook the challenge of automatically defining predictive tasks. We still heavily rely on human experts to define prediction tasks, and generate labels by aggregating raw data. In this paper, we tackle the challenge of defining useful prediction problems on event-driven time-series data. We introduce MLFriend to address this challenge. MLFriend first generates all possible prediction tasks under a predefined space, then interacts with a data scientist to learn the context of the data and recommend good prediction tasks from all the tasks in the space. We evaluate our system on three different datasets and generate a total of 2885 prediction tasks and solve them. Out of these 722 were deemed useful by expert data scientists. We also show that an automatic prediction task discovery system is able to identify top 10 tasks that a user may like within a batch of 100 tasks.


2. A Tensorized Transformer for Language Modeling

Xindian Ma, Peng Zhang, Shuai Zhang, Nan Duan, Yuexian Hou, Dawei Song, Ming Zhou

https://arxiv.org/abs/1906.09777v1

Latest development of neural models has connected the encoder and decoder through a self-attention mechanism. In particular, Transformer, which is solely based on self-attention, has led to breakthroughs in Natural Language Processing (NLP) tasks. However, the multi-head attention mechanism, as a key component of Transformer, limits the effective deployment of the model to a limited resource setting. In this paper, based on the ideas of tensor decomposition and parameters sharing, we propose a novel self-attention model (namely Multi-linear attention) with Block-Term Tensor Decomposition (BTD). We test and verify the proposed attention method on three language modeling tasks (i.e., PTB, WikiText-103 and One-billion) and a neural machine translation task (i.e., WMT-2016 English-German). Multi-linear attention can not only largely compress the model parameters but also obtain performance improvements, compared with a number of language modeling approaches, such as Transformer, Transformer-XL, and Transformer with tensor train decomposition.
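To give a rough sense of why the abstract treats multi-head attention as a deployment bottleneck, the sketch below counts the projection weights in one standard attention layer. It assumes the usual Transformer layout (query, key, value, and output projections, each of size d_model × d_model, with heads splitting d_model so the head count does not change the total) and ignores biases. The function name is hypothetical, and this is not the paper's Multi-linear attention, only the baseline it aims to compress.

```python
def attention_projection_params(d_model):
    """Weights in the Q, K, V and output projections of one layer (no biases)."""
    return 4 * d_model * d_model


if __name__ == "__main__":
    for d_model in (512, 1024):
        # Doubling d_model quadruples the number of projection weights.
        print(d_model, attention_projection_params(d_model))
```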




