ICLR, short for the "International Conference on Learning Representations," held its first edition in 2013. Although this annual conference is only in its fifth year, it has already been widely recognized by researchers as a top venue for deep learning. ICLR was founded by Yoshua Bengio and Yann LeCun, two of the three "godfathers" of deep learning, with the goal of providing a dedicated forum for the field. In practice, however, what sets ICLR apart from other international conferences, and the real reason for its strong reputation, is not just the renown of its two founders but the Open Review system it pioneered.


We hypothesize that curiosity is a mechanism discovered by evolution that encourages meaningful exploration early in an agent's life, exposing it to experiences that enable it to obtain high rewards over its lifetime. We formulate the problem of generating curious behavior as one of meta-learning: an outer loop searches over a space of curiosity mechanisms that dynamically adapt the agent's reward signal, while an inner loop performs standard reinforcement learning using the adapted reward signal. However, current meta-RL methods based on transferring neural network weights have only generalized between very similar tasks. To broaden generalization, we instead propose meta-learning algorithms: programs similar to the code snippets that humans design in ML papers. Our rich program language combines neural networks with other building blocks such as buffers, nearest-neighbor modules, and custom loss functions. We demonstrate the effectiveness of the approach empirically, discovering two novel curiosity algorithms that perform on par with or better than human-designed, publicly released curiosity algorithms across domains as diverse as grid navigation with image inputs, acrobot, lunar lander, ant, and hopper.
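The two-loop structure described above can be illustrated with a minimal sketch. All names here are hypothetical, and the "curiosity programs" are trivial reward-shaping functions standing in for the paper's much richer DSL of neural networks, buffers, nearest-neighbor modules, and custom losses:

```python
import random

def no_bonus(state, visited):
    """Baseline curiosity program: no intrinsic reward."""
    return 0.0

def novelty_bonus(state, visited):
    """Count-based stand-in for curiosity: reward unvisited states."""
    return 0.0 if state in visited else 1.0

CURIOSITY_PROGRAMS = [no_bonus, novelty_bonus]

def inner_loop(curiosity, episodes=20, horizon=12, seed=0):
    """Inner loop: an agent on a 10-state chain with extrinsic reward only
    at state 9. The curiosity program shapes which neighbor looks attractive."""
    rng = random.Random(seed)
    extrinsic = 0.0
    for _ in range(episodes):
        state, visited = 0, {0}
        for _ in range(horizon):
            right = curiosity(min(state + 1, 9), visited)
            left = curiosity(max(state - 1, 0), visited)
            if right > left:
                state = min(state + 1, 9)
            elif left > right:
                state = max(state - 1, 0)
            else:  # no signal: wander randomly
                state = max(0, min(9, state + rng.choice([-1, 1])))
            visited.add(state)
            if state == 9:
                extrinsic += 1.0
    return extrinsic

# Outer loop: search the program space for the curiosity mechanism that
# maximizes the extrinsic return achieved by the inner learner.
best = max(CURIOSITY_PROGRAMS, key=inner_loop)
print(best.__name__)
```

On this toy chain the novelty-driven program wins, because it pulls the agent toward the unexplored frontier; the real method searches a combinatorial space of such programs rather than two hand-written ones.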


Latest Papers

Recently there has been increased interest in using machine learning techniques to improve classical algorithms. In this paper we study when it is possible to construct compact, composable sketches for weighted sampling and statistics estimation according to functions of data frequencies. Such structures are now central components of large-scale data analytics and machine learning pipelines. However, many common functions, such as thresholds and p-th frequency moments with p > 2, are known to require polynomial-size sketches in the worst case. We explore performance beyond the worst case under two different types of assumptions. The first is having access to noisy advice on item frequencies. This continues the line of work of Hsu et al. (ICLR 2019), who assume predictions are provided by a machine learning model. The second is providing guaranteed performance on a restricted class of input frequency distributions that are better aligned with what is observed in practice. This extends the work on heavy hitters under Zipfian distributions in a seminal paper of Charikar et al. (ICALP 2002). Surprisingly, we show analytically and empirically that "in practice" small polylogarithmic-size sketches provide accuracy for "hard" functions.
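To make the notions of "compact" and "composable" concrete, the following is an illustrative Count-Min sketch (not the paper's construction): a small table that estimates item frequencies and supports cell-wise merging of sketches built over different streams. The width and depth values are arbitrary demo choices:

```python
import hashlib

class CountMinSketch:
    """A compact, composable frequency-estimation sketch."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent-ish hash function per row, derived from blake2b.
        h = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(h, "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Never underestimates; overestimates only through hash collisions.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

    def merge(self, other):
        # Composability: sketches of two streams add cell-wise to give a
        # sketch of the concatenated stream.
        assert (self.width, self.depth) == (other.width, other.depth)
        for r in range(self.depth):
            for c in range(self.width):
                self.table[r][c] += other.table[r][c]

s1, s2 = CountMinSketch(), CountMinSketch()
s1.add("heavy", 100)
s2.add("heavy", 50)
s2.add("rare")
s1.merge(s2)
print(s1.estimate("heavy"))  # at least 150; exact absent collisions
```

Structures like this take polylogarithmic space for frequency estimation; the abstract's point is that for "hard" functions (thresholds, p-th moments with p > 2), comparable compactness is provably impossible in the worst case, yet achievable under noisy advice or Zipfian-like input distributions.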
