Semi-Counterfactual Risk Minimization Via Neural Networks

Counterfactual risk minimization is a framework for offline policy optimization from logged data, in which each sample point consists of a context, an action, a propensity score, and a reward. In this work, we build on this framework and propose a learning method for settings where the rewards for some samples are not observed, so that the logged data consists of a subset of samples with unknown rewards and a subset of samples with known rewards. This setting arises in many application domains, including advertising and healthcare. Although reward feedback is missing for some samples, the unknown-reward samples can still be leveraged to minimize the risk; we refer to this setting as semi-counterfactual risk minimization. To approach this learning problem, we derive new upper bounds on the true risk under the inverse propensity score estimator. We then build on these bounds to propose a regularized counterfactual risk minimization method, in which the regularization term is computed from the logged unknown-reward dataset only and is therefore reward-independent. We also propose a second algorithm based on generating pseudo-rewards for the logged unknown-reward dataset. Experimental results with neural networks on benchmark datasets indicate that these algorithms can leverage the logged unknown-reward dataset in addition to the logged known-reward dataset.
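As a rough illustration of the regularized objective described above, the following sketch combines an inverse propensity score (IPS) risk estimate on the known-reward samples with a reward-independent penalty on the unknown-reward samples. Several things here are assumptions and not taken from the abstract: risk is taken to be the expected negative reward, the penalty is a sample-based KL divergence between the logging policy and the learned policy (the abstract does not say which regularizer is used), and the function names and the weight `lam` are hypothetical.

```python
import numpy as np


def ips_risk(target_probs, propensities, rewards):
    """IPS estimate of the risk on known-reward samples.

    Assumption: risk = expected negative reward, so each logged sample
    contributes -reward * pi(a|x) / p0(a|x).
    """
    target_probs = np.asarray(target_probs, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    weights = target_probs / propensities  # importance weights
    return float(np.mean(-rewards * weights))


def kl_penalty(target_probs, propensities):
    """Reward-independent penalty (illustrative choice, not from the abstract).

    Because logged actions are drawn from the logging policy p0, the sample
    mean of log(p0(a|x) / pi(a|x)) estimates KL(p0 || pi). No rewards needed.
    """
    target_probs = np.asarray(target_probs, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    return float(np.mean(np.log(propensities / target_probs)))


def regularized_objective(known, unknown, lam=0.1):
    """IPS risk on known-reward data plus lam * penalty on unknown-reward data.

    known   = (target_probs, propensities, rewards)
    unknown = (target_probs, propensities)
    lam is a hypothetical regularization weight.
    """
    k_target, k_prop, k_reward = known
    u_target, u_prop = unknown
    return ips_risk(k_target, k_prop, k_reward) + lam * kl_penalty(u_target, u_prop)
```

In practice `target_probs` would be the probabilities a neural network policy assigns to the logged actions, and the objective would be minimized by gradient descent over the network parameters; plain arrays are used here only to keep the sketch self-contained.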

