We present a novel bilateral negotiation model that allows a self-interested agent to learn how to negotiate over multiple issues in the presence of user preference uncertainty. The model relies on interpretable strategy templates that represent the tactics the agent should employ during the negotiation, and it learns the template parameters to maximize the average utility received over multiple negotiations, thus resulting in optimal bid acceptance and generation. Our model also uses deep reinforcement learning to evaluate threshold utility values for those tactics that require them, thereby deriving optimal utilities for every environment state. To handle user preference uncertainty, the model relies on a stochastic search to find the user model that best agrees with a given partial preference profile. Multi-objective optimization and multi-criteria decision-making methods are applied at negotiation time to generate Pareto-optimal outcomes, thereby increasing the number of successful (win-win) negotiations. Rigorous experimental evaluations show that the agent employing our model outperforms the winning agents of the 10th Automated Negotiating Agents Competition (ANAC'19) in terms of individual as well as social-welfare utilities.
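To make the idea of Pareto-optimal bid generation concrete, the following is a minimal sketch and not the method used in the paper. It assumes each candidate bid is already scored as a pair (own utility, estimated opponent utility), filters out dominated bids by brute force, and then applies a simple decision rule: among the remaining bids that meet a threshold utility, offer the one most favourable to the opponent. The names `Bid`, `pareto_front`, and `pick_bid`, as well as the decision rule itself, are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (own utility, estimated opponent utility).
Bid = Tuple[float, float]


def pareto_front(bids: List[Bid]) -> List[Bid]:
    """Return the bids that are not dominated by any other bid.

    A bid is dominated if another bid is at least as good on both
    utilities and strictly better on at least one of them.
    """
    front = []
    for i, (u1, v1) in enumerate(bids):
        dominated = any(
            (u2 >= u1 and v2 >= v1) and (u2 > u1 or v2 > v1)
            for j, (u2, v2) in enumerate(bids)
            if j != i
        )
        if not dominated:
            front.append((u1, v1))
    return front


def pick_bid(bids: List[Bid], threshold: float) -> Optional[Bid]:
    """Pick a Pareto-optimal bid whose own utility meets the threshold.

    Assumed decision rule (illustrative only): among acceptable bids,
    choose the one with the highest estimated opponent utility.
    Returns None when no Pareto-optimal bid reaches the threshold.
    """
    candidates = [b for b in pareto_front(bids) if b[0] >= threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda b: b[1])


if __name__ == "__main__":
    example = [(0.9, 0.2), (0.7, 0.6), (0.6, 0.5), (0.4, 0.9), (0.3, 0.3)]
    print(pareto_front(example))
    print(pick_bid(example, threshold=0.65))
```

In this sketch the threshold is a fixed input; in the model described above it would instead come from the learned strategy, and the opponent utility would come from an estimated opponent model rather than being given.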