Personalized next-best action recommendation with multi-party interaction learning for automated decision-making

Automated next-best action recommendation for each customer in a sequential, dynamic and interactive context is widely needed in natural, social and business decision-making. Personalized next-best action recommendation must account for a customer's past, current and future demographics, circumstances (states) and behaviors; long-range sequential interactions between customers and decision-makers; multi-sequence interactions between states, behaviors and actions; and each party's reactions to its counterpart's actions. No existing modeling theories and tools, including Markov decision processes, user and behavior modeling, deep sequential modeling, and personalized sequential recommendation, can quantify such complex decision-making at the personal level. We take a data-driven approach and learn next-best actions for personalized decision-making with a reinforced coupled recurrent neural network (CRN). CRN represents multiple coupled dynamic sequences of a customer's historical and current states, responses to decision-makers' actions, and decision rewards for those actions, and it learns long-term multi-sequence interactions between the parties (customer and decision-maker). Next-best actions are then recommended for each customer at a given time point to change their state toward an optimal decision-making objective. Our study demonstrates the potential of personalized deep learning of multi-sequence interactions and automated dynamic intervention for personalized decision-making in complex systems.
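
The idea of coupling several per-customer sequences can be illustrated with a small toy model. The sketch below is not the paper's CRN. It assumes three hypothetical sequences (`state`, `response`, `action`), gives each its own hidden vector, and lets every hidden vector be updated from its own input plus the previous hidden vectors of all sequences, which is one simple way to express cross-sequence coupling. The class name `CoupledRecurrentSketch`, the dimensions, and the random untrained weights are all assumptions made for illustration. The reward-driven (reinforced) training that the abstract mentions is omitted entirely, so the recommended action is meaningless beyond showing the data flow.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class CoupledRecurrentSketch:
    """Toy coupled recurrent model (hypothetical, not the paper's CRN)."""

    def __init__(self, input_dims, hidden=4, n_actions=3, seed=0):
        rng = random.Random(seed)
        self.names = list(input_dims)          # one recurrent stream per sequence
        self.hidden = hidden
        total = hidden * len(self.names)       # size of the concatenated hidden state
        # Per-sequence input weights.
        self.W_in = {k: rand_matrix(hidden, d, rng) for k, d in input_dims.items()}
        # Coupling weights: each stream reads the hidden states of ALL streams.
        self.W_h = {k: rand_matrix(hidden, total, rng) for k in self.names}
        # Linear head that scores each candidate action.
        self.W_out = rand_matrix(n_actions, total, rng)

    def _joint(self, h):
        return [v for k in self.names for v in h[k]]

    def step(self, h, x):
        """Advance every stream by one time step using the joint hidden state."""
        joint = self._joint(h)
        new_h = {}
        for k in self.names:
            own = matvec(self.W_in[k], x[k])
            coupled = matvec(self.W_h[k], joint)
            new_h[k] = [math.tanh(a + b) for a, b in zip(own, coupled)]
        return new_h

    def recommend(self, history):
        """Run over a customer's history and return (best action index, scores)."""
        h = {k: [0.0] * self.hidden for k in self.names}
        for x in history:
            h = self.step(h, x)
        scores = matvec(self.W_out, self._joint(h))
        best = max(range(len(scores)), key=scores.__getitem__)
        return best, scores


# Usage with made-up data: two time steps for one customer.
model = CoupledRecurrentSketch({"state": 2, "response": 1, "action": 3})
history = [
    {"state": [0.5, 1.0], "response": [0.0], "action": [1.0, 0.0, 0.0]},
    {"state": [0.4, 0.9], "response": [1.0], "action": [0.0, 1.0, 0.0]},
]
action, scores = model.recommend(history)
print("recommended action index:", action)
```

In this sketch the coupling lives in `W_h`: because each stream reads the concatenated hidden state, a customer's response at one step can influence how the state and action streams evolve at the next. A learned model would replace the random weights with parameters optimized against a decision reward.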
