In this paper, we present a deep graph reinforcement learning model to predict and improve the user experience during a live video streaming event, orchestrated by an agent/tracker. We first formulate the user experience prediction problem as a classification task, accounting for the fact that most viewers have a poor quality of experience at the beginning of an event due to low-bandwidth connections and limited interactions with the tracker. Our model considers different factors that influence the quality of user experience and is trained on diverse state-action transitions that occur when viewers interact with the tracker. In addition, given that past events have varying user experience characteristics, we follow a gradient boosting strategy to compute a global model that learns from different events. Our experiments with three real-world datasets of live video streaming events demonstrate the superiority of the proposed model over several baseline strategies. Moreover, as the majority of viewers have a poor experience at the beginning of an event, we show that our model can significantly increase the number of viewers with a high-quality experience by at least 75% over the first streaming minutes. Our evaluation datasets and implementation are publicly available at https://publicresearch.z13.web.core.windows.net