User dissatisfaction due to buffering pauses during streaming is a significant cost to the system, which we model as a non-decreasing function of the frequency of buffering pause. Minimization of total user dissatisfaction in a multi-channel cellular network leads to a non-convex problem. Utilizing a combinatorial structure in this problem, we first propose a polynomial time joint admission control and channel allocation algorithm which is provably (almost) optimal. This scheme assumes that the base station (BS) knows the frame statistics of the streams. In a more practical setting, where these statistics are not available a priori at the BS, a learning based scheme with provable guarantees is developed. This learning based scheme has relation to regret minimization in multi-armed bandits with non-i.i.d. and delayed reward (cost). All these algorithms require none to minimal feedback from the user equipment to the base station regarding the states of the media player buffer at the application layer, and hence, are of practical interest.
翻译:用户在流流过程中缓冲暂停的不满对系统来说是一个巨大的成本, 我们将其建模为缓冲暂停频率的非降序功能。 在多通道蜂窝网络中最大限度地减少用户的不满导致一个非凝固问题。 在这个问题中使用组合结构时, 我们首先提出一个可以( 几乎) 最佳的多米时间联合进入控制和频道分配算法。 这个方案假设基地站( BS) 了解流流的框架统计。 在更实际的环境下, 在 BS 上没有这些统计数据的先验功能, 开发了一个基于可实现的保证的学习计划。 这个基于学习的计划关系到对非i. id. 和延迟奖励( 成本) 的多武装匪徒进行最小化的遗憾。 所有这些算法都不需要用户设备向基地站提供最小的关于应用层媒体玩家缓冲状态的反馈, 因此, 具有实际意义。