In this paper, we propose a novel deep reinforcement learning framework to maximize user fairness in terms of delay. To this end, we devise a new version of the modified largest weighted delay first (M-LWDF) algorithm, called $\beta$-M-LWDF, which aims to strike an appropriate balance between user fairness and average delay. This balance is defined as a feasible region on the cumulative distribution function (CDF) of the user delay, which allows unfair, feasible-fair, and over-fair states to be identified. Simulation results show that the proposed framework outperforms traditional resource allocation techniques in terms of latency fairness and average delay.
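For readers unfamiliar with the baseline, the following is a minimal sketch of the classical M-LWDF priority rule as it is commonly stated in the scheduling literature, not the $\beta$-M-LWDF variant proposed here, whose definition is not given in this abstract. The function names, parameter names, and example values are illustrative assumptions.

```python
import math

def mlwdf_priority(hol_delay, inst_rate, avg_rate, delay_budget, drop_prob):
    """Classical M-LWDF metric (assumed textbook form, not the paper's variant).

    hol_delay    : head-of-line packet delay of the user (s)
    inst_rate    : achievable instantaneous rate on the resource (bit/s)
    avg_rate     : past average throughput of the user (bit/s)
    delay_budget : delay threshold tau_i (s)
    drop_prob    : tolerated probability delta_i of exceeding tau_i
    """
    a = -math.log(drop_prob) / delay_budget  # QoS weight a_i = -log(delta_i) / tau_i
    return a * hol_delay * inst_rate / avg_rate

def pick_user(users):
    """Return the index of the user with the highest M-LWDF priority."""
    return max(range(len(users)), key=lambda i: mlwdf_priority(**users[i]))

# Hypothetical example: two users that differ only in head-of-line delay.
users = [
    dict(hol_delay=0.01, inst_rate=1e6, avg_rate=1e6, delay_budget=0.1, drop_prob=0.05),
    dict(hol_delay=0.05, inst_rate=1e6, avg_rate=1e6, delay_budget=0.1, drop_prob=0.05),
]
selected = pick_user(users)
```

With all other quantities equal, the rule favors the user whose head-of-line packet has waited longest, which is the delay-awareness that the proposed variant builds on.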