Spectrum sharing among users is a fundamental problem in the management of any wireless network. In this paper, we discuss the problem of distributed spectrum collaboration without central management under general unknown channels. Since the cost of communication, coordination, and control rises rapidly with the number of devices and the expanding bandwidth in use, there is a clear need for distributed spectrum-collaboration techniques that require no explicit signaling. We combine game-theoretic insights with deep Q-learning to provide a novel asymptotically optimal solution to the spectrum collaboration problem. We propose a deterministic distributed deep reinforcement learning (D3RL) mechanism based on a deep Q-network (DQN). It chooses channels using both the Q-values and the channel loads: each user's options are restricted to the few channels with the highest Q-values, and among those the least loaded channel is selected. Using insights from both game theory and combinatorial optimization, we show that this technique is asymptotically optimal for large overloaded networks. The selected channel and the outcome of the transmission are fed back into the training of the deep Q-network and incorporated into the learned Q-values. We also analyze the performance of D3RL to understand its behavior in different network settings.
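To make the selection rule concrete, the following is a minimal sketch (not the authors' implementation) of the D3RL channel-selection step described above: a user restricts attention to the k channels with the highest Q-values and transmits on the least loaded of those. The function name, the parameter k, and the array-based representation of Q-values and loads are illustrative assumptions.

```python
# Minimal sketch of the D3RL channel-selection rule: keep the k channels
# with the highest Q-values, then pick the least loaded among them.
# `select_channel`, `k`, and the array formats are illustrative assumptions,
# not the paper's actual interface.
import numpy as np

def select_channel(q_values: np.ndarray, loads: np.ndarray, k: int) -> int:
    """Return the least-loaded channel among the k channels with highest Q-values."""
    top_k = np.argpartition(q_values, -k)[-k:]   # indices of the k largest Q-values
    return int(top_k[np.argmin(loads[top_k])])   # least loaded within that subset

# Example: 6 channels, user keeps k = 3 candidates.
q = np.array([0.9, 0.2, 0.8, 0.7, 0.1, 0.85])   # Q-values from the DQN
load = np.array([3, 1, 0, 2, 5, 4])             # observed users per channel
print(select_channel(q, load, k=3))             # -> 2: top-Q set is {0, 2, 5}, channel 2 is empty
```

In the full mechanism, the chosen channel and the outcome of the transmission attempt (e.g., an acknowledgment) would then serve as the experience fed back into the DQN that produces the Q-values.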