A deep reinforcement learning approach is applied, for the first time, to solve the routing, modulation, spectrum and core allocation (RMSCA) problem in dynamic multicore fiber elastic optical networks (MCF-EONs). To do so, a new environment, compatible with OpenAI's Gym, was designed and implemented to emulate the operation of MCF-EONs. The new environment processes the agent actions (selection of route, core and spectrum slot) by considering the network state and physical-layer-related aspects. The latter include the available modulation formats and their reach, as well as the inter-core crosstalk (XT), an MCF-related impairment. If the resulting signal quality is acceptable, the environment allocates the resources selected by the agent. After processing the agent's action, the environment returns to the agent a numerical reward and information about the new network state. The blocking performance of four different agents was compared through simulation to three baseline heuristics used in MCF-EONs. Results obtained for the NSFNet and COST239 network topologies show that the best-performing agent achieves, on average, up to a four-fold decrease in blocking probability with respect to the best-performing baseline heuristic.
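To illustrate the agent-environment loop described above, the following is a minimal sketch, not the authors' implementation, of a Gym-compatible environment in which the action is a (route, core, spectrum slot) triple and the reward reflects whether the request could be allocated. All names, dimensions, and the simplified feasibility check are hypothetical; the real environment additionally evaluates modulation reach and inter-core crosstalk (XT) before accepting an allocation.

```python
# Hypothetical sketch of a Gym-compatible RMSCA environment (not the paper's code).
import numpy as np
import gym
from gym import spaces


class ToyRMSCAEnv(gym.Env):
    """Agent picks (route, core, starting slot); the environment accepts the
    allocation only if a simplified feasibility check passes."""

    def __init__(self, n_routes=3, n_cores=7, n_slots=320):
        super().__init__()
        self.n_routes, self.n_cores, self.n_slots = n_routes, n_cores, n_slots
        # Action: joint choice of candidate route, core, and starting spectrum slot.
        self.action_space = spaces.MultiDiscrete([n_routes, n_cores, n_slots])
        # Observation: occupancy of every (core, slot) pair plus the demand size.
        self.observation_space = spaces.Box(
            low=0.0, high=1.0, shape=(n_cores * n_slots + 1,), dtype=np.float32)
        self.reset()

    def _observation(self):
        return np.concatenate(
            [self.occupancy.flatten(), [self.demand_slots / 10.0]]).astype(np.float32)

    def reset(self):
        self.occupancy = np.zeros((self.n_cores, self.n_slots), dtype=np.float32)
        self.demand_slots = np.random.randint(1, 11)  # slots needed by the next request
        return self._observation()

    def _allocation_feasible(self, route, core, slot):
        # Placeholder for the physical-layer checks mentioned in the abstract
        # (modulation reach, inter-core crosstalk); here only contiguous free
        # slots on the chosen core are verified.
        end = slot + self.demand_slots
        if end > self.n_slots:
            return False
        return not self.occupancy[core, slot:end].any()

    def step(self, action):
        route, core, slot = action
        if self._allocation_feasible(route, core, slot):
            self.occupancy[core, slot:slot + self.demand_slots] = 1.0
            reward = 1.0   # request served
        else:
            reward = -1.0  # request blocked
        self.demand_slots = np.random.randint(1, 11)  # next incoming request
        done = False  # dynamic traffic; episodes are truncated externally
        return self._observation(), reward, done, {}
```

Such an environment can be driven by any standard deep reinforcement learning agent that supports multi-discrete action spaces; the blocking probability is then estimated from the fraction of negative-reward steps over a long simulation run.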