Reinforcement learning is increasingly being applied to real-world problems, moving from simulated environments to physical setups. In this work, we implement vision-based alignment of an optical Mach-Zehnder interferometer with a confocal telescope in one arm, which controls the diameter and divergence of the corresponding beam. We use a continuous action space; exponential scaling of the actions lets the agent cover a range of more than two orders of magnitude. The agent is trained only in a simulated environment with domain randomization. In an experimental evaluation, the agent significantly outperforms an existing solution and a human expert.
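As an illustration of how exponential scaling can let a bounded continuous action span several orders of magnitude, the following minimal sketch maps a normalized action in [-1, 1] to a signed step size. The function name `scale_action`, the bounds `min_step` and `max_step`, and the specific mapping are assumptions made for this example; the abstract does not give the formula actually used.

```python
import math


def scale_action(a, min_step=1e-3, max_step=0.5):
    """Map a normalized action a in [-1, 1] to a signed step.

    Hypothetical mapping (not taken from the paper): the step magnitude
    grows exponentially from min_step (as |a| approaches 0) to max_step
    (at |a| = 1), so the default bounds span a factor of 500.
    """
    a = max(-1.0, min(1.0, a))  # clip to the valid action range
    if a == 0.0:
        return 0.0  # a zero action means "do not move"
    magnitude = min_step * (max_step / min_step) ** abs(a)
    return math.copysign(magnitude, a)


if __name__ == "__main__":
    for a in (-1.0, -0.5, 0.0, 0.1, 0.5, 1.0):
        print(f"a = {a:+.1f} -> step = {scale_action(a):+.5f}")
```

With such a mapping, equal increments of the network output multiply the step size by a constant factor, so the same policy can issue both coarse and fine adjustments. Note that in this sketch the magnitude jumps from 0 to `min_step` as soon as the action is nonzero.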