A tether-net launched from a chaser spacecraft provides a promising method to capture and dispose of large space debris in orbit. This tether-net system is subject to several sources of uncertainty in sensing and actuation that affect the performance of its net launch and closing control. Earlier reliability-based optimization approaches for designing control actions, however, remain computationally prohibitive and difficult to generalize over varying launch scenarios and target (debris) states relative to the chaser. To search for a general and reliable control policy, this paper presents a reinforcement learning framework that integrates a proximal policy optimization (PPO2) approach with net dynamics simulations. The latter is used to evaluate episodes of net-based target capture and to estimate the capture quality index that serves as the reward feedback to PPO2. Here, the learned policy models the timing of the net closing action based on the state of the moving net and the target, under any given launch scenario. A stochastic state transition model is considered in order to incorporate synthetic uncertainties in state estimation and launch actuation. Along with a notable reward improvement during training, the trained policy demonstrates capture performance, over a wide range of launch/target scenarios, that is close to that obtained with reliability-based optimization run over an individual scenario.
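For concreteness, the following is a minimal sketch of the training setup described above, assuming a Gym-style wrapper around a net dynamics simulation and the PPO2 implementation from stable-baselines. `NetCaptureEnv` and all of its internals are hypothetical placeholders standing in for the actual simulator, not the authors' implementation.

```python
# Minimal sketch: PPO2 trained on a placeholder net-capture environment.
# Assumptions: observation = noisy 12-dimensional net/target state estimate,
# action = binary decision per step (wait vs. trigger net closing),
# reward = capture quality index evaluated at the end of each episode.
import gym
import numpy as np
from gym import spaces
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy


class NetCaptureEnv(gym.Env):
    """Hypothetical net-capture environment (placeholder dynamics)."""

    def __init__(self, horizon=200):
        self.horizon = horizon
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(12,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)  # 0 = wait, 1 = close the net

    def reset(self):
        self._t = 0
        # Stochastic launch actuation: perturb a nominal launch state.
        self._state = np.random.normal(0.0, 0.1, size=12)
        return self._observe()

    def step(self, action):
        self._t += 1
        if action == 1 or self._t >= self.horizon:
            # Episode ends at closure (or timeout); the terminal reward
            # stands in for the capture quality index of the episode.
            return self._observe(), self._capture_quality_index(), True, {}
        # Placeholder for one step of the net dynamics simulation.
        self._state = self._state + np.random.normal(0.0, 0.01, size=12)
        return self._observe(), 0.0, False, {}

    def _observe(self):
        # Synthetic state-estimation noise on top of the "true" state.
        return (self._state + np.random.normal(0.0, 0.01, size=12)).astype(np.float32)

    def _capture_quality_index(self):
        # Dummy stand-in; the paper derives this from the simulated capture.
        return float(-np.linalg.norm(self._state))


env = NetCaptureEnv()
model = PPO2(MlpPolicy, env, verbose=1)  # stable-baselines auto-wraps the env in a DummyVecEnv
model.learn(total_timesteps=100000)
```

Note that the reward in this sketch is sparse and episode-terminal, which mirrors the abstract's description of the capture quality index as feedback evaluated once per capture episode.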