We describe a parallel variant of the Tower of Hanoi Puzzle. Within this parallel setting, we state two theorems on minimal walks in the state space of configurations and give constructive proofs of both. These proofs are used to describe a {\sl denoising method}: a method for identifying and eliminating sub-optimal transfers within an arbitrary sequence of disk configurations that is valid under the rules of the Puzzle. We discuss potential applications of this method to hierarchical reinforcement learning.