We present a unified formalism for structure discovery of causal models and predictive state representation (PSR) models in reinforcement learning (RL) using higher-order category theory. Specifically, we model structure discovery in both settings using simplicial objects, which are contravariant functors from the category of ordinal numbers into any category. Fragments of causal models that are equivalent under conditional independence -- defined as causal horns -- as well as subsequences of potential tests in a predictive state representation -- defined as predictive horns -- are both special cases of horns of a simplicial object, that is, subsets of a simplex obtained by removing its interior and the face opposite a particular vertex. Latent structure discovery in both settings involves the same fundamental mathematical problem: finding extensions of horns of simplicial objects by solving lifting problems in commutative diagrams, and exploiting weak homotopies that define higher-order symmetries. Solutions to the problem of filling "inner" vs "outer" horns lead to various notions of higher-order categories, including weak Kan complexes and quasicategories. We define the abstract problem of structure discovery in both settings in terms of adjoint functors between the category of universal causal models or universal decision models and its simplicial object representation.
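For readers less familiar with the terminology, the horn-filling problem mentioned above can be written out in the standard simplicial-set notation. This is a sketch of the textbook definitions and is not taken from the paper's own development: the symbols $X$, $\partial_i\Delta^n$, $\iota$, $\sigma_0$ and $\sigma$ are notational assumptions introduced here for illustration.

```latex
% Assumed notation: \Delta^n is the standard n-simplex and
% \partial_i \Delta^n is its face opposite vertex i.
% The k-th horn keeps every face except the one opposite vertex k:
\[
  \Lambda^n_k \;=\; \bigcup_{i \neq k} \partial_i \Delta^n
  \;\subset\; \Delta^n, \qquad 0 \le k \le n .
\]
% The horn is "inner" when 0 < k < n and "outer" when k = 0 or k = n.
%
% Horn filling as a lifting problem: with
% \iota : \Lambda^n_k \hookrightarrow \Delta^n the inclusion and
% X a simplicial object,
\[
  \text{given } \sigma_0 : \Lambda^n_k \to X, \quad
  \text{find } \sigma : \Delta^n \to X
  \ \text{ such that } \ \sigma \circ \iota = \sigma_0 .
\]
```

In the standard usage, a simplicial set in which every horn admits such a filler is a Kan complex, and one in which every inner horn admits a filler is a weak Kan complex, also called a quasicategory.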