This paper studies the problem of fixing malfunctional 3D objects. Whereas previous work focuses on building passive perception models that learn functionality from static 3D objects, we argue that functionality is defined with respect to the physical interactions between the object and its user. Given a malfunctional object, humans can run mental simulations to reason about its functionality and figure out how to fix it. Inspired by this, we propose FixIt, a dataset of about 5k poorly designed 3D physical objects, each paired with candidate fixes to choose from. To mimic humans' mental simulation process, we present FixNet, a novel framework that integrates perception and physical dynamics. Specifically, FixNet consists of a perception module that extracts a structured representation from the 3D point cloud, a physical dynamics prediction module that simulates the results of interactions on 3D objects, and a functionality prediction module that evaluates functionality and chooses the correct fix. Experimental results show that our framework outperforms baseline models by a large margin and generalizes well to objects with similar interaction types.
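As a rough illustration of how the three modules described above could be chained to pick a fix, the following Python sketch scores each candidate fix by passing the modified point cloud through perception, a dynamics rollout, and a functionality score. Everything here is an assumption made for illustration: the names `perceive`, `simulate`, `functionality`, `choose_fix`, and `add_leg`, the `Structure` summary, and the toy tipping model are hypothetical hand-written stand-ins, not FixNet's modules or the FixIt data format.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
Cloud = List[Point]
Fix = Callable[[Cloud], Cloud]


@dataclass
class Structure:
    """Toy structured representation: centre of mass and ground-contact span along x."""
    com_x: float
    support_min_x: float
    support_max_x: float


def perceive(cloud: Cloud) -> Structure:
    # Hypothetical stand-in for a perception module: summarise the point cloud.
    ground_z = min(p[2] for p in cloud)
    contact_x = [p[0] for p in cloud if p[2] - ground_z < 1e-6]
    com_x = sum(p[0] for p in cloud) / len(cloud)
    return Structure(com_x, min(contact_x), max(contact_x))


def simulate(s: Structure, steps: int = 10) -> float:
    # Hypothetical stand-in for a dynamics module: roll out a tilt value.
    # The object tips only if its centre of mass lies outside the support span.
    overhang = max(s.support_min_x - s.com_x, s.com_x - s.support_max_x, 0.0)
    tilt = 0.0
    for _ in range(steps):
        tilt += 0.1 * overhang
    return tilt


def functionality(tilt: float) -> float:
    # Hypothetical stand-in for functionality prediction: 1.0 means fully stable.
    return 1.0 / (1.0 + tilt)


def choose_fix(cloud: Cloud, fixes: Sequence[Fix]) -> int:
    # Score every candidate fix and return the index of the best one.
    scores = [functionality(simulate(perceive(fix(cloud)))) for fix in fixes]
    return max(range(len(scores)), key=scores.__getitem__)


def add_leg(x: float) -> Fix:
    # Candidate fix: add a two-point vertical leg at position x.
    return lambda cloud: cloud + [(x, 0.0, 0.0), (x, 0.0, 1.0)]


# A shelf with a single leg at x=0 and a top spanning x=0..4, so it tips over.
shelf: Cloud = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)] + [(float(x), 0.0, 2.0) for x in range(5)]
fixes = [add_leg(1.0), add_leg(4.0), lambda cloud: cloud]
best = choose_fix(shelf, fixes)
```

In this toy case `best` is the index of the leg added at `x = 4.0`, the only candidate that brings the centre of mass inside the support span. A learned system would replace each hand-written function with a trained model, but the selection loop over candidate fixes would have the same shape.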