Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement into a "scanner" of the 3D world. Intuitively, human movement indicates the free space in a room, while human contact indicates surfaces or objects that support activities such as sitting, lying, or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), a generative model of indoor scenes that produces furniture layouts consistent with the human movement. MIME uses an auto-regressive transformer architecture that takes the already generated objects in the scene as well as the human motion as input, and outputs the next plausible object. To train MIME, we build a dataset by populating the 3D-FRONT scene dataset with 3D humans. Our experiments show that MIME produces more diverse and plausible 3D scenes than a recent generative scene method that does not know about human movement. Code and data will be available for research at https://mime.is.tue.mpg.de.
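The auto-regressive generation described above can be illustrated with a minimal, hypothetical sketch: objects are predicted one at a time, each conditioned on a feature vector pooled from human motion and on the objects already placed. Everything below (function names, dimensions, the pooling scheme, the linear scoring stand-in for the transformer) is an illustrative assumption, not the authors' MIME implementation.

```python
import numpy as np

# Hypothetical sketch of autoregressive, motion-conditioned scene
# generation. NOT the actual MIME architecture: a linear scorer
# stands in for the transformer.
rng = np.random.default_rng(0)

D = 8  # assumed feature dimension


def embed_motion(contact_features):
    # Pool per-frame human contact / free-space cues into one
    # conditioning vector (mean pooling, an illustrative choice).
    return contact_features.mean(axis=0)


def next_object_logits(motion_feat, placed_objects, W):
    # Stand-in for the transformer: combine the motion conditioning
    # with the already-placed object embeddings, then score each
    # candidate object category.
    if placed_objects:
        context = motion_feat + np.mean(placed_objects, axis=0)
    else:
        context = motion_feat
    return W @ context


def generate_scene(motion_feat, W, categories, max_objects=3):
    # Autoregressive loop: each step conditions on all prior objects.
    placed, labels = [], []
    for _ in range(max_objects):
        logits = next_object_logits(motion_feat, placed, W)
        k = int(np.argmax(logits))
        labels.append(categories[k])
        placed.append(rng.normal(size=D))  # placeholder object embedding
    return labels


categories = ["chair", "sofa", "table", "bed"]
W = rng.normal(size=(len(categories), D))
motion = rng.normal(size=(5, D))  # fake contact features from motion
scene = generate_scene(embed_motion(motion), W, categories)
print(scene)
```

In the real model, each output step would also include the object's pose and size, and a learned stopping criterion would end generation; this sketch only shows the conditioning structure.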