In this work, we investigate the problem of revealing the functionality of a black-box agent. In particular, we are interested in an interpretable and formal description of the behavior of such an agent. Ideally, this description would take the form of a program written in a high-level language. This task is also known as reverse engineering and plays a pivotal role in software engineering, computer security, and, most recently, interpretability. In contrast to prior work, we do not rely on privileged information about the black box, but rather investigate the problem under the weaker assumption of having access only to the inputs and outputs of the program. We approach this problem by iteratively refining a candidate set using a generative neural program synthesis approach until we arrive at a functionally equivalent program. We assess the performance of our approach on the Karel dataset. Our results show that the proposed approach outperforms the state of the art on this challenge by finding an approximately functionally equivalent program in 78% of cases, even exceeding prior work that had privileged information about the black box.
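To make the refinement loop concrete, the following is a minimal sketch of the general idea and not the method evaluated in this work. It assumes a toy integer DSL instead of Karel, and it replaces the generative neural synthesizer with a random proposer that mutates surviving candidates. All names (`PRIMITIVES`, `propose`, `score`, `reverse_engineer`) are hypothetical and chosen for illustration only.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical toy DSL (an assumption, not Karel): a program is a
# sequence of primitive names applied left to right to an integer.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "neg": lambda x: -x,
}
Program = Tuple[str, ...]


def run(program: Program, x: int) -> int:
    """Execute a candidate program on a single input."""
    for op in program:
        x = PRIMITIVES[op](x)
    return x


def score(program: Program, io_pairs: List[Tuple[int, int]]) -> float:
    """Fraction of observed input/output pairs the candidate reproduces."""
    return sum(run(program, x) == y for x, y in io_pairs) / len(io_pairs)


def propose(seeds: List[Program], rng: random.Random, n: int,
            max_len: int = 4) -> List[Program]:
    """Stand-in for the generative model (assumption): either mutate one
    operation of a surviving candidate or sample a fresh program."""
    names = sorted(PRIMITIVES)
    proposals = []
    for _ in range(n):
        if seeds and rng.random() < 0.5:
            prog = list(rng.choice(seeds))
            prog[rng.randrange(len(prog))] = rng.choice(names)
        else:
            prog = [rng.choice(names) for _ in range(rng.randint(1, max_len))]
        proposals.append(tuple(prog))
    return proposals


def reverse_engineer(black_box: Callable[[int], int], rounds: int = 50,
                     keep: int = 20, seed: int = 0) -> Program:
    """Refine a candidate set using only input/output access to black_box."""
    rng = random.Random(seed)
    io_pairs: List[Tuple[int, int]] = []
    candidates: List[Program] = []
    for _ in range(rounds):
        # Query the black box on five distinct probe inputs.
        for x in rng.sample(range(-10, 11), 5):
            io_pairs.append((x, black_box(x)))
        # Grow the candidate set, then keep the most consistent programs.
        candidates = list(set(candidates + propose(candidates, rng, 200)))
        candidates.sort(key=lambda p: (-score(p, io_pairs), len(p), p))
        candidates = candidates[:keep]
        if score(candidates[0], io_pairs) == 1.0:
            return candidates[0]  # consistent with every observation so far
    return candidates[0]  # best approximate match found


if __name__ == "__main__":
    hidden = lambda x: 2 * (x + 1)  # the "black box" in this toy setup
    print(reverse_engineer(hidden))
```

In this sketch, consistency is checked only against the queried inputs, so the returned program is equivalent to the black box on the observations rather than provably equivalent in general. A learned generator conditioned on the input/output pairs would take the place of `propose`.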