In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency. The teacher adjusts her teaching method for different students, and the student, after becoming familiar with the teacher's instruction mechanism, can infer the teacher's intention and thereby learn faster. Recently, multiple works have demonstrated the benefits of integrating this cooperative pedagogy into machine concept learning in discrete spaces. However, how cooperative pedagogy can facilitate machine parameter learning has not been thoroughly studied. In this paper, we propose a gradient-optimization-based teacher-aware learner that incorporates the teacher's cooperative intention into the likelihood function and learns provably faster than the naive learning algorithms used in previous machine teaching works. We prove that the iterative teacher-aware learning (ITAL) process leads to local and global improvements. We then validate our algorithms with extensive experiments on various tasks, including regression, classification, and inverse reinforcement learning, using synthetic and real data. We also show the advantage of modeling teacher-awareness when agents learn from human teachers.