A prominent approach to building datasets for training task-oriented bots is crowd-based paraphrasing. Current approaches, however, either assume that the crowd will naturally provide diverse paraphrases or focus only on lexical diversity. In this work-in-progress (WiP) paper, we address an overlooked aspect of diversity by introducing an approach for guiding the crowdsourcing process towards syntactically diverse paraphrases.