Existing work on making privacy policies accessible has explored new presentation forms, such as color-coding by risk factor or summarization, to support users in giving informed agreement. To facilitate a more personalized interaction with policies, we propose an automated privacy policy question answering assistant that extracts a summary in response to a user's query. The task is challenging because users phrase their privacy-related questions in language very different from the legal language of the policy, which makes it difficult for a system to understand the inquiry. Moreover, annotated data in this domain are limited. We address these problems by paraphrasing the user's question to bring its style and language closer to the language of privacy policies. Our content scoring module uses the existing in-domain data to find relevant information in the policy and incorporates it into a summary. Our pipeline finds an answer for 89% of the user queries in the privacyQA dataset.
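The pipeline described above (paraphrase the query, score policy content, extract a summary) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the hand-written `LEXICON` stands in for the paraphrasing step, and TF-IDF cosine similarity stands in for the content scoring module, which in the described system is trained on in-domain data. The function names and the sample policy text are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical lay-term -> policy-term lexicon (assumption: a stand-in
# for a real paraphrasing model, which this sketch does not implement).
LEXICON = {
    "sell": "share",
    "info": "information",
    "track": "collect",
    "companies": "third parties",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def paraphrase(query):
    """Rewrite lay vocabulary into policy-style vocabulary via LEXICON."""
    return " ".join(LEXICON.get(tok, tok) for tok in tokenize(query))


def split_sentences(policy):
    """Naive sentence splitter on terminal punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", policy)
    return [s.strip() for s in parts if s.strip()]


def tfidf_vectors(token_lists):
    """Return one TF-IDF vector (dict) per sentence, plus the IDF table."""
    n = len(token_lists)
    df = Counter(tok for toks in token_lists for tok in set(toks))
    idf = {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}
    vectors = [
        {tok: count * idf[tok] for tok, count in Counter(toks).items()}
        for toks in token_lists
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer(query, policy, top_k=2):
    """Return the top_k highest-scoring policy sentences, in policy order."""
    sentences = split_sentences(policy)
    vectors, idf = tfidf_vectors([tokenize(s) for s in sentences])
    counts = Counter(tokenize(paraphrase(query)))
    # Query terms that never occur in the policy are dropped.
    q_vec = {tok: c * idf[tok] for tok, c in counts.items() if tok in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(i for i in ranked[:top_k] if scores[i] > 0.0)
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    sample_policy = (
        "We collect your location when you use the app. "
        "We share information with third parties for advertising. "
        "You may delete your account at any time."
    )
    print(answer("Do you sell my info to other companies?", sample_policy, top_k=1))
```

In this toy example the raw query shares almost no vocabulary with the policy; after the lexicon rewrite ("sell" to "share", "info" to "information", "companies" to "third parties") it overlaps with the sentence about sharing information with third parties, which is the one selected. A learned paraphraser and a trained scorer would replace both stand-ins in a realistic system.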