No Intruder, no Validity: Evaluation Criteria for Privacy-Preserving Text Anonymization

For sensitive text data to be shared among NLP researchers and practitioners, shared documents need to comply with data protection and privacy laws. There is hence a growing interest in automated approaches for text anonymization. However, measuring such methods' performance is challenging: missing a single identifying attribute can reveal an individual's identity. In this paper, we draw attention to this problem and argue that researchers and practitioners developing automated text anonymization systems should carefully assess whether their evaluation methods truly reflect the system's ability to protect individuals from being re-identified. We then propose TILD, a set of evaluation criteria that comprises an anonymization method's technical performance, the information loss resulting from its anonymization, and the human ability to de-anonymize redacted documents. These criteria may facilitate progress towards a standardized way of measuring anonymization performance.
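The point that a single missed identifier can undermine an otherwise strong score can be made concrete with a small sketch. The code below is not the TILD criteria and is not taken from the paper; the function names, the toy documents, and the two metrics (mention-level recall versus the share of documents with every identifier masked) are assumptions chosen only for illustration.

```python
from typing import List, Set


def mention_recall(gold: List[Set[str]], masked: List[Set[str]]) -> float:
    """Fraction of all gold identifier mentions that the system masked."""
    total = sum(len(g) for g in gold)
    hits = sum(len(g & m) for g, m in zip(gold, masked))
    return hits / total if total else 1.0


def fully_masked_rate(gold: List[Set[str]], masked: List[Set[str]]) -> float:
    """Fraction of documents in which every gold identifier was masked."""
    if not gold:
        return 1.0
    ok = sum(1 for g, m in zip(gold, masked) if g <= m)
    return ok / len(gold)


# Hypothetical toy data: one set of identifying attributes per document.
gold = [
    {"Alice Smith", "Berlin", "1984-03-02"},
    {"Bob Lee", "Oslo"},
    {"Carol Diaz", "Lima", "nurse"},
]
# The hypothetical system misses only the birth date in the first document.
masked = [
    {"Alice Smith", "Berlin"},
    {"Bob Lee", "Oslo"},
    {"Carol Diaz", "Lima", "nurse"},
]

print(mention_recall(gold, masked))     # 7 of 8 mentions masked
print(fully_masked_rate(gold, masked))  # 2 of 3 documents fully masked
```

In this toy case the mention-level recall looks high while a third of the documents still expose an identifying attribute, which is the kind of gap between a technical score and actual re-identification risk that the abstract warns about.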

