November 2, 2022

An Easy-to-use and Robust Approach for the Differentially Private De-Identification of Clinical Textual Documents

Yakini Tchouka, Jean-François Couchot, David Laiymani

Unstructured textual data is at the heart of healthcare systems. For obvious privacy reasons, these documents are not accessible to researchers as long as they contain personally identifiable information. One way to share this data while respecting the legislative framework (notably GDPR or HIPAA) is to de-identify it within the medical structures, i.e., to detect a person's personal information with a Named Entity Recognition (NER) system and then replace it so that it becomes very difficult to associate the document with that person. The challenge is to have reliable NER and substitution tools without compromising confidentiality and consistency in the document. Most existing research focuses on English medical documents and uses coarse substitutions that do not benefit from advances in privacy. This paper shows how an efficient and differentially private de-identification approach can be achieved by strengthening the less robust de-identification method and by adapting state-of-the-art differentially private mechanisms for substitution purposes. The result is an approach for de-identifying clinical documents in French that also generalizes to other languages and whose robustness is mathematically proven.
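The detect-then-substitute idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a toy gazetteer lookup in place of a trained NER model, and it uses k-ary randomized response, a standard mechanism that satisfies ε-local differential privacy when the input and output domains coincide. The city list, function names, and parameter values are all hypothetical.

```python
import math
import random
import re

# Hypothetical closed domain of city names. It serves as both the input
# and the output domain of the mechanism (an assumption of this sketch).
CITIES = ["Besançon", "Dijon", "Lyon", "Paris", "Strasbourg"]

# Toy stand-in for a NER system: a gazetteer lookup, not a trained model.
CITY_PATTERN = re.compile("|".join(re.escape(c) for c in CITIES))


def keep_probability(epsilon, k):
    """Probability that k-ary randomized response reports the true value."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def randomized_response(value, domain, epsilon, rng):
    """Report `value` with probability e^eps / (e^eps + k - 1); otherwise
    report one of the other k - 1 domain values uniformly at random."""
    if rng.random() < keep_probability(epsilon, len(domain)):
        return value
    return rng.choice([v for v in domain if v != value])


def deidentify(text, epsilon, rng):
    """Replace every detected city. A per-document mapping ensures that
    repeated mentions of one city receive the same surrogate."""
    mapping = {}

    def substitute(match):
        original = match.group(0)
        if original not in mapping:
            mapping[original] = randomized_response(original, CITIES, epsilon, rng)
        return mapping[original]

    return CITY_PATTERN.sub(substitute, text)


if __name__ == "__main__":
    note = "Patient admis à Lyon, puis transféré de Lyon à Dijon."
    print(deidentify(note, epsilon=1.0, rng=random.Random(0)))
```

A smaller ε pushes the surrogate toward a uniform draw over the domain (stronger privacy, less fidelity), while a larger ε keeps the original value more often. The per-document mapping reflects the consistency requirement mentioned in the abstract.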
