Anonymously Analyzing Clinical Datasets

This paper takes on the problem of automatically identifying clinically relevant patterns in medical datasets without compromising patient privacy. To achieve this goal, we treat datasets as a black box for both internal and external users of the data, which lets us handle clinical data queries directly and far more efficiently. The novelty of the approach lies in avoiding the data de-identification process often used as a means of preserving patient privacy. The implemented toolkit combines software engineering technologies such as Java EE and RESTful web services to allow medical data to be exchanged in an unidentifiable XML format and to restrict users according to the need-to-know principle. Our technique also inhibits retrospective processing of data, such as attacks by an adversary on a medical dataset using advanced computational methods to reveal Protected Health Information (PHI). The approach is validated on an endoscopic reporting application based on the openEHR and MST standards. From the usability perspective, the approach can be used by clinical researchers and by governmental or non-governmental organizations to query datasets when monitoring health care services to improve quality of care.

Related content

Dataset

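A dataset (also written "data set") is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to one member of the dataset. A row lists the value of each variable for that member, such as the height and weight of an object. Each individual value is called a datum. A dataset may contain data for one or more members, matching the number of rows.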

Causal Graphs, 52-page slide deck

Zhuanzhi Member Service

253+ reads · April 19, 2020

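A Study of Public Anxiety in Issue Communities on Social Networks, by Lecturer Ta Na, School of Journalism and Communication, Renmin University of China, at the 8th National Conference on Social Media Processing (SMP 2019)

Zhuanzhi Member Service

15+ reads · October 23, 2019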

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Zhuanzhi Member Service

49+ reads · October 17, 2019

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Zhuanzhi Member Service

59+ reads · October 17, 2019