Scalable and Sparsity-Aware Privacy-Preserving K-means Clustering with Application to Fraud Detection - 专知论文


August 12, 2022

Scalable and Sparsity-Aware Privacy-Preserving K-means Clustering with Application to Fraud Detection


Yingting Liu, Chaochao Chen, Jamie Cui, Li Wang, Lei Wang

From arXiv, 10 pages, 9 figures

K-means is one of the most widely used clustering models in practice. Because data is isolated across organizations and applications demand high model performance, jointly building a practical and secure K-means model for multiple parties has become an important topic in industry. Existing work falls mainly into two types. The first type is efficient, but it leaks information and therefore raises potential privacy risks. The second type is provably secure, but it is inefficient and can be impractical for large-scale sparse data. In this paper, we propose a new framework for efficient sparsity-aware K-means with three characteristics. First, our framework is divided into a data-independent offline phase and a much faster online phase, and the offline phase allows almost all cryptographic operations to be precomputed. Second, we take advantage of vectorization techniques in both the online and offline phases. Third, we adopt sparse matrix multiplication in the sparse-data scenario to further improve efficiency. We conduct comprehensive experiments on three synthetic datasets and deploy our model in a real-world fraud detection task. Our experimental results show that, compared with the state-of-the-art solution, our model achieves competitive performance in terms of both running time and communication size, especially on sparse datasets.
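To illustrate why sparsity matters for the K-means assignment step, here is a minimal plaintext sketch. It is not the secure protocol from the paper: there is no secret sharing or cryptography here, and the names `SparseRow`, `sparse_dot`, and `assign_clusters` are hypothetical. It assumes the standard expansion ||x - c||^2 = ||x||^2 - 2 x·c + ||c||^2, in which the cross term x·c is the matrix product that a sparse representation can speed up, since only the nonzero entries of x contribute.

```python
from typing import Dict, List

# Hypothetical sparse row format: column index -> nonzero value.
SparseRow = Dict[int, float]


def sparse_dot(row: SparseRow, dense: List[float]) -> float:
    """Dot product that visits only the nonzero entries of the sparse row."""
    return sum(value * dense[col] for col, value in row.items())


def assign_clusters(rows: List[SparseRow], centroids: List[List[float]]) -> List[int]:
    """Return the index of the nearest centroid for each sparse row."""
    # ||c||^2 is computed once per centroid and reused for every row.
    centroid_sq_norms = [sum(c * c for c in centroid) for centroid in centroids]
    labels = []
    for row in rows:
        # ||x||^2 is identical for every centroid, so it cannot change the
        # argmin and is dropped; only ||c||^2 - 2 x.c is compared.
        scores = [
            sq_norm - 2.0 * sparse_dot(row, centroid)
            for centroid, sq_norm in zip(centroids, centroid_sq_norms)
        ]
        labels.append(min(range(len(scores)), key=scores.__getitem__))
    return labels


# Toy data (made up for illustration): three sparse rows, two dense centroids.
rows = [{0: 1.0}, {3: 2.0}, {0: 0.9, 1: 0.1}]
centroids = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0]]
labels = assign_clusters(rows, centroids)
```

The cost of the cross term scales with the number of nonzero entries rather than with the full feature dimension, which is the general reason sparse matrix multiplication can help on sparse datasets. How the paper realizes this under privacy constraints is not shown here.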

