Privacy-Preserving Feature Selection with Fully Homomorphic Encryption

We propose an efficient privacy-preserving algorithm for the feature selection problem. Let $D$, $F$, and $C$ be the data set, the feature set, and the class set, respectively, where the feature value $x(F_i)$ and the class label $x(C)$ are given for each $x\in D$ and $F_i \in F$. For a triple $(D,F,C)$, the feature selection problem is to find a consistent and minimal subset $F' \subseteq F$. Here "consistent" means that, for any $x,y\in D$, $x(C)=y(C)$ whenever $x(F_i)=y(F_i)$ for all $F_i\in F'$, and "minimal" means that no proper subset of $F'$ is consistent. On distributed datasets, we treat feature selection as a privacy-preserving problem: assume that semi-honest parties $\textsf A$ and $\textsf B$ hold their own private datasets $D_{\textsf A}$ and $D_{\textsf B}$. The goal is to solve the feature selection problem for $D_{\textsf A}\cup D_{\textsf B}$ without either party revealing its private data. In this paper, we propose a secure and efficient algorithm based on fully homomorphic encryption, and we implement it to show its effectiveness on various practical data. The proposed algorithm is the first that can directly simulate the CWC (Combination of Weakest Components) algorithm on ciphertext; CWC is one of the best performers for the feature selection problem on plaintext.
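The definitions of "consistent" and "minimal" above can be illustrated on plaintext with a short sketch. This is not the encrypted protocol from the paper, and it is not the authors' CWC implementation: it is a plain backward-elimination pass that drops a feature whenever the remaining features are still consistent. The function names, the `"C"` label key, and the toy datasets `D_A` and `D_B` are all hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence

# A data item x maps each feature name F_i (and the label key) to a value.
Row = dict[str, Hashable]


def is_consistent(data: Sequence[Row], features: Sequence[str], label: str = "C") -> bool:
    """Return True if rows that agree on every feature in `features` also agree on the label."""
    seen: dict[tuple, Hashable] = {}
    for row in data:
        key = tuple(row[f] for f in features)
        # setdefault stores the first label seen for this key and returns it afterwards.
        if seen.setdefault(key, row[label]) != row[label]:
            return False
    return True


def minimal_consistent_subset(data: Sequence[Row], features: Sequence[str], label: str = "C") -> list[str]:
    """Backward elimination: try to drop each feature once, keeping the set consistent.

    Any superset of a consistent set is consistent, so a single pass leaves a
    set from which no single feature (and hence no proper subset) can be removed.
    """
    if not is_consistent(data, features, label):
        raise ValueError("the full feature set is not consistent")
    selected = list(features)
    for f in features:
        trial = [g for g in selected if g != f]
        if is_consistent(data, trial, label):
            selected = trial
    return selected


# Hypothetical toy datasets held by parties A and B; the label equals F2.
D_A = [
    {"F1": 0, "F2": 0, "F3": 1, "C": 0},
    {"F1": 0, "F2": 1, "F3": 1, "C": 1},
]
D_B = [
    {"F1": 1, "F2": 0, "F3": 0, "C": 0},
    {"F1": 1, "F2": 1, "F3": 0, "C": 1},
]

result = minimal_consistent_subset(D_A + D_B, ["F1", "F2", "F3"])
print(result)  # ['F2']
```

In the sketch the two datasets are simply concatenated, which is exactly what the privacy-preserving setting forbids; the paper's contribution is to carry out an equivalent computation on ciphertext so that neither party sees the other's rows. The order in which features are tried also matters for which minimal subset is returned, and the sketch uses the given order as an arbitrary choice.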

Feature Selection

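Feature selection, also called feature subset selection (FSS) or attribute selection, is the process of choosing N features from M existing features so that a specific metric of the system is optimized. It picks the most effective of the original features in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are key to training a model.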
