As more and more companies store their customers' data; various information of a person is distributed among numerous companies' databases. Different industrial sectors carry distinct features about the same customers. Also, different companies within the same industrial sector carry similar kinds of data about the customers with different data representations. Cooperation between companies from different industrial sectors, called vertical cooperation, and between the companies within the same sector, called horizontal cooperation, can lead to more accurate machine learning models and better estimations in tasks such as credit scoring. However, data privacy regulations and compatibility issues for different data representations are huge obstacles to cooperative model training. By proposing the training framework MICS and experimentation on several numerical data sets, we showed that companies would have an incentive to cooperate with other companies from their sector and with other industrial sectors to jointly train more robust and accurate global models without explicitly sharing their customers' private data.
翻译:随着越来越多的公司储存其客户的数据;一个人的各种信息在众多公司的数据库中分布;不同的工业部门具有与同一客户不同的特征;此外,同一工业部门的不同公司也拥有关于具有不同数据代表的客户的类似数据;不同工业部门的公司之间的合作,称为纵向合作,以及同一部门内的公司之间的合作,称为横向合作,可导致更准确的机器学习模式和更好地估计信用评分等任务;然而,数据隐私条例和不同数据表述的兼容性问题是合作模式培训的巨大障碍;通过提出培训框架和尝试数套数字数据集,我们表明,公司将具有与本部门其他公司和其他工业部门合作的动力,以便联合培训更有力和准确的全球模型,而不必明确分享其客户的私人数据。