This work proposes a decentralized architecture in which individual agents aim to solve a classification problem while observing streaming features of different dimensions, possibly arising from different distributions. In the context of social learning, several useful strategies have been developed that solve decision-making problems through local cooperation across distributed agents and allow them to learn from streaming data. However, traditional social learning strategies rely on the fundamental assumption that each agent has significant prior knowledge of the underlying distribution of the observations. In this work, we overcome this limitation by introducing a machine learning framework that exploits social interactions over a graph, leading to a fully data-driven solution to the distributed classification problem. The proposed social machine learning (SML) strategy has two phases: in the training phase, classifiers are independently trained to generate a belief over a set of hypotheses using a finite number of training samples; in the prediction phase, classifiers evaluate streaming unlabeled observations and share their instantaneous beliefs with neighboring classifiers. We show that the SML strategy enables the agents to learn consistently under this highly heterogeneous setting and allows the network to continue learning even during the prediction phase, when it is deciding on unlabeled samples. The prediction decisions are used to continually improve performance thereafter, in a manner that is markedly different from most existing static classification schemes, where, following training, the decisions on unlabeled data are not reused to improve future performance.
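The two phases described in the abstract can be illustrated with a small sketch. The abstract does not specify the classifier structure or the rule used to combine beliefs, so everything below is an assumption made for illustration: two hypotheses, a nearest-mean classifier under a unit-variance Gaussian model, and a log-linear combination rule in which each agent adds its local logit to its log-belief ratio and then takes a weighted average of its neighbors' values. The names `fit_class_means`, `logit`, `social_step`, `run_demo`, `DIMS`, `CLASS_MEANS`, and `WEIGHTS` are hypothetical and do not come from the paper.

```python
import random

# Hypothetical setup: three agents observing features of different dimensions.
DIMS = [1, 2, 3]
# Per-agent (class 0 mean, class 1 mean), applied to every feature coordinate.
CLASS_MEANS = [(-0.5, 0.5), (-0.3, 0.3), (-0.4, 0.4)]
# WEIGHTS[l][k] is the weight agent k gives to agent l; each column sums to 1.
# The zero entries mean agents 0 and 2 are not neighbors (a line graph).
WEIGHTS = [[0.5, 0.25, 0.0],
           [0.5, 0.5, 0.5],
           [0.0, 0.25, 0.5]]


def fit_class_means(samples, labels):
    """Training phase: estimate per-class feature means from labeled samples."""
    dim = len(samples[0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for x, y in zip(samples, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += x[d]
    return {y: [s / counts[y] for s in sums[y]] for y in (0, 1)}


def logit(means, x):
    """Log-ratio of class 1 to class 0 under the assumed unit-variance Gaussian model."""
    m0, m1 = means[0], means[1]
    return sum((b - a) * v - (b * b - a * a) / 2.0 for a, b, v in zip(m0, m1, x))


def social_step(log_beliefs, logits, weights):
    """Prediction phase, one step: local update, then weighted averaging over neighbors."""
    n = len(log_beliefs)
    intermediate = [log_beliefs[k] + logits[k] for k in range(n)]
    return [sum(weights[l][k] * intermediate[l] for l in range(n)) for k in range(n)]


def draw(rng, mean, dim):
    """Draw one feature vector with independent unit-variance Gaussian coordinates."""
    return [rng.gauss(mean, 1.0) for _ in range(dim)]


def run_demo(seed=0, n_train=200, n_steps=50, true_class=1):
    rng = random.Random(seed)
    # Training phase: each agent trains independently on its own labeled samples.
    classifiers = []
    for dim, (mu0, mu1) in zip(DIMS, CLASS_MEANS):
        labels = [i % 2 for i in range(n_train)]  # alternate labels so both classes appear
        samples = [draw(rng, mu1 if y else mu0, dim) for y in labels]
        classifiers.append(fit_class_means(samples, labels))
    # Prediction phase: unlabeled streaming observations, beliefs shared with neighbors.
    log_beliefs = [0.0] * len(DIMS)  # log-ratio 0 corresponds to a uniform initial belief
    for _ in range(n_steps):
        logits = [logit(clf, draw(rng, CLASS_MEANS[k][true_class], DIMS[k]))
                  for k, clf in enumerate(classifiers)]
        log_beliefs = social_step(log_beliefs, logits, WEIGHTS)
    return log_beliefs
```

Calling `run_demo()` returns one log-belief ratio per agent; under the assumptions above, a positive value means the agent currently favors hypothesis 1 and a negative value means it favors hypothesis 0. The sketch covers only belief sharing during prediction and does not attempt to model how prediction decisions are reused to improve later performance.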