As edge devices become increasingly powerful, data analytics are gradually moving from a centralized to a decentralized regime where edge compute resources are exploited to process more of the data locally. This regime of analytics is coined as federated data analytics (FDA). In spite of the recent success stories of FDA, most literature focuses exclusively on deep neural networks. In this work, we take a step back to develop an FDA treatment for one of the most fundamental statistical models: linear regression. Our treatment is built upon hierarchical modeling that allows borrowing strength across multiple groups. To this end, we propose two federated hierarchical model structures that provide a shared representation across devices to facilitate information sharing. Notably, our proposed frameworks are capable of providing uncertainty quantification, variable selection, hypothesis testing and fast adaptation to new unseen data. We validate our methods on a range of real-life applications including condition monitoring for aircraft engines. The results show that our FDA treatment for linear models can serve as a competing benchmark model for future development of federated algorithms.
翻译:随着边缘装置越来越强大,数据分析正在逐渐从中央集权制度向分散化制度转变,边际计算资源利用边际计算,在当地处理更多的数据。这种分析制度是作为联合数据分析(FDA)建立的。尽管林业发展局最近取得了一些成功,但大多数文献都只侧重于深神经网络。在这项工作中,我们退后一步,为林业发展局的最基本统计模型之一:线性回归模型开发一种处理方法。我们的处理方法建立在分级模型的基础上,允许跨多个组群的借款力。为此,我们提议了两个联合的等级模型结构,在各种设备之间提供一个共享的代表性,以促进信息共享。值得注意的是,我们提议的框架能够提供不确定性的量化、变量选择、假设测试和对新看不见数据的快速适应。我们验证了我们一系列实际应用的方法,包括飞机引擎的状况监测。结果显示,我们林业发展局对线性模型的处理可以作为今后开发联式算法的相互竞争的基准模型。