Large-scale data analysis is growing exponentially as data proliferates in our societies. This abundance of data allows the decision-maker to implement complex models in scenarios that were previously prohibitive. At the same time, such volumes of data require a distributed approach: Deep Learning models demand substantial resources, making distributed training necessary. This paper presents a Multicriteria approach for distributed learning. Our approach uses Weighted Goal Programming in its Chebyshev formulation to build an ensemble of decision rules that optimize a priori defined performance metrics. This formulation is beneficial because it is both model and metric agnostic and provides an interpretable output for the decision-maker. We test our approach with a practical application in electricity demand forecasting. Our results suggest that when we allow dataset splits to overlap, the performance of our methodology is consistently above that of the baseline model trained on the whole dataset.
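The Chebyshev formulation of Weighted Goal Programming minimizes the maximum weighted deviation of the achieved metrics from their goals, rather than their weighted sum. The sketch below illustrates this idea on a toy ensemble of two forecasters combined by a convex weight; the metric functions, goals, and weights are hypothetical placeholders, not the paper's actual setup, and a simple grid search stands in for a proper linear-programming solver.

```python
def chebyshev_weight(metrics_fn, goals, weights, grid_steps=100):
    """Grid-search the convex-combination weight alpha in [0, 1] that
    minimizes the Chebyshev objective:
        max_i weights[i] * max(0, metrics_fn(alpha)[i] - goals[i])
    i.e. the largest weighted overshoot of any metric past its goal."""
    best_alpha, best_dev = 0.0, float("inf")
    for k in range(grid_steps + 1):
        alpha = k / grid_steps
        m = metrics_fn(alpha)
        dev = max(w * max(0.0, mi - g)
                  for w, mi, g in zip(weights, m, goals))
        if dev < best_dev:
            best_alpha, best_dev = alpha, dev
    return best_alpha, best_dev

# Toy example (hypothetical numbers): two error metrics that trade off
# against each other as the ensemble weight alpha grows.
def toy_metrics(alpha):
    mae = 1.0 - 0.5 * alpha    # first metric improves with alpha
    mape = 0.6 + 0.4 * alpha   # second metric worsens with alpha
    return (mae, mape)

alpha, dev = chebyshev_weight(toy_metrics, goals=(0.6, 0.7),
                              weights=(1.0, 1.0))
# The optimum balances the two deviations near alpha ~ 0.56.
```

Because the objective bounds the *worst* deviation, no single metric is sacrificed to improve the others, which is what makes the resulting ensemble interpretable across heterogeneous metrics.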