Consensus clustering aggregates multiple partitions, reconciling clustering results from different sources or executions, in order to find a better fit. In practice, clustering tasks contain noise and outliers, which may significantly degrade performance. To address this issue, we propose a novel algorithm, robust consensus clustering, that finds the common ground truth among experts' opinions and tends to be minimally affected by the bias caused by outliers. In particular, we formalize the robust consensus clustering problem as a constrained optimization problem, and then derive an effective algorithm based on the alternating direction method of multipliers (ADMM) with a rigorous convergence guarantee. Our method outperforms the baselines on benchmarks. We apply the proposed method to real-world advertising campaign segmentation and forecasting tasks, using consensus clustering results based on similarity computed via the Kolmogorov-Smirnov statistic. The accurate clustering results help build advertiser profiles, which are then used for forecasting.
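To make the similarity step concrete, the following is a minimal sketch of a pairwise similarity built from the two-sample Kolmogorov-Smirnov statistic. The abstract does not say how the statistic is converted into a similarity, so the mapping `1 - D` used here is an assumption, and the names `ks_statistic` and `ks_similarity_matrix` are hypothetical. This illustrates only the similarity computation, not the proposed ADMM-based algorithm.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D = max_x |F_a(x) - F_b(x)|."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # Empirical CDFs are step functions that change only at observed
    # values, so it suffices to evaluate the gap at those points.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # fraction of a that is <= x
        fb = bisect_right(sb, x) / len(sb)  # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d


def ks_similarity_matrix(samples):
    """Symmetric matrix with entries 1 - D(i, j) (assumed mapping)."""
    n = len(samples)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = 1.0 - ks_statistic(samples[i], samples[j])
            sim[i][j] = sim[j][i] = s
    return sim


# Hypothetical per-campaign observations (illustrative values only).
campaigns = [[1, 2, 3], [1, 2, 3], [4, 5, 6]]
similarity = ks_similarity_matrix(campaigns)
```

In this toy input the first two campaigns have identical empirical distributions and so receive similarity 1, while the third does not overlap with them at all and receives similarity 0. A matrix of this kind could then be passed to any clustering routine that accepts pairwise similarities.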