We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering distributions, existing methods have focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general perspective on clustering distributions, which emphasizes that the statistical models underlying distortion-based methods may not be descriptive enough. Instead, we optimize a mixed-variable objective measuring the conformity of the data within each cluster to the introduced sBeta density function, whose parameters are constrained and estimated jointly with binary assignment variables. Our versatile formulation approximates a variety of parametric densities for modeling cluster data and enables control of the cluster-balance bias. This yields highly competitive performance for efficient unsupervised adjustment of black-box predictions in a variety of scenarios, including one-shot classification and real-time unsupervised domain adaptation for road segmentation. Implementation is available at https://github.com/fchiaroni/Clustering_Softmax_Predictions.
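As a rough illustration of the distortion-based baseline that the abstract contrasts against (not the proposed k-sBetas objective), the sketch below clusters softmax vectors on the probability simplex with a k-means-style loop under a KL-divergence distortion. All function names here are our own illustrative choices; the farthest-point initialization is an assumption for robustness, not part of the paper.

```python
import numpy as np

def softmax(z, axis=-1):
    """Map logits to points on the probability simplex."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """Pairwise KL(p_i || q_j): rows of p (n, d) against rows of q (k, d) -> (n, k)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return (p[:, None, :] * (np.log(p[:, None, :]) - np.log(q[None, :, :]))).sum(-1)

def init_centers(X, k, rng):
    """Farthest-point initialization under the KL distortion (illustrative heuristic)."""
    idx = [int(rng.integers(len(X)))]
    for _ in range(k - 1):
        d = kl_div(X, X[idx]).min(1)
        idx.append(int(d.argmax()))
    return X[idx]

def kmeans_kl(X, k, n_iters=50, seed=0):
    """k-means over simplex points using KL(p || center) as the distortion.

    For this distortion, the optimal center of a cluster is the plain mean of
    its points (the mean minimizes sum_i KL(p_i || q) over q on the simplex).
    """
    rng = np.random.default_rng(seed)
    centers = init_centers(X, k, rng)
    for _ in range(n_iters):
        labels = kl_div(X, centers).argmin(1)
        new_centers = np.array([
            X[labels == j].mean(0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

For example, softmax predictions peaked on two different classes form two well-separated groups on the simplex, which this loop recovers. The point of k-sBetas, by contrast, is that such a fixed distortion implicitly assumes a statistical model of cluster shape that may be too restrictive, motivating an explicit fitted density (sBeta) per cluster instead.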