Given a set of discrete probability distributions, the minimum entropy coupling is the minimum-entropy joint distribution that has the input distributions as its marginals. This has immediate relevance to tasks such as entropic causal inference for causal graph discovery and bounding the mutual information between variables that we observe separately. Since finding the minimum entropy coupling is NP-hard, various works have studied approximation algorithms. The work of [Compton, ISIT 2022] shows that the greedy coupling algorithm of [Kocaoglu et al., AAAI 2017] is always within $\log_2(e) \approx 1.44$ bits of the optimal coupling. Moreover, they show that it is impossible to obtain a better approximation guarantee using the majorization lower bound that all prior works have used, thus establishing a majorization barrier. In this work, we break the majorization barrier by designing a stronger lower bound that we call the profile method. Using the profile method, we show that the greedy algorithm is always within $\log_2(e)/e \approx 0.53$ bits of optimal for coupling two distributions (the previous best-known bound is 1 bit), and within $(1 + \log_2(e))/2 \approx 1.22$ bits for coupling any number of distributions (the previous best-known bound is 1.44 bits). We also examine a generalization of the minimum entropy coupling problem: Concave Minimum-Cost Couplings. We obtain similar guarantees for this generalization in terms of the concave cost function. Additionally, we make progress on the open problem of [Kova\v{c}evi\'c et al., Inf. Comput. 2015] regarding NP membership of the minimum entropy coupling problem by showing that any hardness of minimum entropy coupling beyond NP comes from the difficulty of computing arithmetic in the complexity class NP. Finally, we present exponential-time algorithms for computing the exactly optimal solution.
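
For readers unfamiliar with the greedy coupling algorithm referenced above, the following is a minimal sketch of the procedure as it is commonly described: repeatedly pick the largest remaining mass in each marginal, place the minimum of those masses on the corresponding joint state, and subtract it from every marginal. The names `greedy_coupling` and `entropy_bits` and the tolerance `tol` are illustrative assumptions, not taken from the paper, and this is not the authors' implementation.

```python
import math


def greedy_coupling(marginals, tol=1e-12):
    """Greedy coupling sketch (hypothetical helper, not the authors' code).

    marginals: list of probability vectors, each summing to 1.
    Returns a dict mapping joint states (index tuples) to probability mass.
    """
    remaining = [list(p) for p in marginals]  # work on copies
    joint = {}
    while True:
        # Index of the largest remaining mass in each marginal.
        idx = tuple(max(range(len(r)), key=r.__getitem__) for r in remaining)
        # The most mass that can be placed on this joint state.
        mass = min(r[i] for r, i in zip(remaining, idx))
        if mass <= tol:  # some marginal is exhausted (up to rounding)
            break
        joint[idx] = joint.get(idx, 0.0) + mass
        for r, i in zip(remaining, idx):
            r[i] -= mass  # at least one entry drops to exactly zero
    return joint


def entropy_bits(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


p = [0.5, 0.5]
q = [0.5, 0.25, 0.25]
joint = greedy_coupling([p, q])
print(joint)                         # {(0, 0): 0.5, (1, 1): 0.25, (1, 2): 0.25}
print(entropy_bits(joint.values()))  # 1.5
```

In this small example the greedy coupling has entropy 1.5 bits, which equals the entropy of `q`. Since a joint distribution's entropy is never below that of any of its marginals, the greedy output happens to be optimal here; in general it is only guaranteed to be within the additive gaps stated in the abstract.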