We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences $X^n$ and $Y^n$, respectively. The goal is to find rate-limited encodings $f(x^n)$ and $g(y^n)$ that maximize the mutual information $I(f(X^n); g(Y^n))/n$. We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we investigate a multiple description (MD) extension of the Chief Executive Officer (CEO) problem with a mutual information constraint. Surprisingly, this MD-CEO problem permits a tight single-letter characterization of the achievable region.
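To make the binary symmetric special case concrete, the following minimal sketch evaluates one candidate operating point. It rests on several assumptions that are not stated in the abstract: the source is doubly symmetric binary with crossover probability `p`, the auxiliary variables are obtained by passing $X$ and $Y$ through binary symmetric test channels with crossover probabilities `a` and `b`, and the point is evaluated in the single-letter form $R_1 = I(U;X)$, $R_2 = I(V;Y)$, $\mu = I(U;V)$ with $U - X - Y - V$. The function names `h2`, `conv`, and `bsc_point` are hypothetical and chosen for illustration only.

```python
from math import log2


def h2(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def conv(a, b):
    """Crossover probability of two cascaded binary symmetric channels."""
    return a * (1 - b) + (1 - a) * b


def bsc_point(p, a, b):
    """Rates and mutual information for BSC test channels (an assumed choice).

    X is uniform on {0, 1}, Y = X xor Z with Z ~ Bernoulli(p),
    U = X xor N1 with N1 ~ Bernoulli(a), V = Y xor N2 with N2 ~ Bernoulli(b).
    """
    r1 = 1 - h2(a)                      # I(U; X)
    r2 = 1 - h2(b)                      # I(V; Y)
    mu = 1 - h2(conv(conv(a, p), b))    # I(U; V), since U xor V = N1 xor Z xor N2
    return r1, r2, mu


if __name__ == "__main__":
    for a, b in [(0.0, 0.0), (0.05, 0.05), (0.11, 0.2)]:
        r1, r2, mu = bsc_point(0.1, a, b)
        print(f"a={a:.2f} b={b:.2f}  R1={r1:.4f}  R2={r2:.4f}  mu={mu:.4f}")
```

Sweeping `a` and `b` over $[0, 1/2]$ traces out the points this particular family of test channels reaches; whether that family is optimal is exactly the kind of question that the gap between inner and outer bounds addresses, and the sketch makes no claim about it.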