The information bottleneck (IB) approach, initially introduced by Tishby et al. to assess the "compression--relevance" tradeoff in a remote source coding problem in communications, has recently gained popularity in its application to modern machine learning (ML). Despite its seemingly simple form, the solution to the IB problem remains largely unknown and can only be assessed numerically, even in the simple setting of the Gaussian mixture model, which is of fundamental significance in ML. In this paper, by combining ideas of hard quantization and soft nonlinear transformation, we derive closed-form achievable bounds for the IB problem in the above setting. The derived bounds are surprisingly close to the (numerically) optimal IB solution obtained by the Blahut--Arimoto (BA) algorithm, on both synthetic and real-world (and hence non-Gaussian-mixture) datasets, suggesting a possibly wider applicability of our results.
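For context, a minimal sketch of the standard IB formulation the abstract refers to, following Tishby et al.; the notation below is generic and may differ from the paper's. The IB problem seeks a compressed representation $T$ of $X$ that retains information about a relevance variable $Y$:
\[
\min_{p(t \mid x)} \; I(X;T) - \beta\, I(T;Y),
\]
where $\beta > 0$ trades compression against relevance. The BA algorithm mentioned above iterates the associated self-consistent equations
\[
p(t \mid x) \propto p(t)\, \exp\!\bigl(-\beta\, D_{\mathrm{KL}}\bigl(p(y \mid x) \,\|\, p(y \mid t)\bigr)\bigr), \qquad
p(t) = \sum_x p(x)\, p(t \mid x), \qquad
p(y \mid t) = \frac{1}{p(t)} \sum_x p(y \mid x)\, p(t \mid x)\, p(x),
\]
until convergence to a (locally) optimal point of the compression--relevance tradeoff.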