Binary matrix factorisation is an essential tool for identifying discrete patterns in binary data. In this paper we consider the rank-k binary matrix factorisation problem (k-BMF) under Boolean arithmetic: we are given an n x m binary matrix X with possibly missing entries and need to find two binary matrices A and B of dimension n x k and k x m respectively, which minimise the distance between X and the Boolean product of A and B, measured in the squared Frobenius norm. We present one compact and two exponential size integer programs (IPs) for k-BMF and show that the compact IP has a weak LP relaxation, while the exponential size IPs have stronger, equivalent LP relaxations. We introduce a new objective function, which differs from the traditional squared Frobenius objective in attributing a weight to zero entries of the input matrix that is proportional to the number of times the zero is erroneously covered in a rank-k factorisation. For one of the exponential size IPs we describe a computational approach based on column generation. Experimental results on synthetic and real-world datasets suggest that our integer programming approach is competitive with available methods for k-BMF and provides accurate low-error factorisations.
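To make the two objectives concrete, the following is a minimal NumPy sketch, not the paper's implementation. The function names, the Boolean `mask` marking observed entries, and the unit proportionality constant in the weighted objective are all assumptions made for illustration.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean product: entry (i, j) is 1 iff A[i, l] = B[l, j] = 1 for some l."""
    return (A.astype(int) @ B.astype(int) > 0).astype(int)

def frobenius_error(X, A, B, mask=None):
    """Squared Frobenius distance between X and the Boolean product of A and B,
    summed over observed entries only (mask is a hypothetical Boolean array)."""
    if mask is None:
        mask = np.ones(X.shape, dtype=bool)
    Z = boolean_product(A, B)
    return int(((X - Z)[mask] ** 2).sum())

def cover_weighted_error(X, A, B, mask=None):
    """Alternative objective (assumed constant 1): an uncovered one costs 1,
    and a zero costs the number of rank-1 terms that cover it."""
    if mask is None:
        mask = np.ones(X.shape, dtype=bool)
    counts = A.astype(int) @ B.astype(int)  # how many rank-1 terms cover each entry
    uncovered_ones = ((X == 1) & (counts == 0) & mask).sum()
    covered_zeros = counts[(X == 0) & mask].sum()
    return int(uncovered_ones + covered_zeros)

# A rank-2 factorisation that reproduces X exactly.
A = np.array([[1, 0], [1, 1], [0, 1]])
B = np.array([[1, 1, 0], [0, 1, 1]])
X = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])

# Flip the centre entry to zero: both rank-1 terms now cover a zero.
X_noisy = X.copy()
X_noisy[1, 1] = 0

# Treat the centre entry as missing instead.
mask = np.ones(X.shape, dtype=bool)
mask[1, 1] = False
```

On `X_noisy`, the squared Frobenius objective counts the centre entry once, while the cover-weighted objective counts it twice because both rank-1 terms cover it. With the centre entry masked as missing, neither objective penalises it.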