Binary matrix factorisation is an essential tool for identifying discrete patterns in binary data. In this paper we consider the rank-k binary matrix factorisation problem (k-BMF) under Boolean arithmetic: we are given an n × m binary matrix X with possibly missing entries and need to find two binary matrices A and B, of dimension n × k and k × m respectively, which minimise the distance between X and the Boolean product of A and B, measured in the squared Frobenius norm. We present one compact and two exponential-size integer programs (IPs) for k-BMF and show that the compact IP has a weak LP relaxation, while the two exponential-size IPs have stronger, mutually equivalent LP relaxations. We introduce a new objective function, which differs from the traditional squared Frobenius objective in attributing a weight to each zero entry of the input matrix that is proportional to the number of times the zero is erroneously covered in a rank-k factorisation. For one of the exponential-size IPs we describe a computational approach based on column generation. Experimental results on synthetic and real-world datasets suggest that our integer programming approach is competitive with available methods for k-BMF and provides accurate low-error factorisations.
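To make the two objectives concrete, the following is a minimal NumPy sketch, not the formulation used in the paper. The function names (`boolean_product`, `frobenius_error`, `cover_count_error`) and the boolean `mask` marking observed entries are illustrative assumptions, and the proportionality constant for erroneously covered zeros is assumed to be 1.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean product: entry (i, j) is 1 iff A[i, l] = B[l, j] = 1 for some l."""
    return ((A.astype(int) @ B.astype(int)) > 0).astype(int)

def frobenius_error(X, A, B, mask=None):
    """Squared Frobenius distance between X and the Boolean product of A and B,
    restricted to observed entries (mask == True). Hypothetical helper."""
    if mask is None:
        mask = np.ones(X.shape, dtype=bool)
    Z = boolean_product(A, B)
    return int(np.sum((X[mask] - Z[mask]) ** 2))

def cover_count_error(X, A, B, mask=None):
    """Alternative objective (assumed weight 1 per cover): each uncovered one
    costs 1, and each zero costs the number of rank-1 components covering it."""
    if mask is None:
        mask = np.ones(X.shape, dtype=bool)
    counts = A.astype(int) @ B.astype(int)  # how many components cover each entry
    uncovered_ones = np.sum((X == 1) & (counts == 0) & mask)
    covered_zeros = np.sum(counts[(X == 0) & mask])
    return int(uncovered_ones + covered_zeros)

# Small rank-2 example: two overlapping rectangles of ones.
A = np.array([[1, 0], [1, 1], [0, 1]])
B = np.array([[1, 1, 0], [0, 1, 1]])
X = boolean_product(A, B)
X[1, 1] = 0  # this zero is covered by both rank-1 components

print(frobenius_error(X, A, B))    # counts the covered zero once
print(cover_count_error(X, A, B))  # counts the covered zero once per component
```

In this example the two objectives differ only on the entry covered by both rank-1 components, which is the situation the abstract describes. Passing a `mask` that excludes an entry treats it as missing, so it contributes to neither objective.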