Mining groups of genes that consistently co-express is an important problem in biomedical research, where it is critical for applications such as drug repositioning and designing new disease treatments. Recently, Cooley et al. modeled this problem as Exact Weighted Clique Decomposition (EWCD) in which, given an edge-weighted graph $G$ and a positive integer $k$, the goal is to decompose $G$ into at most $k$ (overlapping) weighted cliques so that each edge's weight is exactly equal to the sum of the weights of the cliques it participates in. They show that EWCD is fixed-parameter tractable, giving a $4^k$-kernel alongside a backtracking algorithm (together called cricca) to iteratively build a decomposition. Unfortunately, because of inherent exponential growth in the space of potential solutions, cricca is typically able to decompose graphs only when $k \leq 11$. In this work, we establish reduction rules that exponentially decrease the size of the kernel (from $4^k$ to $k2^k$) for EWCD. In addition, we use insights about the structure of potential solutions to give new search rules that speed up the decomposition algorithm. At the core of our techniques is a result from combinatorial design theory called Fisher's inequality, which characterizes set systems with restricted intersections. We deploy our kernelization and decomposition algorithms (together called DeCAF) on a corpus of biologically inspired data and obtain a speed-up of over two orders of magnitude over cricca. As a result, DeCAF scales to instances with $k \geq 17$.
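
To make the EWCD condition concrete, the following is a minimal sketch of a checker that tests whether a proposed set of weighted cliques is an exact decomposition of a given edge-weighted graph. It is not taken from cricca or DeCAF; the function name, the dictionary-based graph representation, the restriction to positive clique weights, and the use of integer weights (so that equality is exact) are all assumptions made for illustration.

```python
from itertools import combinations


def is_exact_weighted_clique_decomposition(edge_weights, cliques, k):
    """Check the EWCD condition for a candidate solution (illustrative only).

    edge_weights: dict mapping frozenset({u, v}) -> integer edge weight.
    cliques: list of (vertex_set, clique_weight) pairs.
    k: maximum number of cliques allowed.
    """
    if len(cliques) > k:
        return False

    covered = {}  # total clique weight accumulated on each edge
    for vertices, weight in cliques:
        if weight <= 0:
            return False  # assumption: clique weights are positive
        for u, v in combinations(vertices, 2):
            edge = frozenset((u, v))
            if edge not in edge_weights:
                return False  # the vertex set is not a clique of G
            covered[edge] = covered.get(edge, 0) + weight

    # Every edge weight must equal the sum of the weights of its cliques.
    return all(covered.get(e, 0) == w for e, w in edge_weights.items())


# Hypothetical example: two overlapping cliques sharing the edge {b, c}.
graph = {
    frozenset("ab"): 2, frozenset("ac"): 2, frozenset("bc"): 5,
    frozenset("bd"): 3, frozenset("cd"): 3,
}
solution = [({"a", "b", "c"}, 2), ({"b", "c", "d"}, 3)]
print(is_exact_weighted_clique_decomposition(graph, solution, k=2))
```

In the example, the shared edge $\{b, c\}$ has weight $5 = 2 + 3$, which is what allows the two overlapping cliques to account for it exactly. Verifying a candidate is cheap; the difficulty addressed by the kernelization and search rules lies in finding such a set of at most $k$ cliques.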