We consider the problem of parameter estimation in slowly varying regression models with sparsity constraints. We formulate the problem as a mixed-integer optimization problem and demonstrate that it can be reformulated exactly as a binary convex optimization problem through a novel exact relaxation. The relaxation relies on a new identity involving Moore-Penrose inverses that convexifies the non-convex objective function while coinciding with the original objective on all feasible binary points. This allows us to solve the problem significantly more efficiently, and to provable optimality, using a cutting-plane-type algorithm. We develop a highly optimized implementation of this algorithm, which substantially improves upon the asymptotic computational complexity of a straightforward implementation. We further develop a heuristic method that is guaranteed to produce a feasible solution and, as we empirically illustrate, generates high-quality warm-start solutions for the binary optimization problem. On both synthetic and real-world datasets, we show that the resulting algorithm outperforms competing formulations in comparable times across a variety of metrics, including out-of-sample predictive performance, support recovery accuracy, and false positive rate. The algorithm enables us to train models with tens of thousands of parameters, is robust to noise, and effectively captures the underlying slowly changing support of the data-generating process.
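For concreteness, the following is a minimal sketch of the problem class described above; the notation (per-period design matrices $X_t$, responses $y_t$, coefficients $\beta_t$, sparsity level $k$, smoothness weight $\lambda$) is illustrative and not necessarily the paper's own:
\[
\min_{\beta_1,\dots,\beta_T} \;\; \sum_{t=1}^{T} \|y_t - X_t \beta_t\|_2^2 \;+\; \lambda \sum_{t=2}^{T} \|\beta_t - \beta_{t-1}\|_2^2
\qquad \text{s.t.} \qquad \|\beta_t\|_0 \le k \;\; \forall t,
\]
where the $\ell_0$ constraints enforce sparsity and the coupling penalty forces the coefficients, and hence the support, to vary slowly across periods. Introducing binary indicators $z_{t,j} \in \{0,1\}$ with $\beta_{t,j} = 0$ whenever $z_{t,j} = 0$ gives a mixed-integer formulation of this type. As one standard realization of a cutting-plane scheme for the resulting binary convex problem $\min_{z \in \{0,1\}^n} c(z)$ (in the spirit of outer approximation, not necessarily the paper's exact algorithm), one iteratively solves a master problem over $(z, \eta)$ and adds cuts
\[
\eta \;\ge\; c(\bar z) + \nabla c(\bar z)^\top (z - \bar z)
\]
at each incumbent $\bar z$ until the master bound matches the best evaluated objective.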