We address the problem of learning binary decision trees that partition data for some downstream task. We propose to learn discrete parameters (i.e., for tree traversals and node pruning) and continuous parameters (i.e., for tree split functions and prediction functions) simultaneously using argmin differentiation. We do so by sparsely relaxing a mixed-integer program for the discrete parameters, allowing gradients to pass through the program to the continuous parameters. We derive customized algorithms to efficiently compute the forward and backward passes. This means that our tree-learning procedure can be used as an (implicit) layer in arbitrary deep networks, and can be optimized with arbitrary loss functions. We demonstrate that our approach produces binary trees that are competitive with existing single-tree and ensemble approaches, in both supervised and unsupervised settings. Further, apart from greedy approaches (which do not achieve competitive accuracies), our method is faster to train than all other tree-learning baselines we compare with. The code for reproducing the results is available at https://github.com/vzantedeschi/LatentTrees.
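To illustrate the general principle behind sparse relaxation (not the paper's actual mixed-integer relaxation), the sketch below implements sparsemax, a standard sparse alternative to softmax: it projects a score vector onto the probability simplex and typically zeroes out low-scoring entries, so the output can interpolate between a soft distribution and a hard, argmax-like one-hot choice while remaining amenable to gradient-based training.

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Euclidean projection of scores z onto the probability simplex.

    Unlike softmax, the result is often sparse: entries whose scores
    fall below a data-dependent threshold are set exactly to zero.
    """
    u = np.sort(z)[::-1]                  # scores in descending order
    cssv = np.cumsum(u)                   # cumulative sums of sorted scores
    ks = np.arange(1, len(z) + 1)
    # largest k such that 1 + k * u_k > cumulative sum up to k
    k = ks[1.0 + ks * u > cssv][-1]
    tau = (cssv[k - 1] - 1.0) / k         # threshold
    return np.maximum(z - tau, 0.0)

# Close scores yield a soft but sparse distribution:
p = sparsemax(np.array([1.0, 0.9, 0.1]))   # third entry is zeroed out
# A clearly dominant score yields a hard one-hot choice:
q = sparsemax(np.array([2.0, 1.0, 0.1]))
```

Here `sparsemax` is an illustrative stand-in: the paper relaxes the discrete tree-traversal and pruning variables of a mixed-integer program, but the underlying idea is the same, replacing a hard discrete choice with a sparse, differentiable surrogate.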