Support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature. We demonstrate that SVM can be used to balance covariates and estimate average causal effects under the unconfoundedness assumption. Specifically, we adapt the SVM classifier as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups while simultaneously maximizing effective sample size. We also show that SVM is a continuous relaxation of the quadratic integer program for computing the largest balanced subset, establishing its direct relation to the cardinality matching method. Another important feature of SVM is that the regularization parameter controls the trade-off between covariate balance and effective sample size. As a result, the existing SVM path algorithm can be used to compute the balance-sample size frontier. We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods. Finally, we conduct simulation and empirical studies to evaluate the performance of the proposed methodology and find that SVM is competitive with the state-of-the-art covariate balancing methods.
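As a rough illustration of the weighting idea described above, the following minimal sketch (not the authors' exact estimator) fits a kernel SVM that separates treated from control units using scikit-learn's SVC, treats the magnitudes of the dual coefficients as candidate balancing weights, and reports a weighted maximum mean discrepancy together with the Kish effective sample size. The kernel bandwidth, regularization constant, and simulated data are hypothetical choices made only for the example.

```python
# Illustrative sketch (assumptions: scikit-learn SVC, RBF kernel, simulated data).
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
n, p = 500, 5
X = rng.normal(size=(n, p))                      # covariates
t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # treatment depends on covariates

gamma = 1.0 / p   # RBF bandwidth (hypothetical choice)
C = 1.0           # regularization parameter: governs the balance / sample-size trade-off

clf = SVC(kernel="rbf", gamma=gamma, C=C).fit(X, t)

# Dual coefficients equal alpha_i * y_i for the support vectors; their magnitudes
# are used here as candidate balancing weights, normalized within each arm.
w = np.zeros(n)
w[clf.support_] = np.abs(clf.dual_coef_.ravel())
w1 = w * (t == 1)
w0 = w * (t == 0)
w1 /= w1.sum()
w0 /= w0.sum()

# Weighted squared MMD between treated and control groups in the RBF feature space.
K = rbf_kernel(X, X, gamma=gamma)
mmd2 = w1 @ K @ w1 + w0 @ K @ w0 - 2 * w1 @ K @ w0

# Kish effective sample size of the combined weights.
w_all = w1 + w0
ess = w_all.sum() ** 2 / (w_all ** 2).sum()
print(f"weighted MMD^2 = {mmd2:.4f}, effective sample size = {ess:.1f}")
```

Varying C along the SVM regularization path would trace out the balance-sample size frontier mentioned in the abstract; the sketch above evaluates only a single point on that path.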