We propose non-parametric, graph-based tests to assess the distributional balance of covariates in observational studies with multi-valued treatments. Our tests use graph structures ranging from Hamiltonian paths, which connect all of the data, to nearest neighbor graphs, which maximally separate the data into pairs. We consider algorithms that form minimal-distance graphs, such as optimal Hamiltonian paths or non-bipartite matching, as well as approximate alternatives, such as greedy Hamiltonian paths or greedy nearest neighbor graphs. Extensive simulation studies demonstrate that the proposed tests can detect misspecification of matching models that other methods miss. Contrary to intuition, we also find that tests run on well-formed approximate graphs outperform tests run on optimally formed graphs in most cases, and that a properly formed test on an approximate nearest neighbor graph performs best on average. In a multi-valued treatment setting with breast cancer data, these graph-based tests can also detect imbalances otherwise missed by common matching diagnostics. We provide a new R package, graphTest, to implement these methods and reproduce our results.
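To make the idea of a graph-based balance test concrete, the following is a minimal sketch in Python. It is not the graphTest package or its API, and the abstract does not specify the test statistic; the cross-group edge count and the permutation null used here are assumptions chosen for illustration, as are all function names. The sketch builds a greedy Hamiltonian path through the covariate vectors, counts the edges that join units from different treatment groups, and compares that count with its distribution under random relabeling. Unusually few cross-group edges suggest that the groups occupy different regions of covariate space.

```python
import math
import random


def greedy_hamiltonian_path(points):
    """Approximate Hamiltonian path: start at unit 0 and repeatedly
    step to the nearest unvisited unit (ties broken by lower index)."""
    unvisited = set(range(1, len(points)))
    path = [0]
    while unvisited:
        last = points[path[-1]]
        nxt = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        unvisited.remove(nxt)
        path.append(nxt)
    return path


def cross_edge_count(path, labels):
    """Number of path edges whose endpoints have different treatment labels."""
    return sum(labels[a] != labels[b] for a, b in zip(path, path[1:]))


def permutation_p_value(points, labels, n_perm=999, seed=0):
    """One-sided permutation p-value (hypothetical helper): the share of
    random relabelings with a cross-edge count at most the observed one."""
    rng = random.Random(seed)
    path = greedy_hamiltonian_path(points)  # the graph ignores the labels
    observed = cross_edge_count(path, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cross_edge_count(path, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: three treatment groups with clearly separated covariates.
    points = [(10 * g + 0.1 * i, 10.0 * g) for g in range(3) for i in range(10)]
    labels = [g for g in range(3) for _ in range(10)]
    observed, p = permutation_p_value(points, labels)
    print(f"cross-group edges: {observed}, permutation p-value: {p:.3f}")
```

Because the path is built from the covariates alone, it can be computed once and reused for every relabeling. Swapping in an optimal path or a nearest neighbor pairing would change only the graph construction step, not the test logic.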