High-throughput sequencing technology allows us to test for differences in bacterial composition between populations. One important feature of human microbiome data is that they often include a large number of zeros. Such data can be treated as being generated from a two-part model that includes a point mass at zero. Motivated by the analysis of such non-negative data with excessive zeros, we introduce several truncated rank-based two-group and multi-group tests, including a truncated Wilcoxon rank-sum test for two-group comparison and two truncated Kruskal-Wallis tests for multi-group comparison. We show, both analytically through asymptotic relative efficiency analysis and by simulation, that the proposed tests have higher power than the standard rank-based tests, especially when the proportion of zeros in the data is high. The tests can also be applied to repeated measurements of compositional data via simple within-subject permutations. In a simple before-and-after treatment experiment, the within-subject permutation is similar to the paired rank test; however, the proposed tests account for the excessive zeros, which leads to better power. We apply the tests to a gut microbiome data set to compare the microbiome compositions of healthy subjects and pediatric Crohn's disease patients and to assess treatment effects on microbiome composition. We identify several bacterial genera that are missed by the standard rank-based tests.
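To make the two-group idea concrete, here is a minimal Python sketch of a truncated rank-sum comparison. The abstract does not spell out the truncation rule, so the one used here is an assumption: keep every non-zero value and retain only enough zeros in each group that both groups share the larger of the two observed non-zero proportions. The helper names (`midranks`, `truncate_zeros`, `truncated_z`, `truncated_ranksum_test`) are hypothetical, and the p-value comes from a label permutation rather than from the asymptotic theory mentioned above.

```python
import numpy as np


def midranks(values):
    """1-based ranks in which tied values share the mean of their positions."""
    values = np.asarray(values, dtype=float).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)        # last sorted position of each distinct value
    starts = ends - counts + 1      # first sorted position of each distinct value
    return ((starts + ends) / 2.0)[inverse]


def truncate_zeros(x, y):
    """ASSUMED truncation rule (not stated in the abstract): keep all non-zero
    values and just enough zeros that each group has round(p * n) values,
    where p is the larger of the two observed non-zero proportions."""
    groups = [np.asarray(g, dtype=float).ravel() for g in (x, y)]
    p = max(np.mean(g > 0) for g in groups)
    truncated = []
    for g in groups:
        nonzero = g[g > 0]
        keep = int(round(p * g.size))
        n_zero = max(keep - nonzero.size, 0)
        truncated.append(np.concatenate([np.zeros(n_zero), nonzero]))
    return truncated


def truncated_z(x, y):
    """Centered and scaled rank sum of the first group after truncation."""
    tx, ty = truncate_zeros(x, y)
    m, n = tx.size, ty.size
    if m == 0 or n == 0:
        return 0.0
    ranks = midranks(np.concatenate([tx, ty]))
    total = m + n
    mean_w = m * (total + 1) / 2.0
    # No-ties standard deviation, used only as a scaling factor here.
    sd_w = np.sqrt(m * n * (total + 1) / 12.0)
    return float((ranks[:m].sum() - mean_w) / sd_w)


def truncated_ranksum_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value; zeros are re-truncated for every relabeling."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    observed = truncated_z(x, y)
    pooled = np.concatenate([x, y])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = truncated_z(perm[:x.size], perm[x.size:])
        if abs(stat) >= abs(observed) - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A call such as `truncated_ranksum_test(abundance_healthy, abundance_cd)` (both argument names are placeholders) returns the scaled statistic and a permutation p-value. Re-truncating inside each permutation keeps the test valid under exchangeability even though the number of retained zeros depends on the group labels.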