This paper investigates the theoretical and empirical performance of Fisher-Pitman-type permutation tests for assessing the equality of unknown Poisson mixture distributions. Built on nonparametric maximum likelihood estimators (NPMLEs) of the mixing distribution, these tests are shown theoretically to adapt to complicated, unspecified structures in count data and to be consistent against the corresponding ANOVA-type alternatives; the latter result parallels the classic claims of Robinson (1973). The methods are then applied to a single-cell RNA-seq dataset obtained from different cell types in brain samples of autism subjects and healthy controls; empirically, they reveal genes that are differentially expressed between autism and control subjects yet are missed by common tests. To justify their use, rate optimality of NPMLEs is also established in settings similar to those of nonparametric Gaussian mixtures (Wu and Yang, 2020a) and binomial mixtures (Tian et al., 2017; Vinayak et al., 2019).
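To make the permutation principle concrete, the following is a minimal sketch of a Fisher-Pitman-type permutation test on count data. It is not the paper's procedure: the NPMLE-based statistic is replaced here by a simple ANOVA-type between-group sum of squares, and the function names (`between_group_ss`, `permutation_test`), the gamma mixing distributions, and the sample sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def between_group_ss(values, labels, n_groups):
    """ANOVA-type statistic: size-weighted squared deviations of group means.

    Assumption: a stand-in for the NPMLE-based statistic studied in the paper.
    """
    grand_mean = values.mean()
    total = 0.0
    for g in range(n_groups):
        member = values[labels == g]
        total += member.size * (member.mean() - grand_mean) ** 2
    return total


def permutation_test(samples, n_perm=2000, seed=0):
    """Return (observed statistic, Monte Carlo permutation p-value)."""
    rng = np.random.default_rng(seed)
    values = np.concatenate([np.asarray(s, dtype=float) for s in samples])
    labels = np.concatenate([np.full(len(s), k) for k, s in enumerate(samples)])
    n_groups = len(samples)

    observed = between_group_ss(values, labels, n_groups)
    exceed = 0
    for _ in range(n_perm):
        # Under the null of equal distributions, group labels are exchangeable.
        shuffled = rng.permutation(labels)
        if between_group_ss(values, shuffled, n_groups) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical data: two Poisson mixtures with gamma-distributed rates.
data_rng = np.random.default_rng(42)
a = data_rng.poisson(data_rng.gamma(2.0, 1.0, size=200))  # mean rate about 2
b = data_rng.poisson(data_rng.gamma(2.0, 3.0, size=200))  # mean rate about 6

stat, p_value = permutation_test([a, b], n_perm=500, seed=1)
print(f"statistic = {stat:.2f}, p-value = {p_value:.4f}")
```

Because the reference distribution comes from relabeling the pooled observations, the sketch needs no parametric assumption on the mixing distribution; only the choice of statistic determines which alternatives the test can detect.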