In immunology studies, flow cytometry is a commonly used multivariate single-cell assay. One key goal of flow cytometry analysis is to identify the immune cells that respond to a given stimulus. Statistically, this problem can be framed as a comparison of two protein expression probability density functions (PDFs), one before and one after the stimulus; the goal is to pinpoint the regions where these two PDFs differ. In this paper, we model this comparison as a multiple testing problem. First, we partition the sample space into small bins and, in each bin, form a hypothesis to test whether the two PDFs differ. Second, we develop a novel multiple testing method, called TEAM (Testing on the Aggregation tree Method), to identify the bins that harbor differential PDFs while controlling the false discovery rate (FDR) at the desired level. TEAM embeds the testing procedure in an aggregation tree and tests from fine to coarse resolution, which achieves the statistical goal of localizing differential PDFs to the smallest possible regions. TEAM is computationally efficient and can analyze large flow cytometry data sets in much less time than competing methods. We applied TEAM and competing methods to a flow cytometry data set to identify T cells responsive to cytomegalovirus (CMV)-pp65 antigen stimulation. TEAM successfully identified the monofunctional, bifunctional, and polyfunctional T cells, while the competing methods either did not finish in a reasonable time frame or provided less interpretable results. Theoretical results and numerical simulations show that TEAM is asymptotically valid, powerful, and robust. Overall, TEAM is a computationally efficient and statistically powerful algorithm that can yield meaningful biological insights in flow cytometry studies.
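The bin-and-test formulation can be illustrated with a minimal one-dimensional sketch. It is not the TEAM procedure: it tests a single, finest layer of bins and substitutes the standard Benjamini-Hochberg step-up rule for TEAM's aggregation-tree thresholds, which the abstract does not specify. Everything else here is also an assumption made for illustration: the function names, the equal-count binning of the pooled sample, the one-sided direction of the test (enrichment after the stimulus), the binomial approximation to the per-bin count, and the simulated data.

```python
import math
import random


def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    tail = sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))
    return min(1.0, tail)


def benjamini_hochberg(pvals, alpha):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            cutoff = rank  # keep the largest rank that passes its threshold
    return sorted(order[:cutoff])


def differential_bins(before, after, n_bins, alpha=0.05):
    """Return (low, high) ranges of bins enriched for post-stimulus cells.

    Assumption: bins hold equal numbers of pooled observations, and under the
    null the post-stimulus count in a bin is treated as Binomial(bin size, theta).
    """
    pooled = sorted([(x, 0) for x in before] + [(x, 1) for x in after])
    theta = len(after) / len(pooled)  # overall share of post-stimulus cells
    size = len(pooled) // n_bins
    ranges, pvals = [], []
    for b in range(n_bins):
        start = b * size
        stop = (b + 1) * size if b < n_bins - 1 else len(pooled)
        chunk = pooled[start:stop]
        k = sum(label for _, label in chunk)  # post-stimulus cells in this bin
        ranges.append((chunk[0][0], chunk[-1][0]))
        pvals.append(binom_upper_tail(k, len(chunk), theta))
    return [ranges[i] for i in benjamini_hochberg(pvals, alpha)]


# Simulated marker intensities: the post-stimulus sample has extra mass in [0.9, 1.0].
rng = random.Random(0)
before = [rng.random() for _ in range(2000)]
after = [rng.random() for _ in range(1500)] + [rng.uniform(0.9, 1.0) for _ in range(500)]

regions = differential_bins(before, after, n_bins=20)
for low, high in regions:
    print(f"differential bin: [{low:.3f}, {high:.3f}]")
```

In this sketch the resolution is fixed by `n_bins`. A fine-to-coarse scheme such as the one the abstract describes would go on to pool neighbouring bins that were not rejected and test them again at a coarser layer, with thresholds chosen so that FDR control holds across layers.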