Background: Understanding cellular diversity throughout the body is essential for elucidating the complex functions of biological systems. Recently, large-scale single-cell omics datasets, known as omics atlases, have become available. These atlases encompass data from diverse tissues and cell types, providing insights into the landscape of cell-type-specific gene expression. However, the isolated effect of the tissue environment has not been thoroughly investigated. Evaluating this isolated effect is challenging because it is statistically confounded with cell-type effects, a consequence of the strongly biased combinations of tissues and cell types within the body. Results: This study introduces a novel data analysis framework, Combinatorial Sub-dataset Extraction for Confounding Reduction (COSER), which addresses statistical confounding by using graph theory to enumerate appropriate sub-datasets. COSER enables the assessment of the isolated effects of discrete variables in single cells. Applying COSER to the Tabula Muris Senis single-cell transcriptome atlas, we characterized the isolated impact of tissue environments. Our findings demonstrate that some genes are markedly affected by the tissue environment, particularly in modulating intercellular diversity in immune responses and their age-related changes. Conclusion: COSER provides a robust, general-purpose framework for evaluating the isolated effects of discrete variables in large-scale data mining. This approach reveals critical insights into the interplay between tissue environments and gene expression.