Microbiome data analyses require statistical tools that can simultaneously decode microbes' reactions to the environment and interactions among microbes. We introduce CARlasso, the first user-friendly, open-source, publicly available R package to fit a chain graph model for the inference of sparse microbial networks that represent both interactions among nodes and effects of a set of predictors. Unlike in standard regression approaches, the edges represent the correct conditional structure among responses and predictors, which allows the incorporation of prior knowledge from controlled experiments. In addition, CARlasso 1) enforces sparsity in the network via LASSO; 2) allows for an adaptive extension that applies different shrinkage to different edges; 3) is computationally inexpensive thanks to an efficient Gibbs sampling algorithm, so it can handle both small and large datasets; 4) allows for continuous, binary, count, and compositional responses via a proper hierarchical structure; and 5) has a syntax similar to lm for ease of use. The package also supports the Bayesian graphical LASSO and several of its hierarchical models, as well as lower-level one-step sampling functions of the CAR-LASSO model for users to extend.
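To illustrate why conditional edges differ from standard regression coefficients, the following is a minimal sketch in plain Python rather than with the package itself. It assumes the usual Gaussian chain graph parameterization, in which the responses given a predictor x have mean Ω⁻¹(Bᵀx + μ) and covariance Ω⁻¹; this parameterization is not stated in the abstract, and the matrices `omega` and `B` below are made-up toy values, not estimates from any data.

```python
# Toy chain graph with 2 responses (e.g. two taxa) and 1 predictor.
# Assumed model (not stated in the abstract): Y | x ~ N(Omega^{-1}(B^T x + mu), Omega^{-1}).
# All numbers are hypothetical and chosen only for illustration.

def inverse_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Precision matrix among responses: the nonzero off-diagonal entry is an edge
# between response 1 and response 2.
omega = [[2.0, -1.0], [-1.0, 2.0]]

# Conditional predictor effects (1 predictor x 2 responses):
# the predictor has a direct edge to response 1 only.
B = [[1.0, 0.0]]

# Coefficients a standard multivariate regression would target: B * Omega^{-1}.
marginal = matmul(B, inverse_2x2(omega))

print("conditional effects:", B[0])
print("marginal regression coefficients:", [round(v, 3) for v in marginal[0]])
```

In this toy setting the marginal coefficient for response 2 is 1/3 even though its conditional effect is zero: the predictor reaches response 2 only through response 1. A sparse estimate of B and Ω separates these two pathways, whereas a plain regression would report an association for both responses.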