We propose an estimation procedure for covariation in wide compositional data sets. For compositions, widely-used logratio variables are interdependent due to a common reference. Logratio uncorrelated compositions are linearly independent before the unit-sum constraint is imposed. We show how they are used to construct bespoke shrinkage targets for logratio covariance matrices and test a simple procedure for partial correlation estimates on both a simulated and a single-cell gene expression data set. For the underlying counts, different zero imputations are evaluated. The partial correlation induced by the closure is derived analytically. Data and code are available from GitHub.
翻译:暂无翻译