In modern experimental science there is a commonly encountered problem of estimating the coefficients of a linear regression in the context where the variables of interest can never be observed simultaneously. Assuming that the global experiment can be decomposed into sub-experiments with distinct first moments, we propose two estimators of the linear regression that take this additional information into account. We consider an estimator based on moments, and an estimator based on optimal transport theory. These estimators are proven to be consistent as well as asymptotically Gaussian under weak hypotheses. The asymptotic variance has no explicit expression, except in some particular cases, for which reason a stratified bootstrap approach is developed to build confidence intervals for the estimated parameters, whose consistency is also shown. A simulation study, assessing and comparing the finite sample performances of these estimators, demonstrated the advantages of the bootstrap approach in multiple realistic scenarios. An application to in vivo experiments, conducted in the context of studying radio-induced adverse effects on mice, revealed important relationships between the biomarkers of interest that could not be identified with the considered naive approach.
翻译:暂无翻译