The United States Environmental Protection Agency (EPA) screens thousands of chemicals, primarily to distinguish those that are active from those that are inactive for different types of biological endpoints. However, it is not feasible to test all possible combinations of chemicals, assay endpoints, and concentrations, so the majority of combinations are missing. Our goal is to derive posterior probabilities of activity for each chemical-by-assay-endpoint combination. We are therefore faced with a matrix completion task in the context of hypothesis testing for sparse functional data. We propose a Bayesian hierarchical framework that borrows information across chemicals and assay endpoints. Our model predicts bioactivity profiles, that is, whether each dose-response curve is constant or not, using low-dimensional latent attributes of chemicals and of assay endpoints. This framework facilitates out-of-sample prediction of bioactivity potential for new chemicals not yet tested, while capturing heteroscedastic residuals. We demonstrate its performance via extensive simulation studies and an application to data from the EPA's ToxCast/Tox21 program. Our approach allows more realistic and stable estimation of potential toxicity, as shown for two disease outcomes: neurodevelopmental disorders and obesity.
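As a rough illustration of the latent-attribute idea described in the abstract, the sketch below maps low-dimensional chemical and assay-endpoint factors to a matrix of activity probabilities. It is not the authors' model: the logistic link, the factor dimension `k`, the function name `activity_probability`, and the randomly drawn factors are all assumptions made for illustration, and no Bayesian fitting or dose-response modelling is performed.

```python
import numpy as np

def activity_probability(chem_factors, assay_factors, intercept=0.0):
    """Map latent factors to a (chemicals x endpoints) matrix of probabilities.

    Assumption: a logistic link on the inner product of the factors.
    The abstract does not state which link the actual model uses.
    """
    logits = intercept + chem_factors @ assay_factors.T
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
n_chem, n_assay, k = 6, 4, 2  # hypothetical sizes and latent dimension

# Placeholder latent attributes; in a fitted model these would be inferred.
chem_factors = rng.normal(size=(n_chem, k))
assay_factors = rng.normal(size=(n_assay, k))

# Every chemical-by-endpoint cell gets a probability, including cells
# that were never tested (the matrix completion aspect).
prob = activity_probability(chem_factors, assay_factors)

# A new chemical only needs its own latent vector to be scored against
# all endpoints. How that vector is obtained is not specified in the
# abstract, so it is drawn at random here as a placeholder.
new_chem = rng.normal(size=(1, k))
new_prob = activity_probability(new_chem, assay_factors)
```

Because each cell depends only on one chemical vector and one endpoint vector, chemicals tested on few endpoints can still be scored on all of them, which is the sense in which such a factorization borrows information across rows and columns.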