Environmental health researchers may aim to identify exposure patterns that reflect the sources, product use, or behaviors giving rise to mixtures of potentially harmful environmental chemical exposures. We present Bayesian non-parametric non-negative matrix factorization (BN^2MF), a novel method for identifying patterns of chemical exposure when the number of patterns is not known a priori. We placed non-negative continuous priors on pattern loadings and individual scores to enhance interpretability, and used a non-parametric sparse prior to estimate the number of patterns. We further derived variational confidence intervals around the estimates; this is a critical development because it quantifies the model's confidence in the estimated patterns. These features contrast with existing pattern recognition methods used in this field, which are limited by a user-specified number of patterns, by patterns that are difficult to interpret in human terms, and by a lack of uncertainty quantification.
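To make the factorization in the abstract concrete, the following is a minimal sketch of ordinary non-negative matrix factorization, not of BN^2MF itself. It assumes an exposure matrix X (individuals by chemicals) approximated as the product of non-negative individual scores W and non-negative pattern loadings H, fitted with the standard Lee-Seung multiplicative updates. Unlike the method described above, it requires the number of patterns k to be chosen by the user, places no priors on W or H, and gives no uncertainty estimates. The function name, variable names, and toy data are all illustrative assumptions.

```python
import numpy as np


def nmf_multiplicative(X, k, n_iter=500, eps=1e-10, seed=0):
    """Plain NMF (not BN^2MF): find W, H >= 0 with X approximately W @ H.

    Uses Lee-Seung multiplicative updates for the squared-error objective.
    k (the number of patterns) is fixed by the caller here, which is exactly
    the limitation the abstract says BN^2MF is meant to remove.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.random((n, k)) + eps  # individual scores (hypothetical name)
    H = rng.random((k, p)) + eps  # pattern loadings (hypothetical name)
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative throughout.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data (fabricated for illustration only): 50 individuals, 6 chemicals,
# generated from 2 non-negative patterns.
rng = np.random.default_rng(1)
true_scores = rng.gamma(2.0, 1.0, size=(50, 2))
true_loadings = rng.gamma(2.0, 1.0, size=(2, 6))
X = true_scores @ true_loadings

W, H = nmf_multiplicative(X, k=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

In this sketch each row of H can be read as one exposure pattern across chemicals and each row of W as one individual's scores on those patterns; the non-negativity is what makes that additive reading possible.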