In recent years, animal diet has received increased attention, particularly the impact of pasture-based feeding strategies on the quality of milk and dairy products, in line with the growing prevalence of grass-fed dairy products on market shelves. To date, few testing methods are available for verifying grass-fed dairy, so these products are susceptible to food fraud and adulteration. Statistical tools for studying potential differences among milk samples from animals on different feeding systems are therefore required, to provide increased security around product authenticity. Infrared spectroscopy techniques are widely used to collect data on milk samples and to predict milk-related traits. While these data are routinely used to predict the macro-component composition of milk, each spectrum provides a reservoir of unharnessed information about the sample. Interpreting these data presents a number of challenges because of their high dimensionality and the relationships among the spectral variables. In this work, we propose a modification of standard factor analysis that induces a parsimonious summary of spectroscopic data. The procedure maps the observations into a low-dimensional latent space while simultaneously clustering the observed variables. The method indicates possible redundancies in the data and helps disentangle the complex relationships among the wavelengths. A flexible Bayesian estimation procedure is proposed for model fitting, providing reasonable values for the number of latent factors and clusters. The method is applied to milk mid-infrared spectroscopy data from dairy cows on different pasture-based and non-pasture-based diets, providing accurate modelling of the data correlation, a clustering of the variables, and information on differences between milk samples from cows on different diets.
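To make the idea of mapping observations to a latent space while grouping the observed variables more concrete, the sketch below uses a simple two-stage stand-in: it fits an ordinary factor analysis and then clusters the variables by their estimated loadings. This is not the Bayesian procedure proposed in this work, which performs both steps simultaneously and also informs the choice of the number of factors and clusters. The function name `factor_then_cluster`, the synthetic data, and the fixed numbers of factors and clusters are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis


def factor_then_cluster(X, n_factors=2, n_clusters=3, seed=0):
    """Hypothetical two-stage approximation: factor analysis, then
    k-means on the rows of the loading matrix (one row per variable)."""
    fa = FactorAnalysis(n_components=n_factors, random_state=seed)
    scores = fa.fit_transform(X)      # latent scores, shape (n_samples, n_factors)
    loadings = fa.components_.T       # loadings, shape (n_variables, n_factors)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(loadings)  # cluster label for each observed variable
    return scores, loadings, labels


# Synthetic stand-in for spectra (assumed, not real milk data):
# 30 "wavelengths" in 3 groups, each group sharing one loading vector.
rng = np.random.default_rng(0)
n_samples, n_vars = 200, 30
true_groups = np.repeat([0, 1, 2], 10)
group_loadings = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]])
Lambda = group_loadings[true_groups]            # shape (n_vars, 2)
z = rng.normal(size=(n_samples, 2))             # latent factors
X = z @ Lambda.T + 0.1 * rng.normal(size=(n_samples, n_vars))

scores, loadings, labels = factor_then_cluster(X)
print(scores.shape, loadings.shape)
print(labels)
```

Variables that receive the same label have similar loading vectors, which is one informal way to read redundancy among wavelengths. Because factor loadings are identified only up to rotation, the clustering here relies on distances between loading rows, which an orthogonal rotation leaves unchanged.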