In materials science and engineering, statistical analysis and machine learning techniques have recently been used to predict multiple material properties from an experimental design. These material properties correspond to the response variables of a multivariate regression model. This study uses a penalized maximum likelihood procedure to estimate the model parameters, namely the regression coefficients and the covariance matrix of the response variables. In particular, we employ $\ell_1$-regularization to obtain sparse estimates of the regression coefficients and of the inverse covariance matrix of the response variables. In some cases, the response variables contain a relatively large number of missing values because data on material properties are difficult to collect. To improve prediction accuracy when values are missing, we incorporate the correlation structure among the response variables into the statistical model. We construct an expectation-maximization (EM) algorithm, which allows the procedure to be applied to data sets with missing values in the responses. We apply the proposed procedure to real data consisting of 22 material properties.
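To illustrate how a correlation structure among responses can help with missing values, the following is a minimal sketch of the conditional-mean part of an E-step under a Gaussian multivariate regression model. It is not the procedure of this study: the function name `e_step_impute`, the arguments `X`, `B`, and `Sigma`, and the use of NaN to mark missing entries are all assumptions made for illustration, and a full E-step would also need the conditional covariance of the missing entries.

```python
import numpy as np

def e_step_impute(Y, X, B, Sigma):
    """Replace NaNs in Y by their conditional means given the observed responses.

    Assumed (hypothetical) model: each row y_i ~ N(B^T x_i, Sigma).
    Y: (n, q) responses with NaN for missing entries
    X: (n, p) design matrix, B: (p, q) coefficients, Sigma: (q, q) covariance
    """
    Y_hat = Y.copy()
    M = X @ B  # fitted means for every sample and response
    for i in range(Y.shape[0]):
        miss = np.isnan(Y[i])
        if not miss.any():
            continue  # nothing to impute in this row
        obs = ~miss
        if not obs.any():
            # No observed response: fall back to the regression mean.
            Y_hat[i, miss] = M[i, miss]
            continue
        S_oo = Sigma[np.ix_(obs, obs)]   # covariance among observed responses
        S_mo = Sigma[np.ix_(miss, obs)]  # covariance between missing and observed
        resid = Y[i, obs] - M[i, obs]
        # Conditional mean: mu_m + S_mo S_oo^{-1} (y_o - mu_o)
        Y_hat[i, miss] = M[i, miss] + S_mo @ np.linalg.solve(S_oo, resid)
    return Y_hat

# Toy usage: two correlated responses, the second one missing.
X = np.ones((1, 1))
B = np.zeros((1, 2))
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
Y = np.array([[2.0, np.nan]])
Y_filled = e_step_impute(Y, X, B, Sigma)
```

In this toy case the missing response is pulled toward the observed one in proportion to their covariance. With a diagonal `Sigma` the imputation would reduce to the regression mean, which is why modeling the correlation can matter when many responses are missing.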