The ever-increasing quantity of multivariate process data is driving a need for skilled engineers who can analyze, interpret, and build models from such data. Multivariate data analytics relies heavily on linear algebra, optimization, and statistics, and can be challenging for students to understand because most curricula do not cover these three topics in depth. This article describes interactive software, the Latent Variable Demonstrator (LAVADE), for teaching, learning, and understanding latent variable methods. In this software, users can interactively compare latent variable methods such as Partial Least Squares (PLS) and Principal Component Regression (PCR) with other regression methods such as the Least Absolute Shrinkage and Selection Operator (lasso), Ridge Regression (RR), and Elastic Net (EN). LAVADE helps to build intuition for choosing appropriate methods, tuning hyperparameters, and interpreting model coefficients, fostering a conceptual understanding of how the algorithms differ. The software contains a data generation method and three chemical process datasets, allowing results to be compared across datasets with different levels of complexity. LAVADE is released as open-source software so that others can apply and advance the tool for use in teaching or research.
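As a rough illustration of the kind of comparison described above, the following sketch contrasts one latent variable method (PCR) with one shrinkage method (RR) on synthetic, highly collinear data. It is not taken from LAVADE and does not reflect its implementation or data generation method; the function names (`make_data`, `ridge_coef`, `pcr_coef`), the data settings, and the hyperparameter values are all assumptions chosen for illustration.

```python
import numpy as np


def make_data(n_samples=60, n_features=20, n_latent=2, noise=0.1, seed=0):
    """Hypothetical generator: X and y are both driven by a few latent variables."""
    rng = np.random.default_rng(seed)
    T = rng.normal(size=(n_samples, n_latent))    # latent scores
    P = rng.normal(size=(n_latent, n_features))   # loadings
    X = T @ P + noise * rng.normal(size=(n_samples, n_features))
    y = T.sum(axis=1) + noise * rng.normal(size=n_samples)
    return X, y


def ridge_coef(Xc, yc, alpha):
    """Closed-form ridge regression on centered data: (X'X + alpha*I)^-1 X'y."""
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)


def pcr_coef(Xc, yc, n_components):
    """PCR on centered data: regress y on the leading principal component
    scores, then map the result back to coefficients on the original variables."""
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                        # (features x components)
    scores = Xc @ V                                # orthogonal columns, squared norms s**2
    gamma = (scores.T @ yc) / s[:n_components] ** 2
    return V @ gamma


X, y = make_data()
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

# Center with training statistics only, so the test set stays unseen.
x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
Xc, yc = X_train - x_mean, y_train - y_mean


def rmse(beta):
    """Root-mean-square prediction error on the held-out samples."""
    pred = (X_test - x_mean) @ beta + y_mean
    return float(np.sqrt(np.mean((y_test - pred) ** 2)))


# Hyperparameter values below are arbitrary illustrations, not tuned settings.
coefs = {
    "PCR, 2 components": pcr_coef(Xc, yc, 2),
    "PCR, 10 components": pcr_coef(Xc, yc, 10),
    "Ridge, alpha=0.1": ridge_coef(Xc, yc, 0.1),
    "Ridge, alpha=100": ridge_coef(Xc, yc, 100.0),
}
for name, beta in coefs.items():
    print(f"{name:20s} test RMSE = {rmse(beta):.3f}  ||beta|| = {np.linalg.norm(beta):.3f}")
```

Changing `n_components` and `alpha` and watching how the coefficient norm and test error respond gives a small, non-interactive taste of the hyperparameter tuning and coefficient interpretation the abstract refers to. PLS, lasso, and EN are omitted here only to keep the sketch short.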