This study presents a broad perspective on hybrid process modeling and optimization that combines scientific knowledge and data analytics in bioprocessing and chemical engineering through a science-guided machine learning (SGML) approach. We divide the approach into two major categories. The first refers to the case where a data-based ML model complements the first-principles, science-based model and makes it more accurate in prediction; the second corresponds to the case where scientific knowledge helps make the ML model more scientifically consistent. We present a detailed review of the scientific and engineering literature relating to the hybrid SGML approach and propose a systematic classification of hybrid SGML models. For applying ML to improve science-based models, we present expositions of the sub-categories of direct serial and parallel hybrid modeling and their combinations, inverse modeling, reduced-order modeling, uncertainty quantification in the process, and even discovery of the governing equations of the process model. For applying scientific principles to improve ML models, we discuss the sub-categories of science-guided design, learning, and refinement. For each sub-category, we identify its requirements, advantages, and limitations, together with its published and potential areas of application in bioprocessing and chemical engineering.
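As a concrete illustration of the first category, the following minimal sketch shows one possible form of a parallel hybrid model, in which a data-based correction is fitted to the residual of a science-based model and the two predictions are summed. Everything here is an assumption made for illustration: the first-order decay model, the function names `first_principles` and `fit_parallel_hybrid`, the polynomial used as a stand-in for an ML model, and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def first_principles(t, c0=1.0, k=0.5):
    """Hypothetical science-based model: first-order decay c0 * exp(-k * t)."""
    return c0 * np.exp(-k * t)


def fit_parallel_hybrid(t, y_measured, degree=3):
    """Fit a data-based correction to the residual of the science-based model.

    A polynomial stands in for the ML model purely to keep the sketch small;
    any regressor could take its place.
    """
    residual = y_measured - first_principles(t)
    coeffs = np.polyfit(t, residual, degree)

    def predict(t_new):
        # Parallel structure: science-based prediction plus learned correction.
        return first_principles(t_new) + np.polyval(coeffs, t_new)

    return predict


# Synthetic "measurements" (fabricated for illustration only): the assumed
# decay plus a linear drift that the science-based model does not capture.
t = np.linspace(0.0, 5.0, 50)
y_measured = np.exp(-0.5 * t) + 0.05 * t

hybrid = fit_parallel_hybrid(t, y_measured)

rmse_science = np.sqrt(np.mean((first_principles(t) - y_measured) ** 2))
rmse_hybrid = np.sqrt(np.mean((hybrid(t) - y_measured) ** 2))
print(f"science-based RMSE: {rmse_science:.4f}")
print(f"parallel hybrid RMSE: {rmse_hybrid:.2e}")
```

In this toy setting the correction recovers the unmodeled drift almost exactly, so the hybrid error falls well below that of the science-based model alone. A serial hybrid would instead use the data-based model to estimate a parameter such as `k` that is then fed into the science-based model.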