Augmenting data-driven models for energy systems through feature engineering: A Python framework for feature engineering

Data-driven modeling is an increasingly popular approach in energy systems modeling. It applies machine learning methods such as linear regression, neural networks, or decision-tree-based methods. While these methods do not require domain knowledge, they are sensitive to data quality, so improving the quality of a dataset benefits the resulting machine learning models. Data quality can be improved through preprocessing. One type of preprocessing is feature engineering, which evaluates and improves the quality of individual features in the dataset; it includes feature creation, feature expansion, and feature selection. This work presents a Python framework, built on the scikit-learn library, that implements methods for feature creation, expansion, and selection, as well as methods for transforming and filtering data. The framework is demonstrated in a case study on energy demand prediction, in which a data-driven model is created using selected feature engineering methods. The results show that the engineered features improve prediction accuracy.
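
The three method families named in the abstract (feature creation, expansion, and selection) can be illustrated with plain scikit-learn. The following is only a minimal sketch, not the framework's own API: the synthetic hourly demand data, the helper names `make_synthetic_demand`, `add_cyclic_hour`, and `fit_and_score`, and all parameter values are assumptions made for illustration. It compares a linear regression on raw inputs against the same model with engineered features.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def make_synthetic_demand(n_hours=2000, seed=0):
    """Assumption: toy hourly data, not the dataset used in the paper."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_hours)
    hour = t % 24
    temperature = 10 + 8 * np.sin(2 * np.pi * t / (24 * 30)) + rng.normal(0, 2, n_hours)
    # Demand depends nonlinearly on temperature and cyclically on hour of day.
    demand = (50 + 0.3 * (temperature - 18) ** 2
              + 10 * np.sin(2 * np.pi * hour / 24)
              + rng.normal(0, 1, n_hours))
    return hour, temperature, demand


def add_cyclic_hour(hour, temperature):
    """Feature creation: encode hour of day as sine/cosine components."""
    angle = 2 * np.pi * np.asarray(hour) / 24
    return np.column_stack([temperature, np.sin(angle), np.cos(angle)])


def fit_and_score(X, y, engineered):
    """Fit a linear model, optionally with expansion and selection steps."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)
    steps = [("scale", StandardScaler())]
    if engineered:
        # Feature expansion: degree-2 polynomial and interaction terms.
        steps.append(("expand", PolynomialFeatures(degree=2, include_bias=False)))
        # Feature selection: keep the 5 features with the highest F-score.
        steps.append(("select", SelectKBest(score_func=f_regression, k=5)))
    steps.append(("model", LinearRegression()))
    model = Pipeline(steps).fit(X_train, y_train)
    return r2_score(y_test, model.predict(X_test))


hour, temperature, demand = make_synthetic_demand()
baseline_r2 = fit_and_score(
    np.column_stack([hour, temperature]), demand, engineered=False)
engineered_r2 = fit_and_score(
    add_cyclic_hour(hour, temperature), demand, engineered=True)
print(f"R^2 with raw features:        {baseline_r2:.3f}")
print(f"R^2 with engineered features: {engineered_r2:.3f}")
```

Wrapping every step in a `Pipeline` means the scaler, expansion, and selector are fitted on the training split only, which avoids leaking test data into the engineered features. The gain seen on this toy data says nothing about the size of the improvement reported in the paper's case study.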


