Limitations of machine learning for building energy prediction

Machine learning for building energy prediction has exploded in popularity in recent years, yet an understanding of its limitations and potential for improvement is lacking. The ASHRAE Great Energy Predictor III (GEPIII) Kaggle competition was the largest building energy meter machine learning competition ever held, with 4,370 participants who submitted 39,403 predictions. The test data set included two years of hourly electricity, hot water, chilled water, and steam readings from 2,380 meters in 1,448 buildings at 16 locations. This paper analyzes the sources and types of residual model error from an aggregation of the competition's top 50 solutions. The analysis reveals the limitations of machine learning that uses the standard model inputs of historical meter data, weather data, and basic building metadata. The types of error are classified according to how long the errors persist in each instance, whether they appear abruptly or gradually, the magnitude of the error, and whether the error occurred in a single building or in several buildings at once at a single location. The results show that machine learning models have errors within a range of acceptability on 79.1% of the test data. Lower magnitude model errors occur in 16.1% of the test data. These discrepancies can likely be addressed through additional training data sources or innovations in machine learning. Higher magnitude errors occur in 4.8% of the test data and are unlikely to be accurately predicted regardless of innovation. Error behavior varies with the energy meter type (electricity prediction models have unacceptable error in under 10% of the test data, while for hot water the figure is over 60%) and with the building use type (under 14% for public service, just over 46% for technology/science).
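As a rough illustration of the magnitude-based grouping the abstract describes, the sketch below labels each hourly reading as acceptable, lower magnitude, or higher magnitude error and reports the share of each group. The function names, the scaling by mean absolute reading, and the thresholds `low` and `high` are assumptions made for this example; the abstract does not state the criteria the paper actually uses.

```python
import numpy as np


def classify_errors(actual, predicted, low=0.1, high=0.3):
    """Label each reading by the magnitude of its prediction error.

    The absolute error is scaled by the mean absolute actual reading.
    `low` and `high` are hypothetical thresholds chosen for illustration,
    not values taken from the paper.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    scale = np.mean(np.abs(actual))
    if scale == 0:
        raise ValueError("actual readings are all zero; cannot scale errors")
    scaled = np.abs(predicted - actual) / scale
    # Nested np.where: first split off acceptable errors, then lower vs higher.
    return np.where(
        scaled <= low,
        "acceptable",
        np.where(scaled <= high, "lower", "higher"),
    )


def error_shares(labels):
    """Return the fraction of readings that fall into each error group."""
    labels = np.asarray(labels)
    return {
        name: float(np.mean(labels == name))
        for name in ("acceptable", "lower", "higher")
    }


if __name__ == "__main__":
    # Toy meter readings, invented for the example.
    actual = [10.0, 10.0, 10.0, 10.0]
    predicted = [10.5, 12.0, 15.0, 10.0]
    print(error_shares(classify_errors(actual, predicted)))
```

Applied to a full test set, shares computed this way would play the role of the 79.1%, 16.1%, and 4.8% figures quoted above, although the paper's classification also considers error duration, abruptness, and whether several buildings are affected at once.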

