In recent years, the availability of larger amounts of energy data and advanced machine learning algorithms has created a surge in building energy prediction research. However, one of the variables in energy prediction models, occupant behavior, is crucial for prediction performance but hard to measure or time-consuming to collect for each building. This study proposes an approach that uses the search volume of topics (e.g., education or Microsoft Excel) on the Google Trends platform as a proxy for occupant behavior and building use. Linear correlations between energy meter data and Google Trends search terms were first examined to infer building occupancy. Prediction errors before and after the inclusion of the trends of these terms were then compared and analyzed using the ASHRAE Great Energy Predictor III (GEPIII) competition dataset. The results show that highly correlated Google Trends data can reduce the overall RMSLE for a subset of the buildings to the level achieved by the GEPIII competition's top five winning teams. In particular, the RMSLE on public holidays and on days with site-specific schedules is reduced by 20-30% and 2-5%, respectively. These results show the potential of using Google Trends to improve energy prediction for a portion of the building stock by automatically identifying site-specific and holiday schedules.
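The two quantities the abstract relies on, the linear correlation between meter readings and a search-trend series and the RMSLE before and after adding that series, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the daily values are invented, the function names `pearson`, `rmsle`, and `linear_fit` are hypothetical, and the one-variable least-squares fit is only a stand-in for whatever prediction model the study actually uses.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def rmsle(actual, predicted):
    """Root mean squared logarithmic error for non-negative values."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2
               for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


# Invented daily values (assumption): meter readings in kWh and a 0-100
# search-volume index; days 3 and 6 mimic holidays with low occupancy.
meter_kwh = [120.0, 118.0, 40.0, 122.0, 119.0, 38.0, 121.0]
trend_index = [80.0, 78.0, 20.0, 82.0, 79.0, 18.0, 81.0]

# Step 1: check how strongly the trend series tracks the meter data.
correlation = pearson(meter_kwh, trend_index)

# Step 2: compare a baseline that ignores the trend (constant mean)
# with a toy in-sample fit that uses the trend as its only feature.
mean_kwh = sum(meter_kwh) / len(meter_kwh)
baseline_pred = [mean_kwh] * len(meter_kwh)
slope, intercept = linear_fit(trend_index, meter_kwh)
trend_pred = [slope * t + intercept for t in trend_index]

rmsle_without_trend = rmsle(meter_kwh, baseline_pred)
rmsle_with_trend = rmsle(meter_kwh, trend_pred)

print(f"correlation: {correlation:.3f}")
print(f"RMSLE without trend: {rmsle_without_trend:.3f}")
print(f"RMSLE with trend:    {rmsle_with_trend:.3f}")
```

Because the toy fit is evaluated on the same days it was fitted to, the gap between the two RMSLE values overstates what a held-out evaluation would show; the sketch only makes the comparison concrete.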