The production, shipping, usage, and disposal of consumer goods have a substantial impact on greenhouse gas emissions and the depletion of resources. Modern retail platforms rely heavily on Machine Learning (ML) for their search and recommender systems. Thus, ML can potentially support efforts towards more sustainable consumption patterns, for example, by accounting for sustainability aspects in product search or recommendations. However, leveraging ML's potential for reaching sustainability goals requires data on sustainability. Unfortunately, no open and publicly available database integrates sustainability information on a product-by-product basis. In this work, we present the GreenDB, which fills this gap. Based on search logs of millions of users, we prioritize the products that users care about most. The GreenDB schema extends the well-known schema.org Product definition and can be readily integrated into existing product catalogs to improve the sustainability information available for search and recommendation experiences. We present our proof-of-concept implementation of a scraping system that creates the GreenDB dataset.