The production, shipping, usage, and disposal of consumer goods have a substantial impact on greenhouse gas emissions and the depletion of resources. Machine Learning (ML) can help foster sustainable consumption patterns by accounting for sustainability aspects in the product search and recommendation systems of modern retail platforms. However, the lack of large, high-quality, publicly available product datasets with trustworthy sustainability information impedes the development of ML technology that can help reach our sustainability goals. Here we present GreenDB, a database that collects products from European online shops on a weekly basis. As a proxy for the products' sustainability, it relies on sustainability labels, which are evaluated by experts. The GreenDB schema extends the well-known schema.org Product definition and can be readily integrated into existing product catalogs. We present initial results demonstrating that ML models trained with our data can reliably predict the sustainability label of products (F1 score of 96%). These contributions can help complement existing e-commerce experiences and ultimately encourage users to adopt more sustainable consumption patterns.
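To make the idea of extending the schema.org Product definition more concrete, the following is a minimal sketch of what such a record could look like. It is not the actual GreenDB schema, which the abstract does not spell out. The properties `name`, `description`, `brand`, `url`, and `offers` are existing schema.org terms, whereas the class name `SustainableProduct`, the field `sustainability_labels`, the output key `sustainabilityLabels`, and the example label string are assumptions introduced purely for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class SustainableProduct:
    """Hypothetical product record: schema.org Product plus a label field."""

    # Properties that mirror existing schema.org Product / Offer terms.
    name: str
    description: str
    brand: str
    url: str
    price: float
    price_currency: str
    # Assumed extension (not taken from the paper): identifiers of
    # expert-evaluated sustainability labels attached to the product.
    sustainability_labels: List[str] = field(default_factory=list)

    def to_jsonld(self) -> dict:
        """Serialize to a JSON-LD-style dict using schema.org vocabulary."""
        return {
            "@context": "https://schema.org",
            "@type": "Product",
            "name": self.name,
            "description": self.description,
            "brand": {"@type": "Brand", "name": self.brand},
            "url": self.url,
            "offers": {
                "@type": "Offer",
                "price": self.price,
                "priceCurrency": self.price_currency,
            },
            # Non-standard key, assumed for this sketch only.
            "sustainabilityLabels": list(self.sustainability_labels),
        }


if __name__ == "__main__":
    # Entirely made-up example data.
    product = SustainableProduct(
        name="Example T-Shirt",
        description="A plain cotton T-shirt.",
        brand="ExampleBrand",
        url="https://shop.example/t-shirt",
        price=19.99,
        price_currency="EUR",
        sustainability_labels=["EXAMPLE_ORGANIC_LABEL"],
    )
    print(json.dumps(product.to_jsonld(), indent=2))
```

Because the standard Product properties are left untouched, a catalog that already stores schema.org-style records would, under these assumptions, only need to carry one additional field to hold the label information.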