Generative AI and the Digital Commons

Many generative foundation models (or GFMs) are trained on publicly available data and use public infrastructure, but 1) may degrade the "digital commons" that they depend on, and 2) do not have processes in place to return captured value to data producers and stakeholders. Existing conceptions of data rights and protection (focusing largely on individually owned data and associated privacy concerns) and copyright or licensing-based models offer some instructive priors, but are ill-suited for the issues that may arise from models trained on commons-based data. We outline the risks posed by GFMs and why they are relevant to the digital commons, and propose numerous governance-based solutions that include investments in standardized dataset/model disclosure and other kinds of transparency when it comes to generative models' training and capabilities, consortia-based funding for monitoring/standards/auditing organizations, requirements or norms for GFM companies to contribute high-quality data to the commons, and structures for shared ownership based on individual or community provision of fine-tuning data.

