The World Wide Web, a ubiquitous source of information, serves as a primary resource for countless individuals, amassing a vast amount of data from global internet users. However, this online data, when scraped, indexed, and utilized for activities like web crawling, search engine indexing, and, notably, AI model training, often diverges from the original intent of its contributors. The ascent of Generative AI has accentuated concerns surrounding data privacy and copyright infringement. Regrettably, the web's current framework falls short in facilitating pivotal actions like consent withdrawal or data copyright claims. While some companies offer voluntary measures, such as crawler access restrictions, these often remain inaccessible to individual users. To empower online users to exercise their rights and enable companies to adhere to regulations, this paper introduces a user-controlled consent tagging framework for online data. It leverages the extensibility of HTTP and HTML in conjunction with the decentralized nature of distributed ledger technology. With this framework, users have the ability to tag their online data at the time of transmission, and subsequently, they can track and request the withdrawal of consent for their data from the data holders. A proof-of-concept system is implemented, demonstrating the feasibility of the framework. This work holds significant potential for contributing to the reinforcement of user consent, privacy, and copyright on the modern internet and lays the groundwork for future insights into creating a more responsible and user-centric web ecosystem.
翻译:暂无翻译