The "Decentralised Web" (DW) is an evolving concept, which encompasses technologies aimed at providing greater transparency and openness on the web. The DW relies on independent servers (aka instances) that mesh together in a peer-to-peer fashion to deliver a range of services (e.g. micro-blogs, image sharing, video streaming). However, toxic content moderation in this decentralised context is challenging. This is because there is no central entity that can define toxicity, nor a large central pool of data that can be used to build universal classifiers. It is therefore unsurprising that there have been several high-profile cases of the DW being misused to coordinate and disseminate harmful material. Using a dataset of 9.9M posts from 117K users on Pleroma (a popular DW microblogging service), we quantify the presence of toxic content. We find that toxic content is prevalent and spreads rapidly between instances. We show that automating per-instance content moderation is challenging due to the lack of sufficient training data available and the effort required in labelling. We therefore propose and evaluate ModPair, a model sharing system that effectively detects toxic content, gaining an average per-instance macro-F1 score 0.89.