We consider multi-party protocols for classification, motivated by applications such as e-discovery in court proceedings. We identify a protocol that guarantees that the requesting party receives all responsive documents while the sending party discloses the minimum number of non-responsive documents necessary to prove that all responsive documents have been received. This protocol can be embedded in a machine learning framework that enables automated labeling of points. The resulting multi-party protocol is equivalent to the standard one-party classification problem, provided that the one-party classification problem satisfies a natural independence-of-irrelevant-alternatives property. Our formal guarantees focus on the case where there is a linear classifier that correctly partitions the documents.