With the growing volume of data in the art world, discovering artists and artworks that suit collectors' tastes has become a challenge. Visual information alone is no longer sufficient, because contextual information about the artist has become just as important in contemporary art. In this work, we present ArtLM, a generic Natural Language Processing framework for discovering connections among contemporary artists based on their biographies. We first continue pre-training existing general English language models on a large amount of unlabelled art-related data. We then fine-tune the resulting model on our biography-pair dataset, which was manually annotated by a team of art-industry professionals. In extensive experiments, ArtLM achieves 85.6% accuracy and an 84.0% F1 score, outperforming the baseline models. We also provide a visualisation and a qualitative analysis of the artist network built from ArtLM's outputs.
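To make the evaluation and the network construction concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the fine-tuned model yields a connection score in [0, 1] for each biography pair, that a pair counts as connected when its score reaches a threshold of 0.5, and that F1 is computed for the positive ("connected") class. The function names, the threshold, and the toy data are all hypothetical; the abstract does not specify any of them.

```python
from collections import defaultdict


def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, where 1 means "connected".

    Assumption: F1 is taken over the positive class; the abstract does not
    say which averaging scheme was used.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def build_artist_network(pair_scores, threshold=0.5):
    """Build an undirected adjacency map from pairwise connection scores.

    pair_scores maps (artist_a, artist_b) to a hypothetical model score.
    The threshold value is an illustrative assumption.
    """
    graph = defaultdict(set)
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Toy labels and predictions for four biography pairs (made-up data).
acc, f1 = accuracy_and_f1([1, 0, 1, 1], [1, 0, 0, 1])

# Toy scores for three hypothetical artists.
network = build_artist_network(
    {("A", "B"): 0.9, ("B", "C"): 0.2, ("A", "C"): 0.7}
)
```

Keeping the graph as a plain adjacency map leaves the choice of visualisation tool open; any graph library could consume the same edges.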