We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10 distinctive artistic styles, with 5,000 training images and 1,000 testing images per style. ArtBench-10 has several advantages over previous artwork datasets. First, it is class-balanced, whereas most previous artwork datasets suffer from long-tailed class distributions. Second, the images are of high quality and have clean annotations. Third, ArtBench-10 is created with standardized data collection, annotation, filtering, and preprocessing procedures. We provide three versions of the dataset at different resolutions ($32\times32$, $256\times256$, and original image size), formatted so that they are easy to incorporate into popular machine learning frameworks. We also conduct extensive benchmarking experiments on ArtBench-10 using representative image synthesis models and present an in-depth analysis. The dataset is available at https://github.com/liaopeiyuan/artbench under a Fair Use license.
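The split sizes stated above can be checked with simple arithmetic. The following is a minimal sketch, not part of the released dataset tooling: the helper names `split_sizes` and `is_class_balanced` and the integer style ids 0 to 9 are assumptions made for illustration only.

```python
from collections import Counter

# Figures stated in the abstract.
NUM_STYLES = 10
TRAIN_PER_STYLE = 5_000
TEST_PER_STYLE = 1_000


def split_sizes(num_styles=NUM_STYLES,
                train_per_style=TRAIN_PER_STYLE,
                test_per_style=TEST_PER_STYLE):
    """Return (train_total, test_total, overall_total) for a balanced dataset."""
    train_total = num_styles * train_per_style
    test_total = num_styles * test_per_style
    return train_total, test_total, train_total + test_total


def is_class_balanced(labels):
    """Return True if every label in a non-empty sequence occurs equally often."""
    counts = Counter(labels)
    return len(counts) > 0 and len(set(counts.values())) == 1


# Hypothetical label vector: integer style ids 0..9, 5,000 of each (an assumption).
train_labels = [style for style in range(NUM_STYLES) for _ in range(TRAIN_PER_STYLE)]

print(split_sizes())                    # (50000, 10000, 60000)
print(is_class_balanced(train_labels))  # True
```

The same `is_class_balanced` check could be applied to the label array of any other artwork dataset to see whether it has the long-tailed distribution that the abstract contrasts against.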