In this paper, we present "BIKED," a dataset of 4500 individually designed bicycle models sourced from hundreds of designers. We expect BIKED to enable a variety of data-driven design applications for bicycles and, more generally, to support the development of data-driven design methods. The dataset comprises a variety of design information, including assembly images, component images, numerical design parameters, and class labels. We first discuss the processing of the dataset and present the features it provides. We then illustrate the scale, variety, and structure of the data using several unsupervised clustering studies. Next, we explore a variety of data-driven applications. We provide baseline classification performance for 10 algorithms trained on differing amounts of training data. We then contrast the classification performance of three deep neural networks using parametric data, image data, and a combination of the two. Using one of the trained classification models, we conduct a Shapley Additive Explanations (SHAP) analysis to better understand the extent to which certain design parameters impact classification predictions. Next, we test bike reconstruction and design synthesis using two Variational Autoencoders (VAEs) trained on images and parametric data. Furthermore, we contrast the performance of interpolation and extrapolation tasks in the original parameter space and in the latent space of a VAE. Finally, we discuss possibilities for applications beyond the few explored in this paper and summarize the overall strengths and weaknesses of the dataset.
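To make the contrast between parameter-space and latent-space interpolation concrete, the following is a minimal sketch and not the method used in the paper. The function names, the two-element design vectors, and the `toy_encode`/`toy_decode` pair are all assumptions introduced for illustration; the toy pair is a simple invertible log/exp map standing in for a trained VAE encoder and decoder.

```python
import math
from typing import Callable, List

Vector = List[float]


def lerp(a: Vector, b: Vector, t: float) -> Vector:
    """Blend two vectors linearly.

    0 <= t <= 1 interpolates between a and b; t outside that range
    extrapolates beyond one of the endpoints.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def lerp_in_latent(
    a: Vector,
    b: Vector,
    t: float,
    encode: Callable[[Vector], Vector],
    decode: Callable[[Vector], Vector],
) -> Vector:
    """Encode both designs, blend the latent codes, then decode the result."""
    return decode(lerp(encode(a), encode(b), t))


# Hypothetical stand-ins for a trained encoder/decoder (assumption:
# all parameters are strictly positive so the logarithm is defined).
def toy_encode(x: Vector) -> Vector:
    return [math.log(v) for v in x]


def toy_decode(z: Vector) -> Vector:
    return [math.exp(v) for v in z]


# Two made-up design vectors; the values carry no physical meaning.
design_a = [1.0, 4.0]
design_b = [3.0, 16.0]

midpoint_param = lerp(design_a, design_b, 0.5)
midpoint_latent = lerp_in_latent(design_a, design_b, 0.5, toy_encode, toy_decode)
beyond_b = lerp(design_a, design_b, 1.5)  # extrapolation past design_b
```

Because the toy encoder is nonlinear, the two midpoints differ: blending in parameter space gives the arithmetic mean of each parameter, while blending in this toy latent space gives the geometric mean. A real VAE would replace `toy_encode` and `toy_decode` with its learned networks.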