Interest in product aesthetics analytics and design is growing. However, one of the biggest challenges facing analysts and researchers is the lack of available large-scale data covering a wide range of variables. In this paper, we present our multidisciplinary initiative to develop a comprehensive automotive dataset from different online sources and formats. Specifically, the dataset contains 1.4 million images of 899 car models, together with the corresponding model specifications and sales information spanning more than ten years in the UK market. Our work contributes to: (i) research and applications in the automotive industry; (ii) big data creation and sharing; (iii) database design; and (iv) data fusion. Beyond describing our motivation, technical details and data structure, we present three simple examples that demonstrate how the data can be used in business research and applications.