This paper describes the most significant data-related challenges involved in building internet-scale 3D search engines. The discussion centers on the most pressing data management issues in this domain, including model acquisition, support for multiple file formats, asset versioning, data integrity errors, the data lifecycle, intellectual property, and the legality of web crawling. The paper also discusses numerous issues that fall under the rubric of trustworthy computing, including privacy, security, inappropriate content, and copying/remixing of assets. The goal of the paper is to provide an overview of these general issues, illustrated by empirical data drawn from the internet's largest operational search engine. While numerous works have been published on 3D information retrieval, this paper is the first to discuss the real-world challenges that arise in building practical search engines at scale.