We present Ver, a data discovery system that identifies project-join views over large repositories of tables, even when the repositories contain no join path information and the input queries are inaccurate. Ver implements a reference architecture that addresses both the technical (scale and search) and human (semantic ambiguity, navigating a large number of results) problems of view discovery. A user study demonstrates that users find the view they want when using Ver, and large-scale end-to-end experiments on real-world datasets containing tens of millions of join paths demonstrate its performance.