A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different levels of domain and data science expertise. We introduce INODE, an end-to-end data exploration system that leverages Machine Learning on the one hand and semantics on the other for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus helping the user discover new insights. We demonstrate that our system is uniquely accessible to a wide range of users, from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.