Similar to Open Data initiatives, the data science community has launched initiatives for sharing not only data but also entire pipelines, derivatives, artifacts, etc. (Open Data Science). However, the few existing efforts focus on the technical aspects of how to facilitate sharing, conversion, etc. This vision paper goes a step further and proposes KEK, an open federated data science platform that not only allows sharing data science pipelines and their (meta)data but also provides methods for efficient search and, in the ideal case, even allows pipelines to be defined and combined across platforms in a federated manner. In doing so, KEK addresses the so far neglected challenge of actually finding artifacts that are semantically related and that can be combined to achieve a certain goal.