Research process automation (the reliable, efficient, and reproducible execution of linked sets of actions on scientific instruments, computers, data stores, and other resources) has emerged as an essential element of modern science. We report here on new services within the Globus research data management platform that enable the specification of diverse research processes as reusable sets of actions, called flows, and the execution of such flows in heterogeneous research environments. To support flows with broad spatial extent (e.g., from scientific instrument to remote data center) and temporal extent (from seconds to weeks), these Globus automation services feature: 1) cloud hosting for reliable execution of even long-lived flows despite sporadic failures; 2) a declarative notation and an extensible asynchronous action provider API for defining and executing a wide variety of actions and flow specifications involving arbitrary resources; and 3) authorization delegation mechanisms for secure invocation of actions. These services permit researchers to outsource and automate the management of a broad range of research tasks to a reliable, scalable, and secure cloud platform. We present use cases for Globus automation services, describe the design and implementation of the services, present microbenchmark studies, and review experiences applying the services in a range of applications.