Cloud computing offers platform as a service, software as a service, and infrastructure as a service, which data engineers and data scientists can leverage for their ETL/ELT (extract, transform, load / extract, load, transform) and ML (machine learning) requirements and deployments. The proposed framework for the comparative review of cloud computing platforms for data science workflows combines the analytic hierarchy process (AHP), Saaty's fundamental scale of absolute numbers, and a selection of relevant evaluation criteria: automation, error handling, fault tolerance, performance quality, unit testing, data encryption, monitoring, role-based access, security, availability, ease of use, integration, and interoperability. The framework lets users evaluate these criteria for cloud platforms used in data science workflows, and it recommends which cloud platform would suit the user based on the relative importance the user assigns to each criterion. The evaluations of the criteria are shown to be consistent, so the resulting weighting of criteria against the goal of selecting a cloud service provider or cloud platform is sensible. The proposed framework is robust enough to accommodate changes in criteria and alternatives, depending on the user's cloud platform requirements and the scope of the cloud platform selection.
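To make the weighting and consistency check concrete, the following is a minimal Python sketch of an AHP calculation, not the authors' implementation. It uses three of the criteria named above, and the pairwise judgments on Saaty's 1 to 9 scale are hypothetical values chosen only for illustration. The weights are derived with the row geometric mean method, a common approximation to the principal eigenvector. The function names are assumptions, and the 0.1 threshold is the conventional rule of thumb for an acceptable consistency ratio.

```python
import math

# Saaty's random consistency index (RI), keyed by matrix size n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix):
    """Approximate AHP priority weights via the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    weighted = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(v / w for v, w in zip(weighted, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgments: entry [i][j] states how much more important
# criterion i is than criterion j on Saaty's scale; [j][i] is the reciprocal.
criteria = ["security", "performance quality", "ease of use"]
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)

for name, weight in zip(criteria, weights):
    print(f"{name}: {weight:.3f}")
print(f"consistency ratio: {cr:.3f} (judgments usually accepted if below 0.1)")
```

The same two functions can be applied to pairwise comparisons of candidate platforms under each criterion. Multiplying those per-criterion scores by the criteria weights and summing gives an overall ranking of the alternatives.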