The integration of artificial intelligence capabilities into modern software systems is increasingly being simplified through the use of cloud-based machine learning services and representational state transfer (REST) architectures. However, insufficient information about underlying model provenance and a lack of control over model evolution impede the wider adoption of these services in operational environments with strict security requirements. Furthermore, tools such as TensorFlow Serving allow models to be deployed as RESTful endpoints, but require error-prone transformations for PyTorch models, whose dynamic computational graphs contrast with the static computational graphs of TensorFlow. To enable rapid deployment of PyTorch models without intermediate transformations, we developed FlexServe, a simple library for deploying multi-model ensembles with flexible batching.
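The core idea, serving a multi-model ensemble behind a single REST endpoint with client-determined batch sizes, can be sketched as follows. This is a minimal illustration, not FlexServe's actual API: the handler, the `/predict` route, and the stand-in model functions are all hypothetical, and a real deployment would call trained PyTorch `nn.Module` instances under `torch.no_grad()` instead of the plain Python functions used here.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-ins for PyTorch modules; a real service would load
# trained nn.Module objects and invoke them under torch.no_grad().
def model_a(batch):
    return [x * 2.0 for x in batch]

def model_b(batch):
    return [x + 1.0 for x in batch]

MODELS = [model_a, model_b]

def ensemble_predict(batch):
    """Run every model on the same batch and average their outputs."""
    outputs = [m(batch) for m in MODELS]
    return [sum(vals) / len(vals) for vals in zip(*outputs)]

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Accept a JSON body like {"inputs": [1.0, 2.0]}; the batch size
        # is whatever the client sends (flexible batching).
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        preds = ensemble_predict(payload["inputs"])
        body = json.dumps({"predictions": preds}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # suppress per-request logging
        pass

def serve(port=8080):
    """Start the ensemble endpoint on a background thread."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Because each model consumes the raw request batch directly, no intermediate graph transformation (e.g. conversion to a static-graph format) is required, which is the property the abstract highlights for PyTorch deployment.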