Containerization plays a crucial role in the de facto technology stack for implementing a microservices architecture, in which each microservice has its own database in most cases. Nevertheless, containerizing production databases remains fiercely debated, mainly because of data persistence concerns. Driven by a project to refactor an Automated Machine Learning system, this research proposes container-native data persistence as a conditional solution for running database containers in production. In essence, the proposed solution separates stateless data access (i.e., reading) from stateful data processing (i.e., creating, updating, and deleting) in databases. A master database handles the stateful data processing and dumps database copies for building container images, while the database containers remain stateless at runtime, serving data from the dump preloaded in the image. Although state/image updates propagate with a delay, this solution is particularly suitable for read-only, eventual-consistency, and asynchronous-processing scenarios. Moreover, with optimal tuning (e.g., disabling locking), the portability and performance gains of a read-only database container would outweigh the performance loss from accessing data across the underlying image layers.
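The split between a stateful master and stateless read-only copies can be illustrated with a minimal sketch. It is not the system described in this work: SQLite from the Python standard library stands in for the production database, a snapshot file stands in for the dump baked into a container image, and the `models` table and all function names are hypothetical. The `immutable=1` URI parameter is used here as one way to open a file without locking, in the spirit of the tuning mentioned above.

```python
import os
import sqlite3
import tempfile


def build_snapshot(master: sqlite3.Connection, snapshot_path: str) -> None:
    """Dump the master's current state into a standalone snapshot file.

    In the containerized setting this file would be copied into an image
    at build time (assumption: one snapshot per image version).
    """
    target = sqlite3.connect(snapshot_path)
    try:
        master.backup(target)  # consistent copy of the master database
    finally:
        target.close()


def open_read_only(snapshot_path: str) -> sqlite3.Connection:
    """Open the snapshot read-only and without locking (immutable=1)."""
    uri = f"file:{snapshot_path}?mode=ro&immutable=1"
    return sqlite3.connect(uri, uri=True)


def demo() -> dict:
    workdir = tempfile.mkdtemp()
    snapshot_path = os.path.join(workdir, "snapshot.db")

    # Stateful side: the master handles create/update/delete.
    master = sqlite3.connect(":memory:")
    master.execute("CREATE TABLE models (name TEXT PRIMARY KEY, score REAL)")
    master.execute("INSERT INTO models VALUES ('baseline', 0.71)")
    master.commit()

    # "Image build": freeze the current state.
    build_snapshot(master, snapshot_path)

    # The master keeps changing after the snapshot was taken.
    master.execute("INSERT INTO models VALUES ('tuned', 0.84)")
    master.commit()

    # Stateless side: the replica only sees the preloaded snapshot.
    replica = open_read_only(snapshot_path)
    replica_rows = [
        row[0] for row in replica.execute("SELECT name FROM models ORDER BY name")
    ]
    try:
        replica.execute("INSERT INTO models VALUES ('rogue', 0.0)")
        write_rejected = False
    except sqlite3.OperationalError:
        write_rejected = True  # writes are refused on the read-only copy

    master_rows = master.execute("SELECT COUNT(*) FROM models").fetchone()[0]
    replica.close()
    master.close()
    return {
        "replica_rows": replica_rows,
        "write_rejected": write_rejected,
        "master_rows": master_rows,
    }


if __name__ == "__main__":
    print(demo())
```

The replica lags the master until a new snapshot is built, which mirrors the propagation delay noted above and is why the approach fits read-only and eventually consistent workloads rather than ones that need fresh writes.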