A Unified Transferable Model for ML-Enhanced DBMS

Recently, the database management system (DBMS) community has witnessed the power of machine learning (ML) solutions for DBMS tasks. Despite their promising performance, these existing solutions can hardly be considered satisfactory. First, these ML-based methods are not effective enough because each is optimized for one specific task and cannot explore or understand the intrinsic connections between tasks. Second, the training process has serious limitations that hinder practicality, because the entire model must be retrained from scratch for a new DB. Moreover, each retraining requires an excessive amount of training data, which is very expensive to acquire and unavailable for a new DB. We propose to explore the transferability of ML methods both across tasks and across DBs to tackle these fundamental drawbacks. In this paper, we propose a unified model, MTMLF, that uses a multi-task training procedure to capture the transferable knowledge across tasks and a pre-train fine-tune procedure to distill the transferable meta knowledge across DBs. We believe this paradigm is more suitable for cloud DB services and has the potential to revolutionize the way ML is used in DBMS. Furthermore, to demonstrate the predictive power and viability of MTMLF, we provide a concrete and very promising case study on query optimization tasks. Last but not least, we discuss several concrete research opportunities along this line of work.

