Range aggregate queries (RAQs) are an integral part of many real-world applications, which often require fast, approximate answers. Recent work has studied answering RAQs with machine learning models, where a model of the data is learned and then used to answer queries. However, such modelling choices fail to utilize any query-specific information. To capture this information, we observe that RAQs can be represented by query functions: functions that take a query instance (i.e., a specific RAQ) as input and output its corresponding answer. Using this representation, we formulate the problem of learning to approximate the query function and propose NeuroDB, a query-specialized neural network framework that answers RAQs efficiently. NeuroDB is query-type agnostic (i.e., it makes no assumption about the underlying query type), and our observation that queries can be represented by functions is not specific to RAQs. We therefore investigate whether NeuroDB can be used for other query types by applying it to distance-to-nearest-neighbour queries. We experimentally show that NeuroDB outperforms the state-of-the-art for this query type, often by orders of magnitude. Moreover, it does so with the same neural network architecture used for RAQs, which points to the possibility of a generic framework that answers any query type efficiently.
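The query-function view described in the abstract can be illustrated with a minimal sketch. It is not NeuroDB's actual architecture or training procedure, which the abstract does not specify. It assumes a synthetic one-dimensional dataset, a COUNT range query encoded as `(lo, width)`, and a one-hidden-layer network trained by plain gradient descent; all names (`query_function`, `sample_queries`, `train`, `predict`) are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D dataset standing in for a real table column (assumption).
data = np.sort(rng.uniform(0.0, 1.0, size=1000))

def query_function(queries):
    """Exact query function for COUNT range queries.

    Each row of `queries` is one query instance (lo, width); the output is
    the fraction of data points that fall in [lo, lo + width].
    """
    queries = np.atleast_2d(queries)
    lo = queries[:, 0]
    hi = lo + queries[:, 1]
    counts = (np.searchsorted(data, hi, side="right")
              - np.searchsorted(data, lo, side="left"))
    return counts / data.size

def sample_queries(n):
    """Draw random query instances (assumed query distribution)."""
    lo = rng.uniform(0.0, 1.0, size=n)
    width = rng.uniform(0.0, 0.5, size=n)
    return np.stack([lo, width], axis=1)

def init_params(hidden=16):
    return {
        "W1": rng.normal(0.0, 0.5, size=(2, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Return hidden activations and predicted answers for queries X."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    return H, (H @ params["W2"] + params["b2"])[:, 0]

def train(params, X, y, lr=0.05, steps=500):
    """Full-batch gradient descent on mean squared error; returns loss history."""
    losses = []
    n = X.shape[0]
    for _ in range(steps):
        H, pred = forward(params, X)
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        d_pred = (2.0 / n) * err[:, None]      # gradient w.r.t. predictions
        d_H = d_pred @ params["W2"].T          # backpropagate to hidden layer
        d_Z = d_H * (1.0 - H ** 2)             # derivative of tanh
        grads = {
            "W1": X.T @ d_Z,
            "b1": d_Z.sum(axis=0),
            "W2": H.T @ d_pred,
            "b2": d_pred.sum(axis=0),
        }
        for key in params:
            params[key] -= lr * grads[key]
    return losses

def predict(params, queries):
    """Approximate answers: no access to `data` is needed at query time."""
    return forward(params, np.atleast_2d(queries))[1]

# Training pairs are (query instance, exact answer) samples of the query function.
train_queries = sample_queries(2000)
train_answers = query_function(train_queries)
params = init_params()
losses = train(params, train_queries, train_answers)
```

Once trained, `predict(params, [0.2, 0.1])` returns an approximate answer from the network weights alone, while `query_function([0.2, 0.1])` computes the exact one from the data. Replacing `query_function` with a function that returns, say, the distance to the nearest data point would leave the rest of the sketch unchanged, which is the sense in which the formulation is query-type agnostic.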