Significant effort has gone into the research and development of database management systems (DBMSs), which have a wide range of applications in managing enormous collections of multisource, heterogeneous, complex, or growing data. Besides the primary functions (i.e., create, delete, and update), a practical DBMS can interact with users through information selection, that is, querying according to their targets. Previous querying algorithms, such as frequent itemset querying and sequential pattern querying (SPQ), have focused on frequency and do not involve the concept of utility, which helps users discover more informative patterns. To apply querying technology to a wider range of applications, we incorporate utility into target-oriented SPQ and formulate the task of targeted utility-oriented sequence querying. To address this problem, we develop a novel algorithm, targeted high-utility sequence querying (TUSQ), based on two novel upper bounds, suffix remain utility and terminated descendants utility, as well as a vertical Last Instance Table structure. For further efficiency, TUSQ relies on a projection technique that uses a compact data structure called the targeted chain. An extensive experimental study on several real and synthetic datasets shows that the proposed algorithm outperforms the designed baseline algorithm in terms of runtime, memory consumption, and candidate filtering.
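To make the task concrete, the following is a minimal sketch of what a targeted utility-oriented sequence query asks for. It is not TUSQ: it uses none of the upper bounds, the Last Instance Table, or the targeted chain, and simply scores a hand-written candidate list by brute force. It also makes several simplifying assumptions that are not from the paper: each sequence element is a single item rather than an itemset, the utility of a pattern in a sequence is taken as the maximum over its occurrences, and all names (`pattern_utility`, `targeted_query`, `PROFIT`, `DB`) and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical toy data: unit profit per item (external utility).
PROFIT: Dict[str, int] = {"a": 2, "b": 5, "c": 1}

# Each sequence is an ordered list of (item, quantity) pairs.
DB: List[List[Tuple[str, int]]] = [
    [("a", 1), ("b", 2), ("a", 3), ("c", 1)],
    [("b", 1), ("a", 2), ("c", 4)],
]


def pattern_utility(pattern: Sequence[str],
                    qseq: Sequence[Tuple[str, int]]) -> Optional[int]:
    """Maximum utility over all occurrences of `pattern` in one sequence.

    Returns None when the pattern does not occur at all.
    """
    m = len(pattern)
    # best[j] = best utility of matching the first j pattern items so far.
    best: List[Optional[int]] = [0] + [None] * m
    for item, qty in qseq:
        u = qty * PROFIT[item]
        # Go right to left so one element extends each prefix at most once.
        for j in range(m, 0, -1):
            if pattern[j - 1] == item and best[j - 1] is not None:
                cand = best[j - 1] + u
                if best[j] is None or cand > best[j]:
                    best[j] = cand
    return best[m]


def contains_target(pattern: Sequence[str], target: Sequence[str]) -> bool:
    """True if `target` is an order-preserving subsequence of `pattern`."""
    it = iter(pattern)
    return all(x in it for x in target)


def targeted_query(db, candidates, target, min_util):
    """Keep candidates that contain `target` and reach `min_util` in total."""
    result = {}
    for pat in candidates:
        if not contains_target(pat, target):
            continue  # not a target-oriented pattern
        total = sum(u for u in (pattern_utility(pat, s) for s in db)
                    if u is not None)
        if total >= min_util:
            result[tuple(pat)] = total
    return result


CANDIDATES = [("a", "c"), ("b", "c"), ("b", "a", "c"), ("a", "b"), ("a", "b", "c")]
RESULT = targeted_query(DB, CANDIDATES, target=("a", "c"), min_util=14)
print(RESULT)
```

In this toy run, `("b", "c")` has high utility but is discarded because it does not contain the target `("a", "c")`, while `("a", "b", "c")` contains the target but falls below the threshold. A real algorithm would generate candidates itself and prune the search space with upper bounds instead of scoring every candidate exhaustively.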