Data processing and analytics are fundamental and pervasive. Algorithms play a vital role in both, and many algorithm designs incorporate heuristics and general rules drawn from human knowledge and experience to improve their effectiveness. Recently, reinforcement learning, and deep reinforcement learning (DRL) in particular, has been increasingly explored and exploited in many areas because it can learn better strategies than statically designed algorithms in the complicated environments it interacts with. Motivated by this trend, we provide a comprehensive review of recent works that use DRL to improve data processing and analytics. First, we introduce key concepts, theories, and methods in DRL. Next, we discuss DRL deployment in database systems, where it facilitates data processing and analytics in various aspects, including data organization, scheduling, tuning, and indexing. Then, we survey applications of DRL in data processing and analytics, ranging from data preparation and natural language processing to healthcare and fintech, among others. Finally, we discuss important open challenges and future research directions for using DRL in data processing and analytics.