The field of warehouse robotics is currently in high demand, with major technology and logistics companies making significant investments in these advanced systems. Training robots to operate in such complex environments is challenging, often requiring human supervision for adaptation and learning. Interactive reinforcement learning (IRL) is a key training methodology in human-computer interaction. This paper presents a comparative study of two IRL algorithms: Q-learning and SARSA, both trained in a virtual grid-simulation-based warehouse environment. To maintain consistent feedback rewards and avoid bias, feedback was provided by the same individual throughout the study.
翻译:暂无翻译