The multi-armed bandit (MAB) problem models a decision-maker that optimizes its actions based on current knowledge and newly acquired knowledge to maximize its reward. This type of online decision-making is prominent in many procedures of Brain-Computer Interfaces (BCIs), and MABs have previously been used to investigate, e.g., which mental commands to use to optimize BCI performance. However, MAB optimization in the context of BCI is still relatively unexplored, even though it has the potential to improve BCI performance during both calibration and real-time implementation. Therefore, this review aims to further introduce MABs to the BCI community. The review includes a background on MAB problems and standard solution methods, along with interpretations related to BCI systems. Moreover, it covers state-of-the-art concepts of MAB in BCI and suggestions for future research.
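To make the setting concrete, the following is a minimal, illustrative sketch of one standard MAB solution method (epsilon-greedy) on a Bernoulli bandit; the arm reward probabilities, the epsilon value, and the framing of arms as candidate mental commands are hypothetical choices for illustration, not taken from the review itself.

```python
import random

def epsilon_greedy_bandit(true_means, n_rounds=1000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit: explore with probability epsilon, else exploit."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each arm has been pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0   # Bernoulli reward
        counts[arm] += 1
        # incremental update of the sample-mean estimate for the chosen arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, total_reward

# Hypothetical example: three "arms" (e.g., candidate mental commands)
# with unknown success probabilities; the agent learns which pays off most.
print(epsilon_greedy_bandit([0.3, 0.5, 0.7]))
```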