In Changjun Fan et al. [Nature Communications https://doi.org/10.1038/s41467-023-36363-w (2023)], the authors present a deep reinforcement learning approach to augment combinatorial optimization heuristics. In particular, they present results for several spin-glass ground-state problems (instances of which on non-planar networks are generally NP-hard) in comparison with several Monte Carlo based methods, such as simulated annealing (SA) or parallel tempering (PT). Those results indeed demonstrate that reinforcement learning improves on the results obtained with SA or PT, or at least reduces the runtime the heuristics require before reaching results of comparable quality. To support the conclusion that their method is ``superior'', the authors pursue two basic strategies: (1) the commercial GUROBI solver is called on to procure a sample of exact ground states as a testbed for comparison, and (2) a head-to-head comparison between the heuristics is given for a sample of larger instances where exact ground states are hard to ascertain. Here, we put these studies into a larger context, showing that the claimed superiority is at best marginal for the smaller samples and becomes essentially irrelevant with respect to any sensible approximation of true ground states in the larger samples. For example, the method is inadequate as a means to determine stiffness exponents $\theta$ in $d>2$, an application the authors mention, since there the problem is not only NP-hard but also requires the subtraction of two almost equal ground-state energies, so that systematic errors of $\approx 1\%$ in each, as found here, are unacceptable. This larger picture of the method emerges from a straightforward study of finite-size corrections for the spin-glass ensembles the authors employ, using data that have been available for decades.
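To illustrate the scales involved in such a determination of $\theta$, consider a rough order-of-magnitude sketch (assuming, purely for orientation and not taken from the commented paper, a typical literature value $\theta \approx 0.2$ in $d=3$): in the standard domain-wall setup, the signal is the defect energy $\Delta E(L) = |E_{\rm P}(L) - E_{\rm AP}(L)| \sim L^{\theta}$ obtained by subtracting ground-state energies for periodic and antiperiodic boundary conditions, while each of those energies is extensive, $|E(L)| \sim L^{d}$. Their ratio, $\Delta E/|E| \sim L^{\theta-d}$, then falls below $0.2\%$ already for $L=10$, i.e., below a systematic error of $\approx 1\%$ in each energy, which would therefore have to cancel almost perfectly in the subtraction for the extracted defect energy to retain any meaning.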