Reasoning under uncertainty is a key challenge in AI, especially in real-world tasks where sparse data demands systematic generalisation. Existing approaches struggle to balance accuracy and simplicity when evaluating multiple candidate solutions. We propose a Solomonoff-inspired method that weights LLM-generated hypotheses by simplicity and predictive fit. Applied to benchmark (Mini-ARC) tasks, our method produces Solomonoff-weighted mixtures for per-cell predictions, yielding conservative, uncertainty-aware outputs even when hypotheses are noisy or partially incorrect. Compared to Bayesian Model Averaging (BMA), Solomonoff scoring spreads probability more evenly across competing hypotheses, whereas BMA concentrates weight on the most likely but potentially flawed candidates. Across tasks, these results highlight the value of algorithmic information-theoretic priors for interpretable, reliable multi-hypothesis reasoning under uncertainty.
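The weighting scheme described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes each hypothesis carries a description length in bits (the simplicity term) and a likelihood-style fit score on the training cells, then mixes per-cell predictions by the normalised weights. All field names (`complexity`, `fit`, `predict`) are illustrative assumptions.

```python
def solomonoff_weights(hypotheses):
    """Solomonoff-inspired weight: 2^(-complexity) prior times
    predictive fit, normalised over the hypothesis set.
    (Field names are illustrative, not from the paper.)"""
    raw = [2.0 ** (-h["complexity"]) * h["fit"] for h in hypotheses]
    total = sum(raw)
    return [r / total for r in raw]

def mix_cell_predictions(hypotheses, weights, cell):
    """Weighted mixture over per-cell outputs: returns a
    probability distribution over candidate cell values."""
    dist = {}
    for h, w in zip(hypotheses, weights):
        v = h["predict"](cell)
        dist[v] = dist.get(v, 0.0) + w
    return dist

# Toy example: two candidate grid transformations for one cell.
hyps = [
    {"complexity": 10, "fit": 0.9, "predict": lambda c: 1},  # simple, good fit
    {"complexity": 25, "fit": 1.0, "predict": lambda c: 2},  # complex, perfect fit
]
w = solomonoff_weights(hyps)
dist = mix_cell_predictions(hyps, w, cell=(0, 0))
```

In this toy setting the simpler hypothesis dominates despite its slightly worse fit, while the mixture still assigns nonzero probability to the competing value — the conservative, uncertainty-aware behaviour the abstract contrasts with BMA's winner-take-most weighting.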