In-context learning (ICL) suffers from oversensitivity to the prompt, making it unreliable in real-world scenarios. We study the sensitivity of ICL with respect to multiple perturbation types. First, we find that label bias obscures true sensitivity, and therefore prior work may have significantly underestimated ICL sensitivity. Second, we observe a strong negative correlation between ICL sensitivity and accuracy: predictions that are sensitive to perturbations are less likely to be correct. Motivated by these findings, we propose \textsc{SenSel}, a few-shot selective prediction method that abstains from sensitive predictions. Experiments on ten classification datasets show that \textsc{SenSel} consistently outperforms two commonly used confidence-based and entropy-based baselines at abstention decisions.