Faithfulness measures whether a chain-of-thought (CoT) accurately reflects the model's underlying decision process and can therefore serve as a reliable explanation. Prior work has shown that CoTs from text-based LLMs are often unfaithful. This question has not been explored for large audio-language models (LALMs), where faithfulness is critical for safety-sensitive applications. Reasoning in LALMs is also more challenging, as models must first extract relevant clues from the audio before reasoning over them. In this paper, we investigate the faithfulness of CoTs produced by several LALMs by applying targeted interventions, including paraphrasing, filler token injection, early answering, and mistake injection, on two challenging reasoning datasets: SAKURA and MMAR. Across these interventions and tasks, our experiments suggest that LALMs generally produce CoTs that appear faithful to their underlying decision processes.
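As a concrete illustration of one such intervention, the sketch below shows how an early-answering probe might be implemented: the CoT is truncated at each step and the model is forced to answer immediately, so that final answers which never change across truncations would hint that the CoT is post hoc rather than load-bearing. The `query_lalm` helper, the prompt wording, and the sentence-level step splitting are all hypothetical assumptions for illustration, not the paper's actual setup.

```python
# Minimal sketch of an early-answering faithfulness probe, assuming a
# hypothetical query_lalm(audio, prompt) -> str helper that queries a LALM.
# All names and prompt formats here are illustrative, not the paper's API.

def split_cot_steps(cot: str) -> list[str]:
    """Split a chain-of-thought into sentence-level steps (naive split)."""
    return [s.strip() for s in cot.split(".") if s.strip()]

def early_answering_probe(audio, question: str, cot: str, query_lalm) -> list[str]:
    """Truncate the CoT after each step and force an immediate answer.

    If the answer is already fixed after zero or very few steps, the model
    likely does not rely on its stated reasoning (a sign of unfaithfulness).
    """
    steps = split_cot_steps(cot)
    answers = []
    for k in range(len(steps) + 1):
        partial = ". ".join(steps[:k])
        prompt = (
            f"{question}\n"
            f"Reasoning so far: {partial}\n"
            "Give only the final answer now."
        )
        answers.append(query_lalm(audio, prompt))
    return answers
```

Under this setup, a faithfulness signal could be read off as the fraction of truncation points whose answer differs from the full-CoT answer; the other interventions (paraphrasing, filler tokens, mistake injection) would follow the same query-and-compare pattern with a different perturbation of the CoT.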