Large Language Models (LLMs) have demonstrated effectiveness as zero-shot time series (TS) forecasters. The key challenge lies in tokenizing TS data into textual representations that align with LLMs' pre-trained knowledge. While existing work often relies on fine-tuning specialized modules to bridge this gap, a distinct, yet challenging, paradigm aims to leverage truly off-the-shelf LLMs without any fine-tuning, relying solely on strategic tokenization of numerical sequences. The performance of these fully frozen models is acutely sensitive to the textual representation of the input data, as their parameters cannot adapt to distribution shifts. In this paper, we introduce a simple yet highly effective strategy to overcome this brittleness: injecting noise into the raw time series before tokenization. This non-invasive intervention acts as a form of inference-time augmentation, compelling the frozen LLM to extrapolate from robust underlying temporal patterns rather than superficial numerical artifacts. We theoretically analyze this phenomenon and empirically validate its effectiveness across diverse benchmarks. Notably, to fully eliminate potential biases from data contamination during LLM pre-training, we introduce two novel TS datasets that fall outside the pre-training corpora of all evaluated LLMs, and consistently observe improved performance. This study takes a further step toward directly leveraging off-the-shelf LLMs for time series forecasting.
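The pipeline sketched in the abstract, perturbing the raw series before serializing it into an LLM prompt, can be illustrated as follows. This is a minimal sketch under assumed details: the Gaussian noise model, the `noise_scale` parameter (noise standard deviation as a fraction of the series' standard deviation), and the fixed-precision comma-separated serialization are illustrative choices, not the paper's exact specification.

```python
import numpy as np

def tokenize_series(series, precision=2, sep=", "):
    # Render each value as fixed-precision text, an assumed
    # serialization in the spirit of digit-based TS prompting.
    return sep.join(f"{x:.{precision}f}" for x in series)

def noisy_prompt(series, noise_scale=0.05, seed=0):
    # Inference-time augmentation: perturb the raw series with
    # Gaussian noise scaled to its standard deviation, then
    # serialize the perturbed values for the frozen LLM.
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    noise = rng.normal(0.0, noise_scale * series.std(), size=series.shape)
    return tokenize_series(series + noise)
```

Setting `noise_scale=0` recovers the standard noise-free prompt, so the intervention is a strict generalization of plain tokenization.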