NIERT: Accurate Numerical Interpolation through Unifying Scattered Data Representations using Transformer Encoder

Numerical interpolation for scattered data aims to estimate the values at target points from the values at a set of observed points. Traditional approaches produce estimates by constructing an interpolation function as a combination of multiple basis functions. These approaches require the basis functions to be explicitly pre-defined, which greatly limits their use in practical scenarios. Recent advances offer an alternative strategy that learns interpolation functions directly from observed points using machine learning techniques such as deep neural networks. This strategy, although promising, cannot effectively exploit the correlations between observed points and target points because it treats the two types of points separately. Here, we present a learning-based approach to numerical interpolation using encoder representations of Transformers (hence the name NIERT). NIERT treats the value of each target point as a masked token, which enables target points and observed points to be processed in a unified fashion. By calculating partial self-attention between target points and observed points at each layer, NIERT can exploit the correlations among these points and, more importantly, avoid unexpected interference of target points on observed points. NIERT also uses pre-training to further improve its accuracy. On three representative datasets, two synthetic and one real-world, NIERT outperforms existing approaches; for example, on the TFRD-ADlet dataset for temperature field reconstruction, NIERT achieves an MAE of $1.897\times 10^{-3}$, substantially better than the transformer-based approach (MAE: $27.074\times 10^{-3}$). These results demonstrate the accuracy of NIERT and its potential for application in multiple practical fields.
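
To make the masked-token and partial self-attention ideas concrete, the following is a minimal NumPy sketch of a single attention head. It is not the authors' implementation. The token layout (coordinates, value, mask flag), the helper names `build_tokens` and `partial_self_attention`, and the choice to restrict attention keys to observed points only are all assumptions made for illustration; the abstract states only that target values are masked and that target points must not interfere with observed points.

```python
import numpy as np

def build_tokens(obs_x, obs_y, tgt_x):
    """Hypothetical token layout: [coordinates, value, is_masked flag].

    Observed points carry their value; target points carry a placeholder
    value of 0 and a flag of 1, playing the role of a masked token.
    """
    obs = np.concatenate(
        [obs_x, obs_y[:, None], np.zeros((len(obs_x), 1))], axis=1)
    tgt = np.concatenate(
        [tgt_x, np.zeros((len(tgt_x), 1)), np.ones((len(tgt_x), 1))], axis=1)
    return np.concatenate([obs, tgt], axis=0)  # observed rows first

def partial_self_attention(x, n_observed, w_q, w_k, w_v):
    """Single-head attention in which only observed points serve as keys.

    Assumption: every query (observed or target) attends to observed
    points only, so target tokens cannot influence observed outputs.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Allow attention to the first n_observed columns (keys) only.
    allowed = np.zeros(scores.shape, dtype=bool)
    allowed[:, :n_observed] = True
    scores = np.where(allowed, scores, -np.inf)
    # Numerically stable softmax over keys; masked keys get weight 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Under these assumptions, the output rows for observed points depend only on the observed tokens, so moving or adding target points leaves them unchanged, while each target row is still a function of all observed points.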
