One limitation of large language models (LLMs) is that they lack access to up-to-date, proprietary, or personal data. As a result, there are multiple efforts to extend language models with techniques for accessing external data. In that sense, LLMs share the vision of data integration systems, whose goal is to provide seamless access to a large collection of heterogeneous data sources. While the details and techniques of LLMs differ greatly from those of data integration, this paper shows that some of the lessons learned from research on data integration can elucidate the research path we are pursuing today on language models.