With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have made considerable progress both in constructing BMs and in applying them in many fields. However, there is currently no work that surveys the overall progress of BMs and guides follow-up research. In this paper, we cover not only BM technologies themselves but also the prerequisites for BM training and the applications built on BMs, dividing the review into four parts: Resource, Models, Key Technologies and Application. Across those four parts we introduce 16 specific BM-related topics: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability, Commonsense Reasoning, Reliability & Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. For each topic, we summarize the current studies and propose future research directions. We conclude the paper with a more general outlook on the further development of BMs.