Recent breakthroughs based on big/foundation models reveal an emerging avenue for AI, that is, \emph{big data, big/foundation models, big learning, $\cdots$}. Following that avenue, here we elaborate on our newly introduced big learning. Specifically, big learning exhaustively exploits the information/tasks inherent in its large-scale \emph{complete/incomplete} training data, by learning to simultaneously model many-to-all joint/conditional/marginal data distributions (hence the name big learning) with one universal foundation model. We reveal that big learning is what existing foundation models are implicitly doing; accordingly, our big learning provides high-level guidance for the flexible design and improvement of foundation models. Besides, big learning ($i$) offers great flexibility in handling complete/incomplete training data and in customizing trustworthy data tasks; ($ii$) potentially delivers all joint/conditional/marginal data capabilities after training; ($iii$) significantly reduces the training-test gap, thereby improving model generalization; and ($iv$) potentially unifies conventional machine learning paradigms and enables their flexible cooperation, manifesting a universal learning paradigm. Preliminary experiments verify the effectiveness of the presented big learning.
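To make the core idea concrete, the following is a minimal sketch of a big-learning objective reconstructed from the description above; the index-subset notation $\mathbb{S}$, $\mathbb{T}$ and the sampling distribution $p(\mathbb{S},\mathbb{T})$ are our illustrative assumptions, not necessarily the paper's exact formulation. Let $\mathbf{x}$ be a data sample with index set $\mathbb{X}$ (e.g., pixels or tokens), and let $\mathbf{x}_{\mathbb{S}}$ collect the dimensions of $\mathbf{x}$ indexed by $\mathbb{S} \subseteq \mathbb{X}$. One universal model $p_{\boldsymbol{\theta}}$ is then trained across many-to-all joint/conditional/marginal tasks via
\begin{equation*}
\max_{\boldsymbol{\theta}} \;
\mathbb{E}_{(\mathbb{S},\mathbb{T}) \sim p(\mathbb{S},\mathbb{T})} \,
\mathbb{E}_{q(\mathbf{x}_{\mathbb{S}},\mathbf{x}_{\mathbb{T}})}
\bigl[ \log p_{\boldsymbol{\theta}}(\mathbf{x}_{\mathbb{T}} \,|\, \mathbf{x}_{\mathbb{S}}) \bigr],
\end{equation*}
where $q$ denotes the underlying data distribution and $(\mathbb{S},\mathbb{T})$ are randomly sampled source/target index subsets: $\mathbb{S}=\emptyset$ with $\mathbb{T}=\mathbb{X}$ recovers joint modeling, $\mathbb{S}=\emptyset$ with $\mathbb{T}\subset\mathbb{X}$ recovers marginal modeling, and a non-empty $\mathbb{S}$ yields conditional modeling; an incomplete sample contributes through whichever $(\mathbb{S},\mathbb{T})$ pairs its observed dimensions support.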