Despite recent advances in machine learning, state-of-the-art systems lack robustness to "real world" events: the input distributions and tasks a deployed system encounters are not limited to the original training context, so the system must adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research and assessed independently, without regard to the impact of each capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development: both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.