Training and hyperparameter optimization (HPO) of deep learning-based AI models are often compute-intensive and call for large-scale distributed resources as well as scalable, resource-efficient hyperparameter search algorithms. This work studies the potential of using model performance prediction to aid the HPO process carried out on High Performance Computing systems. In addition, a quantum annealer is used to train the performance predictor, and a method is proposed to overcome some of the problems arising from the current limitations of quantum systems and to increase the stability of solutions. This yields results on a quantum machine comparable to those obtained on a classical machine, showing how quantum computers could be integrated into classical machine learning tuning pipelines. Furthermore, results are presented from the development of a containerized benchmark based on an AI model for collision event reconstruction, which allows us to compare and assess the suitability of different hardware accelerators for training deep neural networks.
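To make the idea of performance prediction in HPO concrete, the following is a minimal sketch under stated assumptions, not the method used in this work. The abstract does not specify the predictor's form or inputs, so the sketch substitutes a one-feature least-squares fit trained classically (no quantum annealer is involved). The function `toy_learning_curve`, the constants `EARLY_EPOCH` and `FULL_EPOCHS`, and the learning-rate search range are all hypothetical stand-ins for a real training run. The predictor maps the validation loss after a few epochs to the final loss, so that only the most promising configurations need to be trained to completion.

```python
import random

EARLY_EPOCH = 5    # epochs observed before predicting (assumed value)
FULL_EPOCHS = 50   # length of a complete training run (assumed value)


def toy_learning_curve(lr, epochs, rng):
    """Hypothetical stand-in for real training: a noisy, decaying validation loss."""
    floor = 0.1 + abs(lr - 0.01) * 20.0  # toy optimum near lr = 0.01
    return [floor + 1.0 / (1 + epoch) + rng.gauss(0, 0.01) for epoch in range(epochs)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def sample_lr(rng):
    """Draw a learning rate log-uniformly from [1e-3, 1e-1] (assumed range)."""
    return 10 ** rng.uniform(-3, -1)


def prune_with_predictor(n_history=20, n_candidates=30, keep=3, seed=0):
    rng = random.Random(seed)

    # 1) History: trials trained to completion provide (early loss, final loss) pairs.
    history = [toy_learning_curve(sample_lr(rng), FULL_EPOCHS, rng)
               for _ in range(n_history)]
    early = [curve[EARLY_EPOCH - 1] for curve in history]
    final = [curve[-1] for curve in history]
    slope, intercept = fit_line(early, final)

    # 2) New candidates: train for EARLY_EPOCH epochs only, then predict the final loss.
    scored = []
    for _ in range(n_candidates):
        lr = sample_lr(rng)
        partial = toy_learning_curve(lr, EARLY_EPOCH, rng)
        predicted_final = slope * partial[-1] + intercept
        scored.append((predicted_final, lr))

    # 3) Keep the candidates with the lowest predicted final loss.
    scored.sort()
    return (slope, intercept), scored[:keep]


(slope, intercept), survivors = prune_with_predictor()
for predicted, lr in survivors:
    print(f"lr={lr:.4f} predicted final loss={predicted:.3f}")
```

In this toy setting each pruned candidate costs 5 epochs instead of 50. Whether such savings carry over to a real workload depends on how well early performance correlates with final performance, which the sketch simply assumes.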