Android has become the predominant smartphone operating system, with a rapidly evolving ecosystem that requires app developers to update their apps frequently to maintain quality, security, and compatibility. While deep learning has made significant strides in various software engineering tasks, including automated code updates, existing methods are not tailored to Android apps, and the potential of pre-trained Language Models of Code (CodeLMs) for updating Android app code remains unexplored. In this paper, we present the first comprehensive evaluation of state-of-the-art CodeLMs, including CodeT5, CodeBERT, CodeGPT, and UniXcoder, for recommending code updates in Android applications. To facilitate this evaluation, we curate a unique dataset of paired updated methods from 3,195 Android apps published on Google Play and hosted on GitHub between 2008 and 2022. Our findings demonstrate that pre-trained CodeLMs outperform traditional approaches, achieving 190% to 385% higher accuracy under a realistic time-wise evaluation scenario. Among the CodeLMs, CodeT5 consistently exhibits superior performance across most code update types. Furthermore, we examine the impact of update types, evaluation scenarios, method size, and update size on the performance of CodeLMs, revealing areas for future research to improve temporal adaptability and generalization capabilities.