This paper systematically explores how Large Language Models (LLMs) generate explanations of code examples of the type used in intro-to-programming courses. As we show, the nature of code explanations generated by LLMs varies considerably based on the wording of the prompt, the target code examples being explained, the programming language, the temperature parameter, and the version of the LLM. Nevertheless, they are consistent in two major respects for Java and Python: the readability level, which hovers around 7-8 grade, and lexical density, i.e., the relative size of the meaningful words with respect to the total explanation size. Furthermore, the explanations score very high in correctness but less on three other metrics: completeness, conciseness, and contextualization.
翻译:暂无翻译