Modern software evolves continuously, which often leaves documentation, tutorials, and examples out of sync with changing interfaces and frameworks. Relying on outdated documentation and examples can cause programs to fail, run less efficiently, or even become less secure. As a result, programmers regularly turn to other web resources, such as StackOverflow, for examples to guide them in writing software. We recognize that this inconvenient, error-prone, and expensive process can be improved by applying machine learning to software usage data. In this paper, we present a practical system that applies machine learning to large-scale telemetry data and documentation corpora to generate appropriate and complex examples that can be used to improve documentation. We discuss both feature-based and transformer-based machine learning approaches and demonstrate that our system achieves 100% coverage of the functionalities used in the product, provides up-to-date examples with every release, and reduces the number of PRs submitted by software owners writing and editing documentation by more than 68%. We also share valuable lessons learned during the 3 years that our production-quality system has been deployed for the Azure Cloud Command Line Interface (Azure CLI).