Best Practices in Statistical Computing

The world is becoming increasingly complex, both in the rich sources of data we have access to and in the statistical and computational methods we can apply to those data. These factors create an ever-increasing risk of errors in our code and of findings that are sensitive to data preparation and to the execution of complex statistical and computing methods. The consequences of coding and data mistakes can be substantial. Openness (e.g., providing others with data and code) and transparency (e.g., requiring that data processing and code follow standards) are two key solutions to help alleviate concerns about replicability and errors. In this paper, we describe the key steps for implementing a code quality assurance (QA) process that researchers can follow to improve their coding practices throughout a project and so assure the quality of the final data, code, analyses, and ultimately the results. These steps include: (i) adherence to principles for code writing and style that follow best practices; (ii) clear written documentation that describes code, workflow, and key analytic decisions; (iii) careful version control; (iv) good data management; and (v) regular testing and review. Following all these steps will greatly improve the ability of a study to ensure that results are accurate and reproducible. The responsibility for code QA falls not only on individual researchers but also on institutions, journals, and funding agencies.
