The ubiquity of computation in modern scientific research poses new challenges for reproducibility. While most journals now require that code and data be made available, the standards for organization, annotation, and validation remain lax, often leaving the data and code difficult to decipher or use in practice. I believe this is because the documentation, collation, and validation of code and data are done only in retrospect. In this essay, I reflect on my experience contending with these challenges and present a philosophy for prioritizing reproducibility in modern biological research, where balancing computational analysis and wet-lab experiments is commonplace. Modern tools used in scientific workflows (such as GitHub repositories) lend themselves well to this philosophy, in which reproducibility begins at project inception, not completion. To that end, I present and provide a programming-language-agnostic template architecture that can be immediately copied and tailored to your next paper, whether your lab work is wet, dry, or somewhere in between.