Biological science produces large amounts of data in a variety of formats, which necessitates the use of computational tools to process, integrate, and analyse the data and to glean insights from it. Researchers who use computational biology tools range from those who use computers primarily for communication and data lookup to those who write complex software in order to analyse data or make it easier for others to do so. This research examines how people differ in the way they conceptualise the same data, for which we coin the term "subjective data models". We interviewed 22 people with biological experience and varied levels of computational experience to elicit their perceptions of the same subset of biological data entities. The results suggest that many people had fluid subjective data models that changed depending on the circumstance or the tool they were using. Surprisingly, results generally did not seem to cluster around participants' levels of computational experience or education, or the lack thereof. We further found that people did not consistently map entities from an abstract data model to the same identifiers in real-world files, and that participants found it easier to infer meaning from some data identifier formats than from others. These findings have two real-world implications: 1) software engineers should design interfaces around task performance and emulate other popular, related user interfaces, rather than targeting a person's professional background; 2) when insufficient context is provided, people may guess what data means, whether or not their guesses are correct, which emphasises the importance of providing contextual metadata when preparing data for re-use by others, to remove the need for potentially erroneous guesswork.