Data analysis tasks rarely occur in isolation. Especially in intelligence analysis scenarios, where different experts contribute knowledge to a shared understanding, team members must communicate how insights develop to establish common ground among collaborators. Using provenance to communicate analytic sensemaking is promising because it describes the interactions and summarizes the steps taken to reach insights. Yet no universal guidelines exist for communicating provenance in different settings. Our work focuses on how provenance information is presented and on the resulting conclusions reached and strategies used by new analysts. In an open-ended, 30-minute textual exploration scenario, we qualitatively compare how adding different types of provenance information (specifically data coverage and interaction history) affects analysts' confidence in the conclusions they develop, propensity to repeat work, filtering of data, identification of relevant information, and typical investigation strategies. We find that data coverage (i.e., what was interacted with) provides provenance information without limiting individual investigation freedom. Interaction history (i.e., when something was interacted with), on the other hand, does not significantly encourage more mimicry, but it does take more time to understand comfortably, as reflected in less confident conclusions and less relevant information-gathering behaviors. Our results contribute empirical data toward understanding how provenance summarizations can influence analysis behaviors.