Current text visualization techniques typically provide overviews of document content and structure using intrinsic properties such as term frequencies, co-occurrences, and sentence structures. Such visualizations lack conceptual overviews that incorporate domain-relevant knowledge, which are needed when examining documents such as research articles or technical reports. To address this shortcoming, we present ConceptScope, a technique that uses a domain ontology to represent the conceptual relationships in a document as a Bubble Treemap visualization. Multiple coordinated views of document structure and concept hierarchy, together with text overviews, further aid document analysis. ConceptScope supports both the exploration of single documents and the comparison of multiple documents. We demonstrate ConceptScope by visualizing research articles and transcripts of technical presentations in computer science. In a comparative study with DocuBurst, a popular document visualization tool, ConceptScope was found to be more informative for exploring and comparing domain-specific documents, but less so for documents that spanned multiple disciplines.