The Alexandria system under development at IBM Research provides an extensible framework and platform for supporting a variety of big-data analytics and visualizations. The system is currently focused on enabling rapid exploration of text-based social media data. It provides tools to help construct "domain models" (i.e., families of keywords and extractors that enable focus on tweets and other social media documents relevant to a project), to rapidly extract and segment the relevant social media and its authors, to apply further analytics (such as finding trends and anomalous terms), and to visualize the results. The system architecture is centered on a variety of REST-based service APIs that enable flexible orchestration of the system capabilities; these are especially useful for supporting knowledge-worker-driven iterative exploration of social phenomena. The architecture also enables rapid integration of Alexandria capabilities with other social media analytics systems, as has been demonstrated through an integration with IBM Research's SystemG. This paper describes a prototypical usage scenario for Alexandria, along with the architecture and key underlying analytics.
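To make the notion of a domain model concrete, the following is a minimal sketch, assuming a domain model can be approximated as named families of keywords that are matched against document text. The names `DomainModel`, `match`, and `select_relevant`, as well as the document fields `author` and `text`, are hypothetical and are not taken from Alexandria itself; the actual system also uses extractors and exposes its capabilities through REST services, neither of which is modeled here.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class DomainModel:
    """Hypothetical stand-in for a domain model: named families of keywords."""

    families: dict[str, set[str]] = field(default_factory=dict)

    def match(self, text: str) -> set[str]:
        """Return the names of all keyword families with at least one hit in text."""
        # Lowercase and split into simple word-like tokens (assumption: exact
        # token matching is enough for illustration; no stemming or phrases).
        tokens = set(re.findall(r"[a-z0-9#@']+", text.lower()))
        return {name for name, keywords in self.families.items() if tokens & keywords}


def select_relevant(model: DomainModel, docs: list[dict]) -> list[dict]:
    """Keep only documents that hit some family, annotated with the families hit."""
    selected = []
    for doc in docs:
        hits = model.match(doc["text"])
        if hits:
            selected.append({**doc, "families": sorted(hits)})
    return selected


# Illustrative data only; not drawn from the paper.
model = DomainModel(
    families={
        "transit": {"subway", "bus", "train"},
        "delay": {"late", "delayed", "stuck"},
    }
)
docs = [
    {"author": "a1", "text": "Subway delayed again this morning"},
    {"author": "a2", "text": "Great coffee today"},
]
relevant = select_relevant(model, docs)
```

Here `relevant` contains only the first document, tagged with the `delay` and `transit` families. Segmenting authors or computing trends would then operate on this reduced set rather than on the full stream.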