The volume of sensor data generated by a wide variety of sensors and devices is increasing rapidly. Data semantics facilitate information exchange, adaptability, and interoperability among sensors and devices. Sensor data and their meaning can be described using ontologies, e.g., the Semantic Sensor Network (SSN) Ontology. However, once semantically enriched, sensor data become substantially larger than the raw sensor data. Moreover, sensors can observe the same measurement value several times, which produces a huge number of repeated facts about sensor data. We propose a compact, or factorized, representation of semantic sensor data in which repeated measurement values are described only once. These compact representations improve both the storage and the processing of semantic sensor data. To scale up to large datasets, factorization-based tabular representations are used to store and manage factorized semantic sensor data with Big Data technologies. We empirically study the effectiveness of the proposed compact representations of semantic sensor data and their impact on query processing. Additionally, we evaluate the effects of storing the proposed representations on diverse RDF implementations. Results suggest that the proposed compact representations improve the storage and query processing of sensor data over diverse RDF implementations, and that query execution time can be reduced by up to two orders of magnitude.
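To illustrate the idea of describing repeated measurement values only once, the following is a minimal sketch and not the representation defined in the paper. It assumes triples are plain Python tuples; the predicate names (`observedBy`, `observedProperty`, `hasValue`, `hasUnit`, `hasMeasurement`) and the measurement identifiers `m0`, `m1` are hypothetical and are not terms of the SSN Ontology.

```python
def expand(observations):
    """Non-factorized form: every observation repeats its property, value, and unit."""
    triples = []
    for obs_id, sensor, prop, value, unit in observations:
        triples.append((obs_id, "observedBy", sensor))
        triples.append((obs_id, "observedProperty", prop))
        triples.append((obs_id, "hasValue", value))
        triples.append((obs_id, "hasUnit", unit))
    return triples


def factorize(observations):
    """Factorized form: each distinct (property, value, unit) is described once
    by a shared measurement node, and observations only link to that node."""
    measurement_ids = {}  # (property, value, unit) -> hypothetical node id
    triples = []
    for obs_id, sensor, prop, value, unit in observations:
        key = (prop, value, unit)
        if key not in measurement_ids:
            m_id = f"m{len(measurement_ids)}"
            measurement_ids[key] = m_id
            triples.append((m_id, "observedProperty", prop))
            triples.append((m_id, "hasValue", value))
            triples.append((m_id, "hasUnit", unit))
        triples.append((obs_id, "observedBy", sensor))
        triples.append((obs_id, "hasMeasurement", measurement_ids[key]))
    return triples


# Illustrative data only: three of the four observations share one measurement.
observations = [
    ("o1", "s1", "temperature", 21.5, "Celsius"),
    ("o2", "s1", "temperature", 21.5, "Celsius"),
    ("o3", "s2", "temperature", 21.5, "Celsius"),
    ("o4", "s2", "temperature", 22.0, "Celsius"),
]

expanded = expand(observations)
factorized = factorize(observations)
print(len(expanded), len(factorized))
```

In this toy setting the saving grows with the number of observations that share a measurement: each extra repeat costs two triples in the factorized form instead of four.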