Efficient querying of multidimensional RDF data with aggregates: Comparing NoSQL, RDF and relational data stores


Institution:	1. DISI – University of Bologna, Bologna, Italy; 2. CINI, Rome, Italy; 3. ESSI – Universitat Politècnica de Catalunya, Barcelona, Spain

Abstract:	This paper proposes an approach to the problem of querying large volumes of statistical RDF data. Our approach relies on pre-aggregation strategies to better manage the analysis of this kind of data. Specifically, we define a conceptual model that represents the original RDF data, together with aggregates, in a multidimensional structure. We then propose a set of translation rules for converting a well-known multidimensional RDF modelling vocabulary into this conceptual model. We implement the conceptual model in six different data stores: two RDF triple stores (Jena TDB and Virtuoso), one graph-oriented NoSQL database (Neo4j), one column-oriented data store (Cassandra), and two relational databases (MySQL and PostgreSQL). We compare the querying performance of these data stores with and without aggregates. Experimental results on real-world datasets containing 81.92 million triples show that pre-aggregation reduces query runtime in all data stores. With aggregates, the Neo4j NoSQL database and the relational databases outperform the triple stores, reducing query runtime by up to 99%.
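
The pre-aggregation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's conceptual model or implementation: the observations, the city-to-country hierarchy, and all function names are hypothetical, and an in-memory dictionary stands in for whichever data store would hold the aggregates. It only shows why a stored aggregate can answer a roll-up query without scanning the detailed data.

```python
from collections import defaultdict

# Hypothetical observations: (city, year, measure value). Not data from the paper.
OBSERVATIONS = [
    ("Bologna", 2010, 5),
    ("Rome", 2010, 7),
    ("Barcelona", 2010, 4),
    ("Bologna", 2011, 6),
    ("Barcelona", 2011, 3),
]

# Hypothetical dimension hierarchy: city -> country.
CITY_TO_COUNTRY = {"Bologna": "Italy", "Rome": "Italy", "Barcelona": "Spain"}


def sum_on_the_fly(country, year):
    """Answer the query by scanning every detailed observation."""
    return sum(
        value
        for city, obs_year, value in OBSERVATIONS
        if CITY_TO_COUNTRY[city] == country and obs_year == year
    )


def build_aggregates():
    """Pre-compute SUM(value) per (country, year) once, ahead of query time."""
    totals = defaultdict(int)
    for city, year, value in OBSERVATIONS:
        totals[(CITY_TO_COUNTRY[city], year)] += value
    return dict(totals)


# Built once; in a real system this would be persisted alongside the raw data.
AGGREGATES = build_aggregates()


def sum_from_aggregates(country, year):
    """Answer the same query with a single lookup in the stored aggregates."""
    return AGGREGATES.get((country, year), 0)
```

Both functions return the same totals; the second trades extra storage and build time for a lookup whose cost does not grow with the number of detailed observations.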

Keywords:	Statistical RDF data; Graph aggregation; NoSQL; Data analytics
This article is indexed in ScienceDirect and other databases.