首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Efficient querying of multidimensional RDF data with aggregates: Comparing NoSQL,RDF and relational data stores
Institution:1. DISI – University of Bologna, Bologna, Italy;2. CINI, Rome, Italy;3. ESSI – Universitat Politècnica de Catalunya, Barcelona, Spain
Abstract:This paper proposes an approach to tackle the problem of querying large volume of statistical RDF data. Our approach relies on pre-aggregation strategies to better manage the analysis of this kind of data. Specifically, we define a conceptual model to represent original RDF data with aggregates in a multidimensional structure. A set of translations rules for converting a well-known multidimensional RDF modelling vocabulary into the proposed conceptual model is then proposed. We implement the conceptual model in six different data stores: two RDF triple stores (Jena TDB and Virtuoso), one graph-oriented NoSQL database (Neo4j), one column-oriented data store (Cassandra), and two relational databases (MySQL and PostGreSQL). We compare the querying performance, with and without aggregates, in these data stores. Experimental results, on real-world datasets containing 81.92 million triplets, show that pre-aggregation allows for reducing query runtime in all data stores. Neo4j NoSQL and relational databases with aggregates outperform triple stores speeding up to 99% query runtime.
Keywords:Statistical RDF data  Graph aggregation  NoSQL  Data analytics
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号