
Guest Talk: The Swiss-Army Knife of Semantic Data Compression - RDF Dictionaries

Dr. Miguel A. Martínez-Prieto 

Date/Time: 07.05.2018, 16:00 

Location: D2.2.094 

Abstract 

RDF is more than a metadata data model for web resources: it is a publication philosophy at Web scale that has been followed by many data providers from various fields of knowledge. RDF's success is not in dispute, but managing voluminous amounts of semantic data has brought different challenges, such as encoding these big RDF datasets and building efficient indexes for SPARQL query resolution. Big Semantic Data compression has emerged as an active area of research and has proposed different approaches to tackle these challenges. Despite their differences, these approaches share a common component: the RDF dictionary. This data structure is like a Swiss Army knife and plays different roles in practical scenarios. On the one hand, it (self-)indexes RDF terms (verbose URIs and large literals), allowing much symbolic redundancy to be removed while also enabling RDF terms to be searched efficiently. On the other hand, it provides an effective mapping between RDF terms and integer IDs, allowing the original RDF graph to be compacted into an ID-based representation. This talk will present the basics of RDF compression and will delve into some details of how RDF dictionaries are organized and encoded, and of the types of queries that they can efficiently resolve in different use cases.
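
The two roles described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the class `RDFDictionary`, its method names, and the sample triples are all hypothetical, and the terms are stored as plain strings in a sorted list with no compression at all. It only shows the term-to-ID mapping, the ID-based form of the graph, and a simple prefix search over the terms.

```python
from bisect import bisect_left


class RDFDictionary:
    """Toy dictionary (hypothetical): maps each distinct RDF term to an integer ID and back."""

    def __init__(self, triples):
        # Sort the distinct terms so that IDs follow lexicographic order;
        # this also lets prefix search start from a binary search.
        self.terms = sorted({term for triple in triples for term in triple})
        self.ids = {term: i + 1 for i, term in enumerate(self.terms)}  # IDs start at 1

    def locate(self, term):
        """Return the integer ID of an RDF term."""
        return self.ids[term]

    def extract(self, term_id):
        """Return the RDF term for an integer ID."""
        return self.terms[term_id - 1]

    def encode(self, triples):
        """Rewrite string triples as ID triples."""
        return [tuple(self.locate(t) for t in triple) for triple in triples]

    def decode(self, id_triples):
        """Rewrite ID triples back into string triples."""
        return [tuple(self.extract(i) for i in triple) for triple in id_triples]

    def prefix_search(self, prefix):
        """Return all terms that start with the given prefix, in sorted order."""
        start = bisect_left(self.terms, prefix)
        matches = []
        for term in self.terms[start:]:
            if not term.startswith(prefix):
                break
            matches.append(term)
        return matches


# Sample data (made up for illustration only).
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name", '"Bob"'),
]

dictionary = RDFDictionary(triples)
encoded = dictionary.encode(triples)
print(encoded)
print(dictionary.prefix_search("http://example.org/"))
```

Each long URI is stored once in the dictionary, while the graph itself becomes a list of small integer triples. A practical dictionary would additionally compress the sorted strings, which this sketch leaves out.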

Bio 

Miguel A. Martínez-Prieto is an assistant professor in the Department of Computer Science at the University of Valladolid (Segovia, Spain). He completed his Ph.D. in Computer Science at the same university in 2010 and held a postdoctoral fellowship at the University of Chile (2010-2012). His main contributions are related to data compression and its application to the efficient encoding and querying of big datasets. He has published more than 60 articles in this area, notably on compressed RDF management and string dictionaries. His current research is focused on Big Data analytics for air traffic management.


