Guest Talk: The Swiss-Army Knife of Semantic Data Compression - RDF Dictionaries

Dr. Miguel A. Martínez-Prieto 

Date/Time: 07.05.2018, 16:00 

Location: D2.2.094


RDF is more than a metadata data model for web resources: it is a publication philosophy at Web scale that has been followed by many data providers from several fields of knowledge. The success of RDF is not in dispute, but managing voluminous amounts of semantic data has brought different challenges, such as encoding these big RDF datasets and building efficient indexes for SPARQL query resolution. Big Semantic Data compression has emerged as an active area of research and has proposed different approaches to tackle these challenges. Despite their differences, these approaches share a common component: the RDF dictionary. This data structure is like a Swiss Army knife and plays different roles in practical scenarios. On the one hand, it (self-)indexes RDF terms (verbose URIs and large literals), allowing much symbolic redundancy to be removed while also enabling RDF terms to be searched efficiently. On the other hand, it provides an effective mapping between RDF terms and integer IDs, allowing the original RDF graph to be compacted into an ID-based representation. This talk will present the basics of RDF compression and will delve into some details of how RDF dictionaries are organized and encoded, and the types of queries that they can efficiently resolve in different use cases.
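To make the term-to-ID mapping concrete, here is a minimal Python sketch. It is an illustration only, not the data structure presented in the talk: the class name TermDictionary, the helper encode_graph, the example.org URIs, and the single shared ID space are all assumptions, and a plain hash map stands in for the compressed, self-indexed dictionaries the abstract refers to.

```python
class TermDictionary:
    """Toy two-way mapping between RDF terms (strings) and integer IDs."""

    def __init__(self):
        self._id_of = {}    # term -> ID
        self._term_of = []  # position (ID - 1) -> term; IDs start at 1

    def encode(self, term):
        """Return the ID of a term, assigning the next free ID on first sight."""
        if term not in self._id_of:
            self._term_of.append(term)
            self._id_of[term] = len(self._term_of)
        return self._id_of[term]

    def decode(self, term_id):
        """Return the term stored under an ID."""
        return self._term_of[term_id - 1]


def encode_graph(triples):
    """Replace every term of every triple by its integer ID."""
    dictionary = TermDictionary()
    id_triples = [tuple(dictionary.encode(t) for t in triple) for triple in triples]
    return dictionary, id_triples


# Hypothetical three-triple graph; each verbose URI is stored once in the dictionary.
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows", "http://example.org/alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", '"Alice"'),
]
dictionary, id_triples = encode_graph(triples)
print(id_triples)  # [(1, 2, 3), (3, 2, 1), (1, 4, 5)]
```

Each repeated URI appears once in the dictionary and as a small integer everywhere else, which is the redundancy removal described above. Decoding every ID with dictionary.decode recovers the original triples.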


Miguel A. Martínez-Prieto is an assistant professor in the Department of Computer Science at the University of Valladolid (Segovia, Spain). He completed his Ph.D. in Computer Science at the same university in 2010 and held a postdoctoral fellowship at the University of Chile (2010-2012). His main contributions are related to data compression and its application to the efficient encoding and querying of big datasets. He has published more than 60 articles in this area, most notably his work on compressed RDF management and string dictionaries. His current research is focused on Big Data analytics for air traffic management.

