Monday, March 8, 2010

Reinventing RDF Lists

Last month the SW interest group discussed alternatives to containers and collections as part of a discussion around what the next generation of RDF might look like. Below is my opinion on the matter.

RDF's simplistic approach makes it possible to encode most data structures, both simple and complex. The challenge people have with RDF, coming from other Web formats, is the lack of basic ordered collections (a concept common in XML). In RDF you are forced into a linked list structure just to preserve resource order. The linked list structure known as rdf:List is difficult to work with and highly ineffective within modern RDF stores.

Most RDF formats provide syntactic sugar to make it easier to write rdf:Lists. In turtle this is done using round brackets (parentheses); in RDF/XML this is done using the parseType collection attribute. However, because rdf:List is not a fundamental concept in RDF, no RDF store implementation preserves them, instead opting to use the fundamental triple form -- a linked list.

RDF is made of the following fundamental concepts: URI, Literal, and Blank Node. A fundamental list concept should be added to make it easier and more efficient to work with ordered collections. This would not have a significant effect on RDF formats, as their syntax would not change, but would have a significant impact on the mindset of RDF implementers.

With this change RDF implementers would strive to ensure that lists are implemented efficiently and provide convenient operations on them, just as they would other fundamental RDF concepts. The triple (linked list) form should be kept for compatibility with RDF systems that don't preserve lists, but the goal would be that RDF systems would not be obligated to provide a triple linked list form that has proven to be ineffective.

By making lists a fundamental RDF concept, there is no required change for RDF libraries to continue to be compatible with existing standards. Most libraries and systems may already understand list short hand and some may also preserve it.

Reblog this post [with Zemanta]

3 comments:

  1. I like your thinking on this but I would also want way to assign the list a URI. If I remember correctly the Turtle shorthand provides no way to assign a URI and so it generates a blank node to represent the list (and sub-lists).

    ReplyDelete
  2. The advantage of this is that it doesn't require any changes to the serialization of RDF, but it does require changes to SPARQL.

    ReplyDelete
  3. The advantage of this is that it doesn't require any changes to the serialization of RDF, but it does require changes to SPARQL.

    ReplyDelete