Thursday, February 26, 2009

RDF Transaction Isolation

Transaction Isolation in relational databases (for better or worse) is well established. However, the issue of transaction isolation is rarely documented in RDF stores.

The ANSI SQL isolation definitions are UPDATE (write) oriented and do not capture the general use case of RDF, which has no notion of UPDATE. For example, the first ANSI SQL phenomenon, dirty-write, is not even applicable to RDF transaction. Another phenomenon, non-repeatable reads, is defined by records retrieved by a SELECT statement. However, RDF queries (unlike SQL) are pattern based and the results don't have a direct relationship to any internal data record.

Relational database isolation mechanisms do not perform nearly as well when INSERT/DELETE operations are used instead of UPDATE. Furthermore, relational databases often have a lax definition of "serializable", allowing conflicting INSERT operations (assuming that preventing conflicting UPDATE operations is sufficient).

RDF is a different beast altogether. RDF is set oriented. Two RDF transactions adding or removing the same statement do not necessarily conflict with each other, as they would in SQL, because a successful add or remove operation in RDF does not require a state change.

Early RDF use cases required full serializable transaction as many of the inferencing rules used in RDF needed to take the complete store state into account. Because of this, RDF stores generally only provide full serializable transactions. However, full serializable transactions often do not perform as well as lower isolation levels.

RDF stores are now being used in environments that have a much greater, real-time demand, for fast concurrent write operations. These environments don't require full serialization, but currently lack any other isolation levels to choose from.

To address this need Sesame 3.0 introduces five isolation levels that will allow RDF stores to vary the isolation level provided. By providing different levels, significant performance improvements can be made for lower isolation levels. For example:

• Read Committed isolation level permits weak-consistency and allows proxies to cache repeated results without validation.
• Snapshot isolation level permits eventual-consistency and allows store clusters to maintain independent state and propagate the changes during idle periods.
• Serializable isolation provides a higher degree of isolation, but does not require atomic consistency, permitting concurrent transactions.

For more details on the isolation levels supported by Sesame 3.0 see:
http://wiki.aduna-software.org/confluence/display/SESDOC/TransactionIsolation

What variations of transaction isolation have you used in your application?

Reblog this post [with Zemanta]

Tuesday, February 24, 2009

Sesame 3-alpha1

The first preview of the new Sesame API is now available. Here is an article explaining the new features: http://www.devx.com/semantic/Article/40987

Reblog this post [with Zemanta]

Thursday, February 12, 2009

XHTML: What is it Good For?

With IE8 nearly upon us, discussion on the future of web standards has once again been triggered. Again lacking in IE8 is support for XHTML, ensuring that IE is the only browser that doesn't support it.

The goal of XHTML is to allow XML technologies to be used with HTML. XHTML has been a standard now for about five years, and IE is still (single handedly?) preventing websites from adopting it. Instead websites are forced to use server side solutions that require more bandwidth and processing, while making it harder for non-traditional agents to participate.

Within recent discussions on XHTML many people seem to fail to understand the potential benefit XHTML has over HTML. An interesting example I came across was combining XSLT and XHTML together.

I have written about XSLT before
http://jamesrdf.blogspot.com/2008/12/mvc-with-xslt.html

This example is from 2002 (around the time XSLT and XHTML were standardized) and can be found in more detail here:
http://www.webreference.com/xml/resources/books/practicalxml/chapter5/

As most of us know, XHTML is also an XML file and as such can be used as the input or output of XSLT. What many do not realize (and what is not covered in the above link) is that XHTML can also be used as the stylesheet. XSLT supports a simplified syntax that allows XSLT tags to be embedded inside an XHTML template file, making the template look a lot more like other server-side templating engines.
http://www.w3.org/TR/xslt#result-element-stylesheet

This allows you to use XHTML for the template and the content, and it works in all browsers, except *one* of them. Actually, you can get this to work in IE (even as early as IE5), but you have to use the XML rendering mode.

The XML rendering mode requires that the pages return application/xml and no doctypes present. Unfortunately by using the XML rendering mode, no HTML-specific features are available: no cookies, no document.write and script tags are parsed differently.

TV Series.com is rendered in XML mode, for example: http://www.tvseries.com/

If IE would get around to implementing XHTML, I think a lot more websites could safely switch to serving static files and the Web would start to become a lot easier to work with. But that probably wouldn't be good for Silverlight.

Reblog this post [with Zemanta]

Monday, February 2, 2009

Elmo to Get a Face Lift

In 2008 Elmo's interface saw some adjustments to enable more efficient access across HTTP. The challenge with this new interface is to understand how the object operations map to RDF operations and what implications they have on performance over http.

Despite the changes, EntityManager-oriented interface is still inappropriate for RDF/Object mapping, because relational operations do not directly map to RDF operations. Further, object persistence abstraction causes more performance problems then it solves for seasoned developers. Unlike object-relation mapping, RDF-object mapping is much more natural.

Work has already begun on Elmo's successor, AliBaba Object Repository, providing an hybrid RDF/Object interface extension to Sesame. Performance over HTTP is a high priority and was the major inspiration to the new interface.

Early development has already shown significant performance increases for both read heavy and write heavy transactions. For those interested in getting involved, you can checkout the code at
http://repo.aduna-software.org/svn/org.openrdf/alibaba/trunk/