Programming The Web: Sesame

Sunday, January 15, 2012

Blob Store

In release 2.0-beta14 (I know, this is a late beta release), AliBaba introduced a new BLOB store. The BLOB store integrates with the RDF repository (ObjectRepository) to synchronize transactions. This allows both the BLOB store and the RDF store to be isolated and always consistent with one another. This is done using two-phase commit transactions in the BLOB store.
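To make the two-phase commit idea concrete, here is a minimal sketch of a coordinator that commits two stores together or not at all. Every class and method name in it is hypothetical and chosen for illustration; it is not how AliBaba or Sesame implement this internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy two-phase commit coordinator. All names are hypothetical;
// this is not AliBaba's or Sesame's actual implementation.
class TwoPhaseCommitSketch {

    /** A resource that votes on a transaction and then finishes it. */
    interface Participant {
        boolean prepare();  // phase 1: vote; true means "ready to commit"
        void commit();      // phase 2: make the changes visible
        void rollback();    // phase 2 alternative: discard the changes
    }

    /** Commits every participant, or none of them. */
    static boolean commitAll(List<Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {
                // A single "no" vote aborts the whole transaction.
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    /** In-memory participant that records what happened to it. */
    static class Recording implements Participant {
        final String name;
        final boolean vote;
        final List<String> log;

        Recording(String name, boolean vote, List<String> log) {
            this.name = name;
            this.vote = vote;
            this.log = log;
        }

        public boolean prepare() { log.add(name + ":prepare"); return vote; }
        public void commit() { log.add(name + ":commit"); }
        public void rollback() { log.add(name + ":rollback"); }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<String>();
        List<Participant> stores = Arrays.<Participant>asList(
                new Recording("blob", true, log),
                new Recording("rdf", true, log));
        System.out.println(commitAll(stores) + " " + log);
    }
}
```

If either store votes no during the prepare phase, both are rolled back, which is what keeps the two stores consistent with one another.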

The BLOB store also has a few other advantages over a traditional file system. First, every change is isolated until it is closed/committed. This prevents other readers from seeing an incomplete BLOB and helps prevent inconsistency between the BLOB and RDF stores. In addition, as disk space is generally considered cheap, all past versions of BLOBs are kept on disk by default. This allows any previous version to be retrieved (and restored) using the API.
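The two properties described above (writes stay invisible until the stream is closed, and old versions are never overwritten) can be illustrated with a small in-memory toy. This is only a sketch under my own assumptions: the class and its methods are hypothetical, and the real BLOB store keeps its versions on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy in-memory blob. Hypothetical names; not the AliBaba BLOB store.
class VersionedBlobSketch {
    // Every committed version, oldest first; nothing is overwritten.
    private final List<byte[]> versions = new ArrayList<byte[]>();

    /** Returns a stream whose bytes become visible only on close(). */
    OutputStream openOutputStream() {
        return new ByteArrayOutputStream() {
            private boolean closed;

            @Override
            public void close() throws IOException {
                if (!closed) {
                    closed = true;
                    publish(toByteArray());
                }
            }
        };
    }

    private synchronized void publish(byte[] content) {
        versions.add(content);
    }

    /** Latest committed content, or null if nothing was committed yet. */
    synchronized byte[] read() {
        return versions.isEmpty() ? null : versions.get(versions.size() - 1);
    }

    /** Content of an earlier version (0 is the oldest). */
    synchronized byte[] read(int version) {
        return versions.get(version);
    }

    synchronized int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) throws IOException {
        VersionedBlobSketch blob = new VersionedBlobSketch();
        OutputStream out = blob.openOutputStream();
        out.write("draft".getBytes(StandardCharsets.UTF_8));
        // The write is still open, so readers see no content.
        System.out.println(blob.read() == null);
        out.close();
        System.out.println(new String(blob.read(), StandardCharsets.UTF_8));
    }
}
```

Writing a second time adds a new version rather than replacing the first, so `read(0)` still returns the original bytes.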

The BLOB store API is fairly simple. Here is what some code might look like using the BLOB store.

BlobStoreFactory factory = BlobStoreFactory.newInstance();
BlobStore store = factory.openBlobStore(new File("."));
String key = "http://example.com/store1/key1";
BlobObject blob = store.open(key);

OutputStream out = blob.openOutputStream();
try {
    // write stream to out
} finally {
    out.close();
}

InputStream in = blob.openInputStream();
try {
    // read stream from in
} finally {
    in.close();
}

More API options can be seen in the JavaDocs:

http://www.openrdf.org/doc/alibaba/2.0-beta14/apidocs/org/openrdf/store/blob/package-summary.html

Friday, July 31, 2009

SPARQL Federation and Quints

There are currently a few popular ways to federate SPARQL endpoints together:

1) In Jena the service must be explicitly part of the query, and therefore the model,

2) In Sesame the basic query patterns must be associated with one or more endpoints before evaluating the query, or

3) Hack the remote query into a graph URI: http://gearon.blogspot.com/2009/05/federated-queries-long-time-ago-tks.html

Although the first two can be used to achieve the same results, Jena's solution puts more responsibility in the data model, and Sesame's puts more responsibility in the deployment. Both have their trade-offs, but I believe the query is supposed to be abstracted away from underlying services. The domain model (and therefore the queries) should not be aware of how the data is distributed (or stored) across a network. Therefore, I prefer to describe which graph patterns and relationships are available at each endpoint during deployment and make the application model independent of available service endpoints.
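As a rough illustration of the difference, the sketch below holds one query that names its endpoint in a SERVICE clause and one that names no endpoint at all, with the mapping from predicate to endpoint registered separately at deployment time. The registry class, its methods, and the endpoint URL are all made up for this example; it is not the Sesame federation API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry; the endpoint URL is made up and this is
// not the Sesame federation API.
class EndpointRegistrySketch {
    // predicate URI -> endpoints known to hold statements with that predicate
    private final Map<String, List<String>> endpointsByPredicate =
            new LinkedHashMap<String, List<String>>();

    /** Called at deployment time, once per (predicate, endpoint) pair. */
    void register(String predicate, String endpoint) {
        List<String> endpoints = endpointsByPredicate.get(predicate);
        if (endpoints == null) {
            endpoints = new ArrayList<String>();
            endpointsByPredicate.put(predicate, endpoints);
        }
        endpoints.add(endpoint);
    }

    /** Endpoints that should receive a pattern using this predicate. */
    List<String> endpointsFor(String predicate) {
        List<String> endpoints = endpointsByPredicate.get(predicate);
        return endpoints == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(endpoints);
    }

    public static void main(String[] args) {
        // Style 1: the endpoint is written into the query itself.
        String explicit =
                "SELECT ?name WHERE {\n"
              + "  SERVICE <http://people.example.org/sparql> {\n"
              + "    ?person <http://xmlns.com/foaf/0.1/name> ?name\n"
              + "  }\n"
              + "}";

        // Style 2: the query names no endpoint; the deployment does.
        String plain =
                "SELECT ?name WHERE {\n"
              + "  ?person <http://xmlns.com/foaf/0.1/name> ?name\n"
              + "}";

        EndpointRegistrySketch registry = new EndpointRegistrySketch();
        registry.register("http://xmlns.com/foaf/0.1/name",
                          "http://people.example.org/sparql");

        System.out.println(explicit);
        System.out.println(plain);
        System.out.println(
                registry.endpointsFor("http://xmlns.com/foaf/0.1/name"));
    }
}
```

With the second style, moving the data to a different endpoint means changing one registration and leaving every query untouched.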

Furthermore, I think it is a bit silly to add yet another level of complexity to the basic query pattern. Adding the service level turns the basic query pattern from a quad to a quint.

To fully index a quint (with support for a service variable, which Jena does not support) would take 13 indexes (nearly double what a quad requires). Below is a table of some complexity levels and how many indexes they require to be fully indexed (variables could appear in any position within the pattern). I have included a theoretical sext that would allow you to group services in a network (just as graphs can be grouped in a service).

Level	Indexes	Term	Data Structure
double	2	subject	directed graph
triple	3	predicate	labelled directed graph
quad	7	graph	multiple labelled directed graphs
quint	13	service	replicated multiple labelled directed graphs
sext	25	network	trusted replicated multiple labelled directed graphs

Switching from triples to quads provides a big functionality leap (the ability to refer to an entire graph as a single resource). However, I question how much functionality a quint (or a sext) has over a quad. Couldn't the same functionality be put into a property of the graph (or embedded in the graph's URI authority)? An inferencing engine/query could also infer graph relationships (such as subGraphOf), which would still allow a large, but precise, collection of graphs to be queried more effectively.
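Here is a minimal sketch of the "property of the graph" alternative, assuming a made-up hostedBy predicate and plain strings for terms. The service is recorded as an ordinary statement about the graph, so statements stay quads and no fifth position is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical vocabulary and data; a sketch of the idea only.
class GraphPropertySketch {
    static final String HOSTED_BY = "http://example.org/vocab#hostedBy";

    /** A statement with its graph: subject, predicate, object, graph. */
    static class Quad {
        final String s, p, o, g;

        Quad(String s, String p, String o, String g) {
            this.s = s;
            this.p = p;
            this.o = o;
            this.g = g;
        }
    }

    /** Graphs that are the subject of a (graph, hostedBy, service) statement. */
    static Set<String> graphsHostedBy(List<Quad> quads, String service) {
        Set<String> graphs = new HashSet<String>();
        for (Quad q : quads) {
            if (q.p.equals(HOSTED_BY) && q.o.equals(service)) {
                graphs.add(q.s);
            }
        }
        return graphs;
    }

    /** Quads whose graph is hosted by the given service. */
    static List<Quad> quadsAt(List<Quad> quads, String service) {
        Set<String> graphs = graphsHostedBy(quads, service);
        List<Quad> result = new ArrayList<Quad>();
        for (Quad q : quads) {
            if (graphs.contains(q.g)) {
                result.add(q);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Quad> quads = Arrays.asList(
                new Quad("urn:alice", "urn:name", "Alice", "urn:g1"),
                new Quad("urn:bob", "urn:name", "Bob", "urn:g2"),
                // The service is just a property of each graph.
                new Quad("urn:g1", HOSTED_BY, "urn:serviceA", "urn:meta"),
                new Quad("urn:g2", HOSTED_BY, "urn:serviceB", "urn:meta"));
        System.out.println(quadsAt(quads, "urn:serviceA").size());
    }
}
```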

Hopefully, this topic will have more time to mature before the SPARQL working group makes any official decisions on the matter.

Tuesday, February 24, 2009

Sesame 3-alpha1

The first preview of the new Sesame API is now available. Here is an article explaining the new features: http://www.devx.com/semantic/Article/40987

Friday, October 24, 2008

Sesame 2.2.1 Released

This marks the first stable release of Mulgara's Sesame interface, creating a unified API to access many specialized RDF stores.

Other RDF stores that support the Sesame API include:
OWLIM
Virtuoso
BigData
AllegroGraph

For more information about Sesame see:
http://www.openrdf.org/
