...

dspace-rdf is an extension for DSpace that adds capabilities to convert contents stored in DSpace into RDF, to store the converted data in a Triple Store, and to provide it as serializations of RDF. The Triple Store must support SPARQL 1.1 and can be used to provide the converted data over a read-only SPARQL endpoint. dspace-rdf can currently be found in my GitHub repository, but I would be glad to contribute it to a future version of DSpace.
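To illustrate what a client of the read-only SPARQL endpoint might do, here is a minimal sketch that builds a "query via GET" request as defined by the SPARQL 1.1 Protocol. The endpoint URL and the function name build_query_request are assumptions made for this example; they are not part of dspace-rdf. The request is only constructed and never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; replace it with the address of your Triple Store.
ENDPOINT = "http://localhost:3030/dspace/sparql"

# A generic query that lists ten arbitrary triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_query_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol 'query via GET' request (not sent here)."""
    # The protocol passes the query text in the 'query' URL parameter.
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # Ask for SELECT results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_query_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it, provided a Triple Store is actually listening at the assumed address.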

 

dspace-rdf is realized as a new module of DSpace because it contains a webapp, and everyone should be able to decide whether to deploy it. dspace-rdf consists of several parts. You can see a simplified class diagram here:

...

The files [dspace-install]/config/modules/rdf/metadata-*.ttl configure the MetadataConverterPlugin. This is the plugin that converts an Item's metadata. The file metadata-prefixes.ttl can be used to specify prefixes (or namespaces) to be used in serializations of RDF that support such a mechanism (e.g. Turtle or RDF/XML). The file metadata-rdf-schema.ttl is not a configuration file. It contains the schema describing the vocabulary used for the configuration of the MetadataConverterPlugin.

The file metadata-rdf-mapping.ttl contains several mappings between metadata fields and the triples that should be created to represent the specified metadata field. It already contains some examples, which should help until it is better documented. The triples that should be created while converting a metadata field are specified using RDF reification. You can specify a regular expression to create a triple only for metadata fields whose values match that regular expression. Two more regular expressions can be used to manipulate field values before they are used to generate Literals or URIs. The file metadata-rdf-mapping.ttl already contains examples of how to generate URIs and Literals from metadata field values.

When metadata-rdf-mapping.ttl is loaded, simple inferencing is used to detect most of the types of its entities, so you do not need to type every resource. Typing (by using statements with the property 'rdf:type' or the Turtle shortcut 'a') is necessary for ResourceGenerators and LiteralGenerators only, as there is no way to distinguish them using inferencing.
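The interplay of the regular expressions described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of the MetadataConverterPlugin: the function name generate_value and the parameter names condition, matcher and replacement are hypothetical, and whether dspace-rdf requires the condition to match the whole value (as assumed here) or only part of it is not stated in this document.

```python
import re
from typing import Optional


def generate_value(field_value: str,
                   condition: Optional[str],
                   matcher: str,
                   replacement: str) -> Optional[str]:
    """Return the text for a generated URI or Literal, or None for no triple."""
    # Step 1: the condition decides whether a triple is created at all.
    # Assumption: the whole field value has to match the condition.
    if condition is not None and re.fullmatch(condition, field_value) is None:
        return None
    # Step 2: matcher and replacement rewrite the value before it is used.
    return re.sub(matcher, replacement, field_value)


# A DOI stored as "doi:10.1000/182" is turned into a resolvable URI.
print(generate_value("doi:10.1000/182", r"doi:10\..+", r"^doi:", "http://dx.doi.org/"))

# A value that does not satisfy the condition yields no triple.
print(generate_value("urn:isbn:123", r"doi:10\..+", r"^doi:", "http://dx.doi.org/"))
```

In this sketch the first call produces a URI that a ResourceGenerator could use as a link, while the second call returns None, meaning that no triple would be created for that value.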

Links are what distinguish Linked Data from simple RDF. In dspace-rdf, links are created with ResourceGenerators, so it is important to use ResourceGenerators and regular expressions to create links.

RDFizer

The RDFizer is a command line interface that administrators can use to convert the complete repository contents, to convert some content specified by its handle, or to delete data from the Triple Store. If the Triple Store is reachable, the RDFConsumer converts data at the moment it is changed within DSpace, so the Triple Store should stay synchronized with the repository. You can get the online help by executing the following command:

...

If you add dspace-rdf to an existing DSpace instance, you should run the RDFizer at least once to initially convert the already existing Items, Collections and Communities.

Development / API

 


If you want to use RDF in other parts of DSpace, you should take a look at the class org.dspace.rdf.RDFUtil, as it acts as the interface between dspace-rdf and the other parts of DSpace.

Besides RDFUtil, you can use the classes in the package org.dspace.rdf.providing.negotiation for content negotiation. I have already added content negotiation to JSPUI, but it should be added to XMLUI as well. It would be even better to use content negotiation directly in Persistent Identifier Resolvers that support it. The resolver under dx.doi.org supports content negotiation, but the DOIIdentifierProviders (currently there is one DOIIdentifierProvider for EZID and one for DataCite) must be extended before content negotiation can be used.
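For readers unfamiliar with content negotiation, the following sketch shows the basic idea: the client sends an HTTP Accept header, and the server picks the RDF serialization the client prefers. This is a simplified illustration and not the implementation found in org.dspace.rdf.providing.negotiation. The function name negotiate and the SUPPORTED table are assumptions made for this example, and wildcards such as */* are deliberately ignored.

```python
from typing import Optional

# Registered media types for the serializations named in this document.
SUPPORTED = {
    "application/rdf+xml": "RDF/XML",
    "text/turtle": "Turtle",
    "application/n-triples": "N-Triples",
    "text/n3": "N3",
}


def negotiate(accept_header: str) -> Optional[str]:
    """Return the supported media type with the highest q-value, or None."""
    best_type, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # q defaults to 1 when the client does not state it
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        # On equal q-values the type listed first by the client wins.
        if media_type in SUPPORTED and q > best_q:
            best_type, best_q = media_type, q
    return best_type


print(negotiate("text/html;q=0.9, text/turtle;q=0.8, application/rdf+xml;q=0.5"))
```

A return value of None means the client asked for none of the RDF serializations; a user interface would then typically deliver its regular HTML page instead.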

TODOs

dspace-rdf is realized as a new module of DSpace because it contains a webapp, and everyone should be able to decide whether to deploy it. The webapp is used to provide the data in serializations of RDF (RDF/XML, Turtle, N-Triples and N3).