Repositories and the Semantic Web

Most sites on the Internet are oriented towards human consumption. While HTML may be a good format for creating websites, it is not a good format for exporting data in a form a computer can work with. Like most repository software, DSpace supports OAI-PMH as an interface for exporting the stored data. While OAI-PMH is well known in the field of repositories, it is barely known elsewhere (e.g. Google retired its support for OAI-PMH in 2008). The Semantic Web is an approach to publishing data on the Internet together with information about its semantics. The W3C has released standards such as RDF and SPARQL to help bring data onto the web in a form computers can easily work with. The data stored in repositories are particularly well suited to the Semantic Web because the metadata are already available: they do not have to be generated or entered manually for publication as Linked Data. For most repositories, and certainly for every Open Access repository, sharing the stored content is important. Linked Data is a significant opportunity for repositories to present their content in a way that can easily be accessed, interlinked and used.

EPrints is currently the only repository software I know of that is able to export its content as RDF. Nevertheless, EPrints ignores some important Linked Data conventions, so I would speak of RDF rather than of Linked Data.

The main topics of my thesis (available online for readers of German: http://www.pnjb.de/uni/diplomarbeit/repositorien_und_das_semantic_web.pdf) were how metadata and digital objects stored in repositories can be woven into the Linked (Open) Data Cloud, and which characteristics of repositories have to be considered while doing so. As the main part of my thesis I created a software-independent concept for providing repository contents as Linked Data. I developed a proof-of-concept implementation of this concept as an extension of DSpace. Only a few last steps remain before this implementation can be used in a production environment, and I would be glad if it were added to DSpace as soon as it is ready.

dspace-rdf

dspace-rdf is an extension for DSpace that adds the ability to convert contents stored in DSpace into RDF, to store the converted data in a triple store, and to provide it in RDF serializations. As the triple store must support SPARQL 1.1, it can also be used to provide the converted data over a SPARQL endpoint. dspace-rdf can currently be found in my GitHub repository, but I would be glad to contribute it to a future version of DSpace.
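To give a rough idea of what converting repository metadata into RDF means, here is a minimal sketch in Python. It is not the dspace-rdf implementation and does not use any DSpace API. The function names, the example item URI and the mapping of every metadata field onto the Dublin Core Terms namespace are assumptions made purely for illustration. The sketch turns a flat metadata record into N-Triples, one of the standard RDF serializations, which could then be loaded into a triple store.

```python
from typing import Dict, List

# Dublin Core Terms namespace; mapping every field name onto it is an
# illustrative assumption, not the mapping dspace-rdf actually uses.
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value: str) -> str:
    """Escape a string so it can be written as an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes stay intact
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def item_to_ntriples(item_uri: str, metadata: Dict[str, List[str]]) -> str:
    """Return one N-Triples line per metadata value of the given item."""
    lines = []
    for field, values in sorted(metadata.items()):
        for value in values:
            lines.append(
                f'<{item_uri}> <{DCTERMS}{field}> "{escape_literal(value)}" .'
            )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical item URI and metadata record.
    record = {
        "title": ["Repositories and the Semantic Web"],
        "creator": ["Doe, Jane"],
    }
    print(item_to_ntriples("http://repository.example.org/item/123", record))
```

Each metadata value becomes one triple whose subject is the item's URI. Giving every item a stable, dereferenceable URI is one of the conventions that distinguishes Linked Data from plain RDF.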

dspace-rdf is realized as a new DSpace module because it contains a webapp, and everyone should be able to decide for themselves whether this webapp is deployed.

 
