...

We don't want to give a full introduction to the Semantic Web and its technologies here, as such introductions can easily be found in many places on the web. Nevertheless, we want to give a short glossary of the terms used most often in this context to make the following documentation more readable.

Semantic Web

The term "Semantic Web" refers to the part of the Internet containing Linked Data. Just like the World Wide Web, the Semantic Web is also woven together by links among the data.

Linked Data

Linked Open Data

Data in RDF that follow the Linked Data Principles are called Linked Data. The Linked Data Principles describe the expected behavior of data publishers, who shall ensure that the published data are easy to find, easy to retrieve, can be linked easily and link to other data as well.

Linked Open Data is Linked Data published under an open license. There is no technical difference between Linked Data and Linked Open Data (often abbreviated as LOD); it is only a question of the license used to publish it.

RDF
RDF/XML
Turtle
N-Triples
N3-Notation

RDF is an acronym for Resource Description Framework, a metadata model. Don't think of RDF as a format: it is a model. Nevertheless, there are different formats to serialize data following RDF. RDF/XML, Turtle, N-Triples and N3-Notation are probably the best-known formats for serializing RDF data. While RDF/XML is XML-based, Turtle, N-Triples and N3-Notation are not, which makes them easier for humans to read and write. When we use RDF in DSpace configuration files, we currently prefer Turtle (but the code should be able to deal with any serialization).

Triple Store

A triple store is a database that natively stores data following the RDF model. Just as you have to provide a relational database for DSpace, you have to provide a triple store for DSpace if you want to use the LOD support.

SPARQL

The SPARQL Protocol and RDF Query Language is a family of protocols used to query triple stores. Since version 1.1, SPARQL can also be used to manipulate triple stores: to store, delete and update data. DSpace uses the SPARQL 1.1 Graph Store HTTP Protocol and the SPARQL 1.1 Query Language to communicate with the triple store. The SPARQL 1.1 Query Language is often referred to simply as SPARQL, so expect the SPARQL 1.1 Query Language whenever no other specific protocol of the SPARQL family is named explicitly.

SPARQL endpoint

A SPARQL endpoint is a SPARQL interface of a triple store. Since SPARQL 1.1, a SPARQL endpoint can be either read-only, allowing clients only to query the stored data, or readable and writable, allowing clients to modify the stored data as well. When talking about a SPARQL endpoint without specifying which SPARQL protocol is used, an endpoint supporting the SPARQL 1.1 Query Language is meant.
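
To make the glossary a little more concrete, here is a minimal Turtle example; the resource URIs and the title are made up purely for illustration:

    @prefix dcterms: <http://purl.org/dc/terms/> .

    # two triples describing a hypothetical item
    <http://example.org/item/123> dcterms:title "An example item" ;
        dcterms:isPartOf <http://example.org/collection/1> .

A SPARQL 1.1 query that retrieves all titles from such data could look like the following sketch; it would be sent to a SPARQL endpoint as described above:

    PREFIX dcterms: <http://purl.org/dc/terms/>

    SELECT ?item ?title
    WHERE { ?item dcterms:title ?title }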

Linked (Open) Data Support within DSpace

...

Info

Use Apache mod_proxy, mod_rewrite or any other appropriate web server/proxy to make localhost:3030/dspace/sparql accessible from the internet. Use the address under which it is accessible as the address of your public SPARQL endpoint (see the property rdf.public.sparql.endpoint in the configuration reference below).
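
A minimal sketch of how this could be done with Apache httpd and mod_proxy, assuming mod_proxy and mod_proxy_http are enabled and that the endpoint should appear under /sparql on your public host (adjust names and paths to your own setup):

    # forward the public path /sparql to the local read-only Fuseki endpoint
    ProxyPass        /sparql http://localhost:3030/dspace/sparql
    ProxyPassReverse /sparql http://localhost:3030/dspace/sparql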

 


The configuration provided with DSpace stores the files of the triple store under [dspace-install]/triplestore. Using this configuration, Fuseki provides three SPARQL endpoints: two read-only endpoints and one that can be used to change the data of the triple store. Do not use this configuration if you let Fuseki connect to the internet directly, as it would allow anyone to delete, change or add information in the triple store. The option --localhost tells Fuseki to listen only on the loopback device. You can use Apache mod_proxy or any other web or proxy server to make the read-only SPARQL endpoint accessible from the internet. With the configuration described, Fuseki listens on port 3030 using HTTP. Using the address http://localhost:3030/ you can connect to the Fuseki Web UI. http://localhost:3030/dspace/data addresses a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint, and http://localhost:3030/dspace/get a read-only one. Under http://localhost:3030/dspace/sparql a read-only SPARQL 1.1 Query Language endpoint can be found. The first of these endpoints must not be accessible from the internet, while the last one should be publicly accessible.
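
As a quick sanity check that the read-only query endpoint is answering, you can send it a SPARQL query, for example with curl; the endpoint address assumes the default configuration described above:

    curl -s -G http://localhost:3030/dspace/sparql \
         -H "Accept: application/sparql-results+json" \
         --data-urlencode "query=SELECT * WHERE { ?s ?p ?o } LIMIT 5"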

Default configuration and what you should change

First, you'll want to ensure the Linked Data endpoint is enabled and configured. In your local.cfg, add rdf.enabled = true. You can optionally change its path by setting rdf.path (it defaults to "rdf", which means the Linked Data endpoint is available at [dspace.server.url]/rdf/, where dspace.server.url is also specified in your local.cfg).
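
For example, the relevant lines in local.cfg might look like this (the rdf.path line is optional and only shown to make the default explicit):

    # enable the Linked Data (RDF) endpoint
    rdf.enabled = true
    # optional: path of the endpoint below dspace.server.url (default: rdf)
    rdf.path = rdf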

In the file [dspace]/config/dspace.cfg you should look for the property event.dispatcher.default.consumers and add rdf there. Adding rdf there makes DSpace update the triple store automatically as the publicly available content of the repository changes.
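
A sketch of the resulting dspace.cfg line; the other consumer names are only an example of what may already be configured, so keep whatever your installation lists and simply append rdf:

    # event consumers; "rdf" keeps the triple store in sync with repository changes
    event.dispatcher.default.consumers = versioning, discovery, eperson, rdf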

As the Linked Data support of DSpace is highly configurable, this section gives a short list of things you probably want to configure before using it. Below you can find more information on everything that can be configured.

In the file [dspace-source]/dspace/config/modules/rdf.cfg you want to configure the address of the public SPARQL endpoint and the address of the writable endpoint DSpace uses to connect to the triple store (the properties rdf.public.sparql.endpoint and rdf.storage.graphstore.endpoint). In the same file you want to configure the URL that addresses the dspace-rdf module, which depends on where you deployed it (property rdf.contextPath), and switch content negotiation on (set property rdf.contentNegotiation.enable = true).
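
A sketch of these rdf.cfg properties; the host names are placeholders for your own deployment:

    rdf.contentNegotiation.enable = true
    rdf.contextPath = ${dspace.baseUrl}/rdf
    rdf.public.sparql.endpoint = http://sparql.example.org/sparql
    rdf.storage.graphstore.endpoint = http://localhost:3030/dspace/data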

In the file [dspace-source]/dspace/config/modules/rdf/constant-data-general.ttl you should change the links to the Web UI of the repository and to the publicly readable SPARQL endpoint. The URL of the public SPARQL endpoint should point to a URL that a webserver proxies to the triple store. See the section Install a Triple Store above for further information.
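
A hypothetical fragment of constant-data-general.ttl; the predicates foaf:homepage and void:sparqlEndpoint are only meant to illustrate how such links can be expressed in Turtle, and the file shipped with DSpace may use other vocabulary:

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix void: <http://rdfs.org/ns/void#> .

    # <> is a placeholder subject standing for the resource being described
    <> foaf:homepage <https://repository.example.org/> ;
       void:sparqlEndpoint <https://repository.example.org/sparql> .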

In the file [dspace-source]/dspace/config/modules/rdf/constant-data-site.ttl you may add any triples that should be added to the description of the repository itself.
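
For example, a hypothetical addition to constant-data-site.ttl could attach a title to the repository description; dcterms:title is just one possible predicate:

    @prefix dcterms: <http://purl.org/dc/terms/> .

    # placeholder subject; the triple is added to the description of the repository
    <> dcterms:title "Example Institutional Repository" .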

If you want to change the way the metadata fields are converted, take a look into the file [dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl. This is also the place to add information on how to map metadata fields that you added to DSpace. There is already a reasonable default configuration for the metadata fields that DSpace supports out of the box. If you want to use specific prefixes in RDF serializations that support prefixes, you have to edit [dspace-source]/dspace/config/modules/rdf/metadata-prefixes.ttl.
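
Prefix definitions in metadata-prefixes.ttl are ordinary Turtle @prefix declarations; a hypothetical example (these namespaces are common choices, not necessarily the shipped defaults):

    @prefix dc:      <http://purl.org/dc/elements/1.1/> .
    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix foaf:    <http://xmlns.com/foaf/0.1/> .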

...

[dspace-source]/dspace/config/modules/rdf.cfg

Property: rdf.enabled
Example Value: rdf.enabled = true
Informational Note: Defines whether the RDF endpoint is enabled or disabled (disabled by default). If enabled, the RDF endpoint is available at ${dspace.server.url}/${rdf.path}. Changing this value requires restarting your servlet container (e.g. Tomcat).
Property: rdf.path
Example Value: rdf.path = rdf
Informational Note: Defines the path of the RDF endpoint, if enabled. For example, a value of "rdf" (the default) means the RDF interface/endpoint is available at ${dspace.server.url}/rdf (e.g. if "dspace.server.url = http://localhost:8080/server", it would be available at "http://localhost:8080/server/rdf"). Changing this value requires restarting your servlet container (e.g. Tomcat).
Property: rdf.contentNegotiation.enable
Example Value: rdf.contentNegotiation.enable = true
Informational Note: Defines whether content negotiation should be activated. Set this to true if you use the Linked Data support.
Property: rdf.contextPath
Example Value: rdf.contextPath = ${dspace.baseUrl}/rdf
Informational Note: Content negotiation needs to know where to refer clients that ask for RDF serializations of content stored within DSpace. This property sets the URL under which the dspace-rdf module can be reached on the internet (depending on how you deployed it).
Property: rdf.public.sparql.endpoint
Example Value: rdf.public.sparql.endpoint = http://${dspace.baseUrl}/sparql
Informational Note: Address of the read-only public SPARQL endpoint supporting the SPARQL 1.1 Query Language.
Property: rdf.storage.graphstore.endpoint
Example Value: rdf.storage.graphstore.endpoint = http://localhost:3030/dspace/data
Informational Note: Address of a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint. This address is used to create, update and delete converted data in the triple store. If you use Fuseki with the configuration provided as part of DSpace, you can leave this as it is. If you use another triple store or configure Fuseki on your own, change this property to point to a writable SPARQL endpoint supporting the SPARQL 1.1 Graph Store HTTP Protocol.
Property: rdf.storage.graphstore.authentication
Example Value: rdf.storage.graphstore.authentication = no
Informational Note: Defines whether to use HTTP Basic authentication to connect to the writable SPARQL 1.1 Graph Store HTTP Protocol endpoint.
Properties:
rdf.storage.graphstore.login
rdf.storage.graphstore.password
Example Values:
rdf.storage.graphstore.login = dspace
rdf.storage.graphstore.password = ecapsd
Informational Note: Credentials for the HTTP Basic authentication, if it is necessary to connect to the writable SPARQL 1.1 Graph Store HTTP Protocol endpoint.
Property: rdf.storage.sparql.endpoint
Example Value: rdf.storage.sparql.endpoint = http://localhost:3030/dspace/sparql
Informational Note: Besides a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint, DSpace needs a SPARQL 1.1 Query Language endpoint, which can be read-only. This property allows you to set the address used to connect to such a SPARQL endpoint. If you leave this property empty, the property ${rdf.public.sparql.endpoint} will be used instead.
Properties:
rdf.storage.sparql.authentication
rdf.storage.sparql.login
rdf.storage.sparql.password
Example Values:
rdf.storage.sparql.authentication = yes
rdf.storage.sparql.login = dspace
rdf.storage.sparql.password = ecapsd
Informational Note: As for the SPARQL 1.1 Graph Store HTTP Protocol endpoint, you can configure DSpace to use HTTP Basic authentication to authenticate against the (read-only) SPARQL 1.1 Query Language endpoint.
Property: rdf.converter.DSOtypes
Example Value: rdf.converter.DSOtypes = SITE, COMMUNITY, COLLECTION, ITEM
Informational Note: Defines which kinds of DSpaceObjects should be converted. Bundles and Bitstreams are converted as part of the Item they belong to. Don't add EPersons here unless you really know what you are doing. All converted data is stored in the triple store, which provides a publicly readable SPARQL endpoint, so all data converted into RDF is exposed publicly. Every DSO type you add here must have an HTTP URI to be referenced in the generated RDF, which is another reason not to add EPersons here currently.
The following properties configure the StaticDSOConverterPlugin.
Properties:
rdf.constant.data.GENERAL
rdf.constant.data.COLLECTION
rdf.constant.data.COMMUNITY
rdf.constant.data.ITEM
rdf.constant.data.SITE
Example Values:
rdf.constant.data.GENERAL = ${dspace.dir}/config/modules/rdf/constant-data-general.ttl
rdf.constant.data.COLLECTION = ${dspace.dir}/config/modules/rdf/constant-data-collection.ttl
rdf.constant.data.COMMUNITY = ${dspace.dir}/config/modules/rdf/constant-data-community.ttl
rdf.constant.data.ITEM = ${dspace.dir}/config/modules/rdf/constant-data-item.ttl
rdf.constant.data.SITE = ${dspace.dir}/config/modules/rdf/constant-data-site.ttl
Informational Note: These properties define files to read static data from. The data should be in RDF; by default Turtle is used as the serialization. The data in the file referenced by the property ${rdf.constant.data.GENERAL} will be included in every entity that is converted to RDF. For example, it can be used to point to the address of the publicly readable SPARQL endpoint or to name the institution running DSpace.

The other properties define files that will be included if a DSpace Object of the specified type (collection, community, item or site) is converted. This makes it possible to add static content to every item, every collection, and so on.

The following properties configure the MetadataConverterPlugin.
Property: rdf.metadata.mappings
Example Value: rdf.metadata.mappings = ${dspace.dir}/config/modules/rdf/metadata-rdf-mapping.ttl
Informational Note: Defines the file that contains the mappings for the MetadataConverterPlugin. See below for the description of the configuration file [dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl.
Property: rdf.metadata.schema
Example Value: rdf.metadata.schema = file://${dspace.dir}/config/modules/rdf/metadata-rdf-schema.ttl
Informational Note: Configures the URL used to load the RDF Schema of the DSpace Metadata RDF Mapping Vocabulary. Using a file:// URI makes it possible to convert DSpace content without an internet connection. The version of the schema has to match the one expected by the code; DSpace 5.0 uses version 0.2.0. This schema can also be found at http://digital-repositories.org/ontologies/dspace-metadata-mapping/0.2.0. The newest version of the schema can be found at http://digital-repositories.org/ontologies/dspace-metadata-mapping/.
Property: rdf.metadata.prefixes
Example Value: rdf.metadata.prefixes = ${dspace.dir}/config/modules/rdf/metadata-prefixes.ttl
Informational Note: If you want to use prefixes in RDF serializations that support prefixes, you can define these prefixes in the file referenced by this property.
The following properties configure the SimpleDSORelationsConverterPlugin.
Property: rdf.simplerelations.prefixes
Example Value: rdf.simplerelations.prefixes = ${dspace.dir}/config/modules/rdf/simple-relations-prefixes.ttl
Informational Note: If you want to use prefixes in RDF serializations that support prefixes, you can define these prefixes in the file referenced by this property.
Property: rdf.simplerelations.site2community
Example Value: rdf.simplerelations.site2community = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasCommunity
Informational Note: Defines the predicates used to link from the data representing the whole repository to the top-level communities. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.community2site
Example Value: rdf.simplerelations.community2site = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfRepository
Informational Note: Defines the predicates used to link from the top-level communities to the data representing the whole repository. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.community2subcommunity
Example Value: rdf.simplerelations.community2subcommunity = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasSubcommunity
Informational Note: Defines the predicates used to link from communities to their subcommunities. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.subcommunity2community
Example Value: rdf.simplerelations.subcommunity2community = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isSubcommunityOf
Informational Note: Defines the predicates used to link from subcommunities to the communities they belong to. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.community2collection
Example Value: rdf.simplerelations.community2collection = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasCollection
Informational Note: Defines the predicates used to link from communities to their collections. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.collection2community
Example Value: rdf.simplerelations.collection2community = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfCommunity
Informational Note: Defines the predicates used to link from collections to the communities they belong to. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.collection2item
Example Value: rdf.simplerelations.collection2item = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasItem
Informational Note: Defines the predicates used to link from collections to their items. Defining multiple predicates separated by commas will result in multiple triples (see the example after this property list).
Property: rdf.simplerelations.item2collection
Example Value: rdf.simplerelations.item2collection = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfCollection
Informational Note: Defines the predicates used to link from items to the collections they belong to. Defining multiple predicates separated by commas will result in multiple triples.
Property: rdf.simplerelations.item2bitstream
Example Value: rdf.simplerelations.item2bitstream = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasBitstream
Informational Note: Defines the predicates used to link from items to their bitstreams. Defining multiple predicates separated by commas will result in multiple triples.
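
To illustrate the comma-separated predicate lists used by the rdf.simplerelations.* properties above: with the default rdf.simplerelations.collection2item value, each item of a collection is referenced by two triples, one per configured predicate. A sketch of the resulting Turtle (the resource URIs are made up):

    <http://repository.example.org/resource/collection/1>
        <http://purl.org/dc/terms/hasPart> <http://repository.example.org/resource/item/123> ;
        <http://digital-repositories.org/ontologies/dspace/0.1.0#hasItem> <http://repository.example.org/resource/item/123> .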

[dspace-source]/dspace/config/modules/rdf/constant-data-*.ttl

...