...
We don't want to give a full introduction to the Semantic Web and its technologies here, as that can easily be found in many places on the web. Nevertheless, we want to give a short glossary of the terms used most often in this context, to make the following documentation more readable.
**Semantic Web**: The term "Semantic Web" refers to the part of the Internet containing Linked Data. Just like the World Wide Web, the Semantic Web is woven together by links among the data.

**Linked Data / Linked Open Data**: Data in RDF that follow the Linked Data Principles are called Linked Data. The Linked Data Principles describe the expected behavior of data publishers, who shall ensure that the published data are easy to find, easy to retrieve, can be linked easily, and link to other data as well. Linked Open Data is Linked Data published under an open license. There is no technical difference between Linked Data and Linked Open Data (often abbreviated as LOD); it is only a question of the license used to publish it.

**RDF, RDF/XML, Turtle, N-Triples, N3-Notation**: RDF is an acronym for Resource Description Framework, a metadata model. Don't think of RDF as a format: it is a model. Nevertheless, there are different formats to serialize data following RDF. RDF/XML, Turtle, N-Triples and N3-Notation are probably the best-known formats for serializing data in RDF. While RDF/XML uses XML, Turtle, N-Triples and N3-Notation do not, and they are easier for humans to read and write. Where we use RDF in DSpace configuration files, we currently prefer Turtle (but the code should be able to deal with any serialization).

**Triple Store**: A triple store is a database for natively storing data following the RDF model. Just as you have to provide a relational database for DSpace, you have to provide a triple store for DSpace if you want to use the LOD support.

**SPARQL**: The SPARQL Protocol and RDF Query Language is a family of protocols for querying triple stores. Since version 1.1, SPARQL can be used to manipulate triple stores as well: to store, delete and update data. DSpace uses the SPARQL 1.1 Graph Store HTTP Protocol and the SPARQL 1.1 Query Language to communicate with the triple store. The SPARQL 1.1 Query Language is often referred to simply as SPARQL, so expect the SPARQL 1.1 Query Language whenever no other specific protocol out of the SPARQL family is explicitly specified.

**SPARQL endpoint**: A SPARQL endpoint is a SPARQL interface of a triple store. Since SPARQL 1.1, a SPARQL endpoint can be either read-only, allowing only queries against the stored data, or readable and writable, allowing the stored data to be modified as well. When a SPARQL endpoint is mentioned without specifying which SPARQL protocol is used, an endpoint supporting the SPARQL 1.1 Query Language is meant.
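To make the endpoint terminology concrete, here is a minimal sketch (in Python, not DSpace code) of how a client talks to a SPARQL 1.1 Query Language endpoint: the query travels as an HTTP parameter and the Accept header selects a results format. The endpoint address is a placeholder; substitute your repository's public endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address -- substitute your repository's public SPARQL endpoint.
ENDPOINT = "http://localhost:3030/dspace/sparql"

def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build an HTTP GET request for a SPARQL 1.1 Query Language endpoint.

    The query is sent as the "query" URL parameter; the Accept header asks
    for the standard JSON results serialization.
    """
    return Request(
        endpoint + "?" + urlencode({"query": query}),
        headers={"Accept": "application/sparql-results+json"},
    )

req = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
# urllib.request.urlopen(req) would send the request; omitted here because
# it requires a running triple store.
```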
...
**Info**: Use Apache mod_proxy, mod_rewrite or any other appropriate web server/proxy to make localhost:3030/dspace/sparql readable from the Internet. Use the address under which it is accessible as the address of your public SPARQL endpoint (see the property rdf.public.sparql.endpoint in the configuration reference below).
The configuration provided within DSpace makes it store the files for the triple store under [dspace-install]/triplestore. Using this configuration, Fuseki provides three SPARQL endpoints: two read-only endpoints and one that can be used to change the data of the triple store. You should not use this configuration if Fuseki is connected to the internet directly, as it would make it possible for anyone to delete, change or add information to the triple store. The option --localhost tells Fuseki to listen only on the loopback device. You can use Apache mod_proxy or any other web or proxy server to make the read-only SPARQL endpoint accessible from the internet.

With the configuration described, Fuseki listens on port 3030 using HTTP. Using the address http://localhost:3030/ you can connect to the Fuseki Web UI. http://localhost:3030/dspace/data addresses a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint, and http://localhost:3030/dspace/get a read-only one. Under http://localhost:3030/dspace/sparql a read-only SPARQL 1.1 Query Language endpoint can be found. The first of these endpoints must not be accessible from the internet, while the last one should be publicly accessible.
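A minimal reverse-proxy sketch for Apache httpd (mod_proxy), assuming you want to expose the read-only query endpoint under the public path /sparql; adapt the path and virtual host to your installation:

```apache
# Expose only the read-only SPARQL 1.1 Query Language endpoint.
# Requires mod_proxy and mod_proxy_http to be enabled.
ProxyPass        /sparql http://localhost:3030/dspace/sparql
ProxyPassReverse /sparql http://localhost:3030/dspace/sparql
```

The writable endpoint (/dspace/data) intentionally gets no proxy rule, so it stays reachable only via the loopback device.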
First, you'll want to ensure the Linked Data endpoint is enabled/configured. In your local.cfg, add `rdf.enabled = true`. You can optionally change its path by setting `rdf.path` (it defaults to "rdf", which means the Linked Data endpoint is available at [dspace.server.url]/rdf/, where `dspace.server.url` is also specified in your local.cfg).
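Putting those settings together, a minimal local.cfg fragment might look like this (the server URL is a placeholder for your own):

```properties
# local.cfg
dspace.server.url = http://localhost:8080/server
rdf.enabled = true
# Optional; "rdf" is the default, so the endpoint is served at
# ${dspace.server.url}/rdf
rdf.path = rdf
```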
In the file [dspace]/config/dspace.cfg you should look for the property event.dispatcher.default.consumers and add rdf there. Adding rdf there makes DSpace update the triple store automatically as the publicly available content of the repository changes.
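For example (the other consumers shown are illustrative; keep whatever consumer list your installation already defines and append rdf):

```properties
# dspace.cfg: append "rdf" to the existing consumer list
event.dispatcher.default.consumers = versioning, discovery, eperson, rdf
```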
As the Linked Data support of DSpace is highly configurable, this section gives a short list of things you probably want to configure before using it. Below you can find more information on what is possible to configure.
In the file [dspace-source]/dspace/config/modules/rdf.cfg you want to configure the address of the public SPARQL endpoint and the address of the writable endpoint DSpace uses to connect to the triple store (the properties rdf.public.sparql.endpoint and rdf.storage.graphstore.endpoint). In the same file you want to configure the URL that addresses the dspace-rdf module, which depends on where you deployed it (property rdf.contextPath), and switch content negotiation on (set property rdf.contentNegotiation.enable = true).
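Taken together, a sketch of the relevant rdf.cfg lines (the public hostname is a placeholder; use the proxied address of your SPARQL endpoint):

```properties
# [dspace-source]/dspace/config/modules/rdf.cfg
rdf.contentNegotiation.enable = true
rdf.contextPath = ${dspace.server.url}/rdf
rdf.public.sparql.endpoint = http://repo.example.org/sparql
rdf.storage.graphstore.endpoint = http://localhost:3030/dspace/data
```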
In the file [dspace-source]/dspace/config/modules/rdf/constant-data-general.ttl
you should change the links to the Web UI of the repository and the public readable SPARQL endpoint. The URL of the public SPARQL endpoint should point to a URL that is proxied by a webserver to the Triple Store. See the section Install a Triple Store above for further information.
In the file [dspace-source]/dspace/config/modules/rdf/constant-data-site.ttl
you may add any triples that should be added to the description of the repository itself.
If you want to change the way the metadata fields are converted, take a look into the file [dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl. This is also the place to add information on how to map metadata fields that you added to DSpace. There is already a quite acceptable default configuration for the metadata fields which DSpace supports out of the box. If you want to use some specific prefixes in RDF serializations that support prefixes, you have to edit [dspace-source]/dspace/config/modules/rdf/metadata-prefixes.ttl.
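As a sketch, prefix declarations in metadata-prefixes.ttl are ordinary Turtle @prefix lines; the two namespaces below are common examples, not required entries:

```turtle
@prefix dc:      <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
```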
...
Property: | rdf.enabled |
Example Value: | rdf.enabled = true |
Informational Note: | Defines whether the RDF endpoint is enabled or disabled (disabled by default). If enabled, the RDF endpoint is available at ${dspace.server.url}/${rdf.path}. Changing this value requires rebooting your servlet container (e.g. Tomcat) |
Property: | rdf.path |
Example Value: | rdf.path = rdf |
Informational Note: | Defines the path of the RDF endpoint, if enabled. For example, a value of "rdf" (the default) means the RDF interface/endpoint is available at ${dspace.server.url}/rdf (e.g. if "dspace.server.url = http://localhost:8080/server", then it'd be available at "http://localhost:8080/server/rdf"). Changing this value requires rebooting your servlet container (e.g. Tomcat) |
Property: | rdf.contentNegotiation.enable |
Example Value: | rdf.contentNegotiation.enable = true |
Informational Note: | Defines whether content negotiation should be activated. Set this to true if you use the Linked Data support. |
Property: | rdf.contextPath |
Example Value: | rdf.contextPath = ${dspace.server.url}/rdf |
Informational Note: | The content negotiation needs to know where to refer if anyone asks for RDF serializations of content stored within DSpace. This property sets the URL where the dspace-rdf module can be reached on the Internet (depending on how you deployed it). |
Property: | rdf.public.sparql.endpoint |
Example Value: | rdf.public.sparql.endpoint = ${dspace.server.url}/sparql |
Informational Note: | Address of the read-only public SPARQL endpoint supporting SPARQL 1.1 Query Language. |
Property: | rdf.storage.graphstore.endpoint |
Example Value: | rdf.storage.graphstore.endpoint = http://localhost:3030/dspace/data |
Informational Note: | Address of a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint. This address is used to create, update and delete converted data in the triple store. If you use Fuseki with the configuration provided as part of DSpace, you can leave this as it is. If you use another Triple Store or configure Fuseki on your own, change this property to point to a writable SPARQL endpoint supporting the SPARQL 1.1 Graph Store HTTP Protocol. |
Property: | rdf.storage.graphstore.authentication |
Example Value: | rdf.storage.graphstore.authentication = no |
Informational Note: | Defines whether to use HTTP Basic authentication to connect to the writable SPARQL 1.1 Graph Store HTTP Protocol endpoint. |
Properties: | rdf.storage.graphstore.login rdf.storage.graphstore.password |
Example Values: | rdf.storage.graphstore.login = dspace |
Informational Note: | Credentials for the HTTP Basic authentication, if it is necessary to connect to the writable SPARQL 1.1 Graph Store HTTP Protocol endpoint. |
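For example, enabling HTTP Basic authentication for the writable endpoint might look like this in rdf.cfg (the credentials are placeholders; the password property mirrors the login property):

```properties
rdf.storage.graphstore.authentication = yes
rdf.storage.graphstore.login = dspace
rdf.storage.graphstore.password = changeme
```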
Property: | rdf.storage.sparql.endpoint |
Example Value: | rdf.storage.sparql.endpoint = http://localhost:3030/dspace/sparql |
Informational Note: | Besides a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint, DSpace needs a SPARQL 1.1 Query Language endpoint, which can be read-only. This property allows you to set an address to be used to connect to such a SPARQL endpoint. If you leave this property empty the property ${rdf.public.sparql.endpoint} will be used instead. |
Properties: | rdf.storage.sparql.authentication |
Example Values: | rdf.storage.sparql.authentication = yes |
Informational Note: | As with the SPARQL 1.1 Graph Store HTTP Protocol endpoint, you can configure DSpace to use HTTP Basic authentication to authenticate against the (read-only) SPARQL 1.1 Query Language endpoint. |
Property: | rdf.converter.DSOtypes |
Example Value: | rdf.converter.DSOtypes = SITE, COMMUNITY, COLLECTION, ITEM |
Informational Note: | Defines which kinds of DSpaceObjects should be converted. Bundles and Bitstreams are converted as part of the Item they belong to. Don't add EPersons here unless you really know what you are doing. All converted data is stored in the triple store, which provides a publicly readable SPARQL endpoint, so all data converted into RDF is exposed publicly. Every DSO type you add here must have an HTTP URI to be referenced in the generated RDF, which is another reason not to add EPersons here currently. |
The following properties configure the StaticDSOConverterPlugin.
Properties: | rdf.constant.data.GENERAL rdf.constant.data.COLLECTION rdf.constant.data.COMMUNITY rdf.constant.data.ITEM rdf.constant.data.SITE |
Example Values: | rdf.constant.data.GENERAL = ${dspace.dir}/config/modules/rdf/constant-data-general.ttl |
Informational Note: | These properties define files to read static data from. These data should be in RDF, and by default Turtle is used as serialization. The data in the file referenced by the property ${rdf.constant.data.GENERAL} will be included in every Entity that is converted to RDF. E.g. it can be used to point to the address of the public readable SPARQL endpoint or may contain the name of the institution running DSpace. The other properties define files that will be included if a DSpace Object of the specified type (collection, community, item or site) is converted. This makes it possible to add static content to every Item, every Collection, ... |
The following properties configure the MetadataConverterPlugin.
Property: | rdf.metadata.mappings |
Example Value: | rdf.metadata.mappings = ${dspace.dir}/config/modules/rdf/metadata-rdf-mapping.ttl |
Informational Note: | Defines the file that contains the mappings for the MetadataConverterPlugin. See below the description of the configuration file [dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl. |
Property: | rdf.metadata.schema |
Example Value: | rdf.metadata.schema = file://${dspace.dir}/config/modules/rdf/metadata-rdf-schema.ttl |
Informational Note: | Configures the URL used to load the RDF Schema of the DSpace Metadata RDF mapping Vocabulary. Using a file:// URI makes it possible to convert DSpace content without having an internet connection. The version of the schema has to match the one expected by the code; since DSpace 5.0, version 0.2.0 is used. This schema can also be found at http://digital-repositories.org/ontologies/dspace-metadata-mapping/0.2.0. The newest version of the schema can be found at http://digital-repositories.org/ontologies/dspace-metadata-mapping/. |
Property: | rdf.metadata.prefixes |
Example Value: | rdf.metadata.prefixes = ${dspace.dir}/config/modules/rdf/metadata-prefixes.ttl |
Informational Note: | If you want to use prefixes in RDF serializations that support prefixes, you can define these prefixes in the file referenced by this property. |
The following properties configure the SimpleDSORelationsConverterPlugin.
Property: | rdf.simplerelations.prefixes |
Example Value: | rdf.simplerelations.prefixes = ${dspace.dir}/config/modules/rdf/simple-relations-prefixes.ttl |
Informational Note: | If you want to use prefixes in RDF serializations that support prefixes, you can define these prefixes in the file referenced by this property. |
Property: | rdf.simplerelations.site2community |
Example Value: | rdf.simplerelations.site2community = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasCommunity |
Informational Note: | Defines the predicates used to link from the data representing the whole repository to the top level communities. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.community2site |
Example Value: | rdf.simplerelations.community2site = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfRepository |
Informational Note: | Defines the predicates used to link from the top level communities to the data representing the whole repository. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.community2subcommunity |
Example Value: | rdf.simplerelations.community2subcommunity = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasSubcommunity |
Informational Note: | Defines the predicates used to link from communities to their subcommunities. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.subcommunity2community |
Example Value: | rdf.simplerelations.subcommunity2community = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isSubcommunityOf |
Informational Note: | Defines the predicates used to link from subcommunities to the communities they belong to. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.community2collection |
Example Value: | rdf.simplerelations.community2collection = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasCollection |
Informational Note: | Defines the predicates used to link from communities to their collections. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.collection2community |
Example Value: | rdf.simplerelations.collection2community = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfCommunity |
Informational Note: | Defines the predicates used to link from collections to the communities they belong to. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.collection2item |
Example Value: | rdf.simplerelations.collection2item = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasItem |
Informational Note: | Defines the predicates used to link from collections to their items. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.item2collection |
Example Value: | rdf.simplerelations.item2collection = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfCollection |
Informational Note: | Defines the predicates used to link from items to the collections they belong to. Defining multiple predicates separated by commas will result in multiple triples. |
Property: | rdf.simplerelations.item2bitstream |
Example Value: | rdf.simplerelations.item2bitstream = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasBitstream |
Informational Note: | Defines the predicates used to link from items to their bitstreams. Defining multiple predicates separated by commas will result in multiple triples. |
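Each of the simple-relations properties above accepts a comma-separated list of predicates, and each listed predicate yields one triple. An illustrative sketch of that expansion (plain Python, not DSpace's actual implementation; the resource URIs are hypothetical):

```python
# Illustrative sketch: how a comma-separated predicate list expands into
# one (subject, predicate, object) triple per predicate.
def expand_predicates(subject, predicate_list, obj):
    """Return one triple per predicate in the comma-separated list."""
    return [(subject, p.strip(), obj) for p in predicate_list.split(",")]

# Hypothetical resource URIs; the predicate list mirrors the
# rdf.simplerelations.site2community example value above.
triples = expand_predicates(
    "http://repo.example.org/rdf/resource/site",
    "http://purl.org/dc/terms/hasPart, "
    "http://digital-repositories.org/ontologies/dspace/0.1.0#hasCommunity",
    "http://repo.example.org/rdf/resource/123456789/1",
)
# Two predicates -> two triples sharing the same subject and object.
```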
...