This documentation covers the bleeding edge version of Fedora. Looking for another version? See all documentation.

RDF, as a graph, is inherently unordered, which makes it awkward to translate forms of description that presuppose ordering. The Fedora community is trying several methods for constructing order in a graph, and this page walks through some of them.

The choice of how to represent ordering is up to you. In the following discussion we'll consider an ordered list of authors for an academic research paper. Authors of such papers generally care about the order in which they are listed, so in this scenario we must retain that ordering.

First a caveat: one might be tempted to use a structure that relies on RDF "blank nodes", such as most tools generate by default from the Collection notation in RDF Turtle. As mentioned in Common metadata design patterns, blank nodes are not well-defined in the repository context and should generally not be used in Fedora. So here's an example of what to avoid:

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    <>
      dc:title "Important Academic Research Paper" ;
      # Avoid the following
      dc:creator (<http://example.com/author/Quinn>  #no
                  <http://example.com/author/Alice>  #no
                  <http://example.com/author/Bob>) . #no

That example creates a bunch of blank nodes, which then get mangled going into Fedora. Let's not do that.

Here's a very simple alternative formulation that gets the job done. We use hash-URIs instead of blank nodes.

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    <>
      dc:title "Test title" ;
      dc:creator <#author_1>, <#author_2>, <#author_3> .     
    <#author_1> owl:sameAs <http://example.com/author/Quinn> .
    <#author_2> owl:sameAs <http://example.com/author/Alice> .
    <#author_3> owl:sameAs <http://example.com/author/Bob> . 

In addition to being Fedora-friendly, this formulation is arguably better from a representational standpoint: anybody iterating the triples can query for the paper by author without having to jump through rdf:List hijinks. Of course, consumers that care about the ordering need to parse the number out of the hash part of each URI and sort by that. For plenty of people that's good enough.
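
For consumers that do care about ordering, here is a minimal Python sketch of that parse-and-sort step. It assumes the owl:sameAs triples have already been read into a plain dictionary (the `same_as` name and both helper functions are made up for illustration), and it converts the trailing number to an integer so that `#author_10` sorts after `#author_2` rather than before it.

```python
import re

# Hash-URI -> author URI, as asserted by the owl:sameAs triples above.
# Deliberately listed out of order, since RDF gives no ordering guarantee.
same_as = {
    "#author_3": "http://example.com/author/Bob",
    "#author_1": "http://example.com/author/Quinn",
    "#author_2": "http://example.com/author/Alice",
}


def position_from_fragment(uri):
    """Return the trailing integer of a hash-URI such as '#author_12'."""
    match = re.search(r"(\d+)$", uri)
    if match is None:
        raise ValueError(f"no trailing number in {uri!r}")
    return int(match.group(1))


def authors_by_fragment_number(same_as):
    """Sort the hash-URIs numerically and return the authors they stand for."""
    ordered_keys = sorted(same_as, key=position_from_fragment)
    return [same_as[key] for key in ordered_keys]


for author in authors_by_fragment_number(same_as):
    print(author)
```

A plain string sort would work for up to nine authors and then quietly misorder the tenth, which is why the sketch sorts on the parsed integer.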

What if you wanted to represent ordering as a separate number, rather than parsing the hash-URIs? Here's a slightly more complex version that accomplishes that; here, each hash-URI can be anything you want, and the ordering is stored as a separate triple:

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix schema: <https://schema.org/> .
    <>
      dc:title "Test title" ;
      dc:creator <#xyz>, <#abc>, <#123> .
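    # schema:position records each author's place in the sequence
    # (schema:Order is a class describing purchase orders, not an ordering property)
    <#xyz> owl:sameAs <http://example.com/author/Quinn>; schema:position 1 .
    <#abc> owl:sameAs <http://example.com/author/Alice>; schema:position 2 .
    <#123> owl:sameAs <http://example.com/author/Bob>  ; schema:position 3 .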
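
Reading the order back out is then a join between two sets of triples. The following Python sketch assumes the graph has already been parsed into `(subject, predicate, object)` tuples; the predicate strings, the `ORDER` placeholder, and the function name are illustrative stand-ins rather than part of any Fedora API, so substitute whichever ordering predicate your data actually uses.

```python
SAME_AS = "owl:sameAs"
ORDER = "ex:order"  # hypothetical placeholder for your ordering predicate

# Triples in arbitrary order, mirroring the Turtle example above.
triples = [
    ("#123", SAME_AS, "http://example.com/author/Bob"),
    ("#abc", ORDER, 2),
    ("#xyz", SAME_AS, "http://example.com/author/Quinn"),
    ("#123", ORDER, 3),
    ("#abc", SAME_AS, "http://example.com/author/Alice"),
    ("#xyz", ORDER, 1),
]


def authors_by_position(triples, same_as=SAME_AS, order=ORDER):
    """Return author URIs sorted by the number attached to each hash-URI."""
    position = {s: o for s, p, o in triples if p == order}
    target = {s: o for s, p, o in triples if p == same_as}
    missing = set(target) - set(position)
    if missing:
        raise ValueError(f"no position recorded for {sorted(missing)}")
    return [target[s] for s in sorted(target, key=position.__getitem__)]


for author in authors_by_position(triples):
    print(author)
```

Because the position lives in its own triple, the hash part of each URI never has to be parsed, and renumbering the list does not change any resource identifiers.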

If you prefer, you could go with a PCDM-style proxy ordering approach. This generates the most triples of any approach we've considered, but for some tool chains it makes good sense.

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix iana: <http://www.iana.org/assignments/relation/> .
    @prefix ore: <http://www.openarchives.org/ore/terms/> .
    <>
      dc:title "Test title" ;
      dc:creator <#creators> .
    <#creators> a ore:Aggregation;
                iana:first <#xyz>;
                iana:last <#123> .
    <#xyz> ore:proxyFor <http://example.com/author/Quinn>; 
           ore:proxyIn <#creators>;
           iana:next <#abc> .
    <#abc> ore:proxyFor <http://example.com/author/Alice>; 
           ore:proxyIn <#creators>;
           iana:prev <#xyz> ;
           iana:next <#123> .
    <#123> ore:proxyFor <http://example.com/author/Bob>; 
           ore:proxyIn <#creators>;
           iana:prev <#abc> .
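
A consumer of this structure recovers the order by starting at iana:first and following iana:next until it runs out. Here is a small Python sketch of that walk, assuming the triples are available as plain tuples with prefixed names kept as strings; the helper names are hypothetical, and the cycle check is a defensive addition rather than something the model itself requires.

```python
# Triples in arbitrary order, mirroring the proxy example above.
triples = [
    ("#abc", "iana:next", "#123"),
    ("#creators", "iana:first", "#xyz"),
    ("#123", "ore:proxyFor", "http://example.com/author/Bob"),
    ("#xyz", "iana:next", "#abc"),
    ("#creators", "iana:last", "#123"),
    ("#xyz", "ore:proxyFor", "http://example.com/author/Quinn"),
    ("#abc", "ore:proxyFor", "http://example.com/author/Alice"),
]


def one_object(triples, subject, predicate):
    """Return the first object found for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def walk_proxies(triples, aggregation):
    """Follow iana:first, then iana:next, collecting each proxy's target."""
    authors = []
    seen = set()
    proxy = one_object(triples, aggregation, "iana:first")
    while proxy is not None:
        if proxy in seen:
            raise ValueError(f"cycle detected at {proxy}")
        seen.add(proxy)
        authors.append(one_object(triples, proxy, "ore:proxyFor"))
        proxy = one_object(triples, proxy, "iana:next")
    return authors


for author in walk_proxies(triples, "#creators"):
    print(author)
```

The iana:prev and iana:last triples are not needed for a forward walk; they exist so that a consumer can traverse the list backwards or jump straight to the end.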

As you can see there are many ways to do this, depending on how you want to model your metadata. Let's consider one final version in which we model authorship as an event which records not just the author involved but also the date of their involvement.

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix ex: <http://example.com/relations/> .
    <>
      dc:title "Test title" ;
      dc:creator <#authA>, <#authB>, <#authC> .
    <#authA> a ex:Authorship;
             ex:occurredOn "2016-05-15";
             ex:authorInvolved <http://example.com/author/Quinn>;
             ex:followingAuthor <#authB> .
    <#authB> a ex:Authorship;
             ex:occurredOn "2016-05-15";
             ex:authorInvolved <http://example.com/author/Alice>;
             ex:followingAuthor <#authC> .
    <#authC> a ex:Authorship;
             ex:occurredOn "2016-05-15";
             ex:authorInvolved <http://example.com/author/Bob> .
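
Unlike the proxy approach, this model has no explicit pointer to the first element, so a consumer has to work out where the chain starts. One way to do that, sketched below in Python under the assumption that the triples are held as plain tuples, is to pick the one authorship that no ex:followingAuthor triple points at and then follow the chain from there. The function name is made up for illustration.

```python
# Triples in arbitrary order, mirroring the authorship example above.
triples = [
    ("#authB", "ex:authorInvolved", "http://example.com/author/Alice"),
    ("#authB", "ex:followingAuthor", "#authC"),
    ("#authC", "ex:authorInvolved", "http://example.com/author/Bob"),
    ("#authA", "ex:authorInvolved", "http://example.com/author/Quinn"),
    ("#authA", "ex:followingAuthor", "#authB"),
]


def authors_from_chain(triples):
    """Find the head of the ex:followingAuthor chain and walk it in order."""
    author = {s: o for s, p, o in triples if p == "ex:authorInvolved"}
    following = {s: o for s, p, o in triples if p == "ex:followingAuthor"}
    # The first authorship is the only one that nothing else points to.
    heads = set(author) - set(following.values())
    if len(heads) != 1:
        raise ValueError(f"expected exactly one first authorship, found {len(heads)}")
    ordered = []
    node = heads.pop()
    # Stop at the end of the chain, or after visiting every authorship once.
    while node in author and len(ordered) < len(author):
        ordered.append(author[node])
        node = following.get(node)
    return ordered


for name in authors_from_chain(triples):
    print(name)
```

If finding the head this way feels fragile, adding an explicit first-author pointer to the paper resource would remove the need for it.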


3 Comments

  1. Thank you! This is excellent.

  2. I don't think the example with ex:Authorship is correct. The dc:creators of the work should be the persons, not the Authorships. That should not change merely because we add ordering. Then the Authorships should have an explicit relation with the work at hand. And it would be nice to have some closure of the list. So I would say something like this:

        <> dc:creator <http://example.com/author/Quinn>, <http://example.com/author/Alice>, <http://example.com/author/Bob> .
        <#authA> a ex:Authorship;
                 ex:occurredOn "2016-05-15";
                 ex:authorInvolved <http://example.com/author/Quinn>;
                 ex:workInvolved <>;
                 ex:followingAuthor <#authB> .
        <#authB> a ex:Authorship;
                 ex:occurredOn "2016-05-15";
                 ex:authorInvolved <http://example.com/author/Alice>;
                 ex:workInvolved <>;
                 ex:followingAuthor <#authC> .
        <#authC> a ex:Authorship;
                 ex:occurredOn "2016-05-15";
                 ex:workInvolved <>;
                 ex:authorInvolved <http://example.com/author/Bob>;
                 ex:followingAuthor rdf:nil .

    The closure with rdf:nil is borrowed from RDF collections. It can be achieved by other means as well, e.g. a boolean ex:isLastAuthor.
    The triples with ex:workInvolved can be replaced using the inverse, say ex:hasAuthorship, with

        <> ex:hasAuthorship <#authA>, <#authB>, <#authC> .

    In principle, the first three statements with dc:creator could be inferred from the rest if you have a proper ontology using propertyChain and other tricks, but some redundancy does not hurt.

    Alternatively (and I think preferably), there is a general Collections Ontology that was designed for cases like this. It has ordered lists without the use of blank nodes, and it is completely neutral about any additional semantics from other ontologies. That is contrary to what's happening here, where PCDM and the author list each do ordering in a different way; incidentally, the PCDM style cannot be used here because of the additional semantics attached to the ORE ontology. In other words, you don't have to reinvent things when you want to apply ordered lists in some other context. See this article: http://semantic-web-journal.net/system/files/swj506.pdf . The ontology namespace (commonly prefixed co: ) is http://purl.org/co/ and that's also where the documentation should be, but unfortunately there seems to be a problem with purl.org lately.

    The example, rewritten with the collections ontology:

        <> ex:hasCreatorList <#creatorList> .
        <#creatorList> a co:List;
                       co:firstItem <#item1>;
                       co:lastItem <#item3> .
        <#item1> a co:ListItem;
                 co:itemContent <http://example.com/author/Quinn>;
                 co:nextItem <#item2> .
        <#item2> a co:ListItem;
                 co:itemContent <http://example.com/author/Alice>;
                 co:nextItem <#item3> .
        <#item3> a co:ListItem;
                 co:itemContent <http://example.com/author/Bob> .

    Again, given an appropriate ontology for ex: (by first defining co:nextItem as a subProperty of some transitive property and using that property and ex:hasCreatorList in a propertyChain that is a subProperty of dcterms:creator, and some additional fiddling to account for the first author), it should be possible to infer from this

        <> dcterms:creator <http://example.com/author/Quinn>,
                           <http://example.com/author/Alice>,
                           <http://example.com/author/Bob> .

    but of course, this can also be added explicitly.
    (I prefer to use dcterms instead of dc when linking to resources)

    This is the approach we will follow for generating RDF at DataCite.
    In this case, the ex: prefix in ex:hasCreatorList refers to http://purl.org/spar/datacite/

    Side note: the collections ontology (which also includes optional index numbers) could also be used for ordering in the PCDM model. Personally, I think that would have given a cleaner and neater version of PCDM.