...

Fedora resources can be structured physically or logically. Generally speaking, the repository will be more performant if resources are structured logically, because the performance impact grows with the number of *direct* children of a shared parent resource. In other words, if a parent resource ("book") has a thousand pages, and each of those pages is structured within Fedora as a direct child of "book", that would probably impose negligible performance impact. However, if "book" had 10,000 pages, then the creation of subsequent pages would likely be noticeably slower.

On the other hand, certain repository operations (e.g. authorization, nested move, nested export) are able to act over a tree of resources. For example, an authorization policy can be defined to apply to all sub-resources within a tree of resources, up to the point where a descendant within the hierarchy overrides the ancestor's policy.
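The inheritance behavior described above can be illustrated with a minimal sketch. It is not Fedora's authorization implementation; the `POLICIES` map, the policy names, and the `effective_policy` helper are all hypothetical and only model the idea that the nearest policy on the path from a resource up to the root wins.

```python
from typing import Dict, Optional

# Hypothetical map of resource path -> policy name.
# Only resources with an explicitly defined policy appear here.
POLICIES: Dict[str, str] = {
    "/collection": "staff-only",
    "/collection/book/page2": "public",
}


def effective_policy(path: str, policies: Dict[str, str]) -> Optional[str]:
    """Return the nearest policy found on the resource or its ancestors."""
    current = path.rstrip("/")
    while current:
        if current in policies:
            return policies[current]
        # Drop the last path segment to move up to the parent resource.
        current = current.rsplit("/", 1)[0]
    return None


if __name__ == "__main__":
    # page1 has no policy of its own, so it inherits from /collection.
    print(effective_policy("/collection/book/page1", POLICIES))
    # page2 defines its own policy, which takes precedence over the ancestor's.
    print(effective_policy("/collection/book/page2", POLICIES))
```

A resource outside any policy-bearing tree (for example `/other`) resolves to `None` in this sketch; how a real repository treats that case depends on its authorization configuration.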

...

That is both good and bad, from the perspectives of readable URLs as well as performance. Having an identifier that is semantically meaningful, clearly describing something about the nature of the resource (e.g. this resource is the first page of a specific book), can be nice at first. But invariably, the resource will either have to be relocated within the repository, or migrated to a different system, or in one way or another have the semantics that were embedded in its identifier change over time.

The performance considerations (specifically for the case of adding new resources to the same parent) have been described above.

Generally speaking, within Fedora, opaque identifiers that hold no semantics should be favored over semantically meaningful identifiers. There are justifiable reasons for having structural hierarchies of resources in your repository, but having meaningful identifiers is likely not one of them. Fortunately, when creating a Fedora resource, if no user-provided identifier is included in the request, Fedora provides a default identifier (a pairtree of 4 path elements terminated with a UUID) that is designed to ensure a balanced tree of resources.
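To make the shape of such a default identifier concrete, here is a rough sketch that derives four short path elements from a UUID and ends the path with the UUID itself. The two-character segment width and the `pairtree_path` helper are assumptions for illustration; Fedora's actual minting algorithm may differ in detail.

```python
import uuid


def pairtree_path(identifier: uuid.UUID, segments: int = 4, width: int = 2) -> str:
    """Build a path of `segments` short elements followed by the full UUID.

    The segment width of 2 is an assumption made for this sketch.
    """
    hex_chars = identifier.hex  # 32 hex characters, no hyphens
    parts = [hex_chars[i * width:(i + 1) * width] for i in range(segments)]
    return "/".join(parts + [str(identifier)])


if __name__ == "__main__":
    # A random UUID spreads new resources across many intermediate
    # containers instead of placing them all under one parent.
    print(pairtree_path(uuid.uuid4()))
```

Because the leading path elements come from a random UUID, new resources are distributed across many intermediate parents, which is what keeps the number of direct children per parent small.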

...

Resources (both containers and binaries) can be adorned with properties. A property is effectively a name/value pair. All resources have a set of system properties that are managed by Fedora and not editable by users. Fedora also allows a resource to have an effectively unlimited number of user-defined properties. The "name" of the property in the name/value pair can be any term coming from any namespaced vocabulary (except, of course, from the Fedora system property vocabulary). The "value" in the name/value pair can be a URI or a literal.

The result of these name/value pairs on resources is that when a request is made on a resource (i.e. HTTP GET), the response that is returned is RDF with a set of RDF triples (subject - predicate - object) that further describe the requested resource. The "subject" of those triples is the resource being requested. The "predicates" of those triples are the namespaced terms created as the "name" in the property's name/value pair. The "objects" of those triples are the URIs or literals created as the "values" in the property's name/value pair.
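The mapping from name/value pairs to triples can be sketched as follows. This is an illustration only: the resource URL is hypothetical, the `to_ntriples` helper is not part of any Fedora client, and treating every value that starts with `http://` or `https://` as a URI is a simplification made for the example.

```python
from typing import List, Tuple


def to_ntriples(subject: str, properties: List[Tuple[str, str]]) -> List[str]:
    """Render name/value pairs on a resource as N-Triples-style lines."""
    lines = []
    for predicate, value in properties:
        if value.startswith(("http://", "https://")):
            # Simplification: values that look like URLs become URI objects.
            obj = f"<{value}>"
        else:
            # Everything else becomes a plain literal with basic escaping.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return lines


if __name__ == "__main__":
    resource = "http://localhost:8080/rest/book"  # hypothetical resource URL
    props = [
        ("http://purl.org/dc/elements/1.1/title", "An Example Book"),
        ("http://purl.org/dc/terms/isPartOf", "http://localhost:8080/rest/collection"),
    ]
    for line in to_ntriples(resource, props):
        print(line)
```

In each printed line the subject is the requested resource, the predicate is the property's "name", and the object is the property's "value".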

...

As may be expected, relationships between resources, whether within or external to the repository, are defined like any other resource property, presumably with a URL as the "value" of the property which references the other resource. In the case where the resources within the repository are physically structured, Fedora adds properties that note the containment of the child resources (http://www.w3.org/ns/ldp#contains and http://fedora.info/definitions/v4/repository#hasParent). As a slightly more advanced use, the user can define additional default relationship properties that will be applied to parent and child resources.

...

Pulling all of these considerations together, the simplest approach to the "book" scenario would likely be to create all resources without user-provided identifiers. The "book" would be a container and the pages, thumbnail, and METS files would all be binaries. Relationships between the resources would be established (#hasPage, #nextPage, #isThumbNailOf, etc). To facilitate more robust search, the metadata packed into the METS file could be extracted out into properties. Likewise, using property terms that are defined in commonly used linked data vocabularies will aid in cross-repository, cross-institutional normalization.
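As a rough sketch of this logical model, the snippet below represents a book, its pages, and a thumbnail with opaque identifiers and expresses their relationships as triples. The base URL and the `http://example.org/terms#` namespace are hypothetical, and the identifiers are minted locally with `uuid4` only as a stand-in for the identifiers Fedora would assign on creation.

```python
import uuid
from typing import List, Tuple

BASE = "http://localhost:8080/rest/"  # hypothetical repository base URL
NS = "http://example.org/terms#"      # hypothetical vocabulary namespace

Triple = Tuple[str, str, str]


def opaque_url() -> str:
    """Stand-in for a repository-assigned opaque identifier."""
    return BASE + str(uuid.uuid4())


def model_book(page_count: int) -> Tuple[str, List[Triple]]:
    """Return the book URL and the relationship triples for its parts."""
    book = opaque_url()
    pages = [opaque_url() for _ in range(page_count)]
    thumbnail = opaque_url()

    # The book points at each page; no page URL encodes its position.
    triples: List[Triple] = [(book, NS + "hasPage", page) for page in pages]
    # Page order lives in relationships rather than in identifiers.
    triples += [(a, NS + "nextPage", b) for a, b in zip(pages, pages[1:])]
    triples.append((thumbnail, NS + "isThumbNailOf", book))
    return book, triples


if __name__ == "__main__":
    book_url, relationships = model_book(3)
    for s, p, o in relationships:
        print(s, p, o)
```

Because page order is carried by the #nextPage relationships, pages can be relocated or migrated without invalidating any meaning embedded in their URLs.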

REST Examples