Consider a bibliographic or intellectual object with one or more associated digital content items (e.g. images).  How might such an object be modeled in Fedora?

If the content items are all format variants of the same object, it may make sense to bundle the entire thing up into a single object in Fedora:

object = {description, tif, jpeg};

The content items might be separate datastreams, each potentially with a checksum and a format URI.
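As a rough illustration of the single-object approach, the sketch below models one object holding several datastreams, each carrying a checksum and a format URI. It is not Fedora's API: the class names (FedoraObject, Datastream), the PID "demo:1", the choice of SHA-256, and the example.org format URIs are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Datastream:
    """One content item inside an object (hypothetical model, not Fedora's API)."""
    dsid: str
    content: bytes
    format_uri: Optional[str] = None
    checksum: str = field(init=False, default="")

    def __post_init__(self) -> None:
        # Assumption: SHA-256 is used as the checksum algorithm.
        self.checksum = hashlib.sha256(self.content).hexdigest()


@dataclass
class FedoraObject:
    """A single object bundling a description and its format variants."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)

    def add(self, ds: Datastream) -> None:
        self.datastreams[ds.dsid] = ds


# object = {description, tif, jpeg}
obj = FedoraObject("demo:1")  # hypothetical PID
obj.add(Datastream("description", b"<dc>...</dc>"))
obj.add(Datastream("tif", b"fake tiff bytes", "http://example.org/formats/tiff"))
obj.add(Datastream("jpeg", b"fake jpeg bytes", "http://example.org/formats/jpeg"))
```

Because every datastream here is a variant of the same thing, the datastream identifier only needs to name the format; nothing else has to be inferred from it.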

What if the content items include distinct content instead of (or as well as) format variation?  One might pursue the same strategy, but it would suddenly require many more assumptions:

object = {description, tif1, tif2, jpeg1};

with the understanding that datastream names themselves are significant and indicate associations.  One might also normalize the relationship of the content and the description:

object1 = {description, structure};

object2 = {tif, jpeg};

object3 = {tif};

<object1> <fedora-rels-ext:describes> <object2>;

<object1> <fedora-rels-ext:describes> <object3>;
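The relationships above can be sketched as plain subject/predicate/object tuples. This is a minimal in-memory illustration, not a query against a real triple store; the helper names described_by and descriptions_of are hypothetical, and the predicate string simply copies the notation used in this page.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Predicate copied from the notation above; treated here as an opaque string.
DESCRIBES = "fedora-rels-ext:describes"

triples: List[Triple] = [
    ("object1", DESCRIBES, "object2"),
    ("object1", DESCRIBES, "object3"),
]


def described_by(subject: str, graph: List[Triple]) -> List[str]:
    """Return the content objects that `subject` describes."""
    return [o for s, p, o in graph if s == subject and p == DESCRIBES]


def descriptions_of(obj: str, graph: List[Triple]) -> List[str]:
    """Return the description objects pointing at `obj` (the inverse lookup)."""
    return [s for s, p, o in graph if o == obj and p == DESCRIBES]


print(described_by("object1", triples))
print(descriptions_of("object3", triples))
```

Note that the triples alone recover only which content objects belong to the description, not their order or role within the intellectual object.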

This would require some accompanying structural metadata, stored somewhere, to elaborate how the original intellectual object should be reconstituted beyond the simple describes/hasDescription relationships.

There may also be a problem of technical metadata for content items (e.g. a MIX file). Barring the inclusion of a contentDigest-like reference within a datastream description, the scheme above would allow only a single datastream of format data per object before reintroducing the naming assumptions we tried to avoid above.  We could introduce yet another layer of indirection by moving the technical metadata into additional objects:

object4 = {techMD};

object5 = {techMD};

<object4> <fedora-rels-ext:isMetadataFor> <object2#tif>;
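To show what the object#datastream notation implies for a consumer, the sketch below splits such a reference into its object and datastream parts and looks up the technical metadata objects for a given datastream. The fragment syntax is taken from the line above; Target, parse_target, and techmd_for are hypothetical names, and only the one relationship stated above is included.

```python
from typing import List, NamedTuple, Optional, Tuple


class Target(NamedTuple):
    pid: str
    dsid: Optional[str]  # None when the reference names a whole object


def parse_target(ref: str) -> Target:
    """Split 'object2#tif' into ('object2', 'tif'); no '#' means no datastream."""
    pid, sep, dsid = ref.partition("#")
    return Target(pid, dsid if sep else None)


IS_METADATA_FOR = "fedora-rels-ext:isMetadataFor"

relations: List[Tuple[str, str, str]] = [
    ("object4", IS_METADATA_FOR, "object2#tif"),
]


def techmd_for(pid: str, dsid: str, graph: List[Tuple[str, str, str]]) -> List[str]:
    """Return the metadata objects that point at one specific datastream."""
    wanted = Target(pid, dsid)
    return [s for s, p, o in graph
            if p == IS_METADATA_FOR and parse_target(o) == wanted]


print(techmd_for("object2", "tif", relations))
```

The extra parsing step is the cost of this design: the relationship's object is no longer a plain object identifier, so every consumer has to understand the fragment convention.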

Does this begin to make the graph too complex?  Is the association of multiple datastreams the quality that defines an atomic Fedora object? Some validation/format data may rightly be displaced to a format validation service if one emerges, but some of it seems inevitably required to fulfill long-term archiving requirements.