Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 4.0
Section
Wiki Markup
{center}!worddav76596e990aef6d51434b06ca03579b7a.png|height=123,width=573!{center}
Section
Wiki Markup

{center}
*Fedora Tutorial #1*
*Introduction to Fedora*
*Fedora 3.0* \\ \\
July 23, 2008 \\ \\
{center}

Author: The Fedora Development Team

...

A good discussion of the Fedora digital object model (for Fedora 2 and prior versions) exists in a recent paper (draft) published in the International Journal of Digital Libraries. While some details of this paper have been made obsolete (e.g. Disseminators) by the Content Model Architecture, a refinement of the original Fedora concepts introduced in Version 3.0, the core principles of the model remain the same. The Fedora digital object model is defined in XML schema language (see The Fedora Object XML - FOXML). For more information, also see the Introduction to FOXML in the Fedora System Documentation. A data object in a Fedora repository describes content (data and metadata) and a set of associated behaviors or services that can be applied to that content. Data objects comprise the bulk of a repository.

...

  • Datastream Identifier: an identifier for the Datastream that is unique within the digital object (but not necessarily globally unique)
  • State: the Datastream state of Active, Inactive, or Deleted
  • Created Date: the date/time that the Datastream was created (assigned by the repository service)
  • Modified Date: the date/time that the Datastream was modified (assigned by the repository service)
  • Versionable: an indicator (true/false) as to whether the repository service should version the Datastream. By default the repository versions all Datastreams.
  • Label: a descriptive label for the Datastream
  • MIME Type: the MIME type of the Datastream (required)
  • Format Identifier: an optional format identifier for the Datastream. Examples of emerging schemes are PRONOM and the Global Digital Format Registry (GDRF).
  • Alternate Identifiers: one or more alternate identifiers for the Datastream. Such identifiers could be local identifiers or global identifiers such as Handles or DOI.
  • Checksum: an integrity stamp for the Datastream which can be calculate using one of many standard algorithms (MD5, SHA-1, etc.)
  • Bytestream Content: the "stuff" of the Datastream is about (such as a document, digital image, video, metadata record)
  • Control Group: pertaining the the bytestream content, a new Datastream can be defined as one of four types, or control groups, as follows:
    • Internal XML Metadata - In this case, the Datastream will be stored as XML that is actually stored inline within the digital object XML file. The user may enter text directly into the editing window or data may imported from a file by clicking Import and selecting or browsing to the location of the XML metadata file.
    • Managed Content - In this case, the Datastream content will be stored in the Fedora repository and the digital object XML file will store an internal identifier to that Datastream. To get content, click Import and select or browse to the file location of the import file. Once import is complete, you will see the imported file in a preview box on the screen.
    • External Referenced Content - In this case, the Datastream content will be stored outside of the Fedora repository, and the digital object will store a URL to that Datastream. The Datastream is "by reference" since it is not actually stored inside the Fedora repository. While the Datastream content is stored outside of the Fedora repository, at runtime, when an access request for this type of Datastream is made, the Fedora repository will use this URL to get the content from its remote location, and the Fedora repository will mediate access to the content. This means that behind the scenes, Fedora will grab the content and stream in out the the client requesting the content as if it were served up directly by Fedora. This is a good way to create digital objects that point to distributed content, but still have the repository in charge of serving it up. To create this type of Datastream, specify the URL for the Datastream content in the Location URL text box.
    • Redirect Referenced Content - In this case, the Datastream content is also stored outside the repository and the digital object points to its URL ("by-reference"). However, unlike the External Referenced Content scenario, the Redirect scenario signals the repository to redirect to the URL when access requests are made for this Datastream. This means that the Datastream will not be streamed through the Fedora repository when it is served up. This is beneficial when you want a digital object to have a Datastream that is stored and served up by some external service, and you want the repository to get out of the way when it comes time to serve the content up. A good example is when you want a Datastream to be content that is stored and served by a streaming media server. In such a case, you would want to pass control to the media server to actually stream the content to a client (e.g., video streaming), rather than have Fedora in the middle re-streaming the content out. To create a Redirect Datastream, specify the URL for the content in the Location text box.

Digital Object Model

...

–-- An Access Perspective

Below is an alternative view of a Fedora digital object that shows the object from an access perspective. The digital object contains Datastreams and a set of object properties (simplified for depiction) as described above. A set of access points are defined for the object using the methods described below. Each access point is capable of disseminating a "representation" of the digital object. A representation may be considered a defined expression of part or all of the essential characteristics of the content. In many cases, direct dissemination of a bit stream is the only required access method; in most repository products this is only supported access method. However, Fedora also supports disseminating virtual representations based on the choices of content modelers and presenters using a full range of information and processing resources. The diagram shows all the access points defined for our example object.

...

Digital object relationship metadata is a way of asserting these various kinds of relationships for Fedora objects. A default set of common relationships is defined in the Fedora relationship ontology (actually, a simple RDF schema) which defines a set of common generic relationships useful in creating digital object networks. These relationships can be refined or extended. Also, communities can define their own ontologies to encode relationships among Fedora digital objects. Relationships are asserted from the perspective of one object to another object as in the following general pattern:

...

  • Organize objects into collections to support management, OAI harvesting, and user search/browse
  • Define bibliographic relationships among objects such as those defined in Functional Requirements for Bibliographic Records
  • Define semantic relationships among resources to record how objects relate to some external taxonomy or set of standards
  • Model a network overlay where resources are linked together based on contextual information (for example citation links or collaborative annotations)
  • Encode natural hierarchies of objects
  • Make cross-collection linkages among objects (for example show that a particular document in one collection can also be considered part another collection)

...

Fedora object-to-object metadata is encoded in XML using the Resource Description Framework (RDF). The relationship metadata must follow a prescribed RDF/XML authoring style where the subject is encoded using <rdf:Description>, the relationship is a property of the subject, and the target object is bound to the relationship property using the rdf:resource attribute. The subject and target of a relationship assertion must be URIs that identify Fedora digital objects. These URIs are based on Fedora object PIDs and conform to the syntax described for the fedora "info" URI scheme. The syntax for asserting relationships in RDF is as follows:

...

  1. The subject must be encoded as an <rdf:Description> element, with an "rdf:about" attribute containing the URI of the digital object in which the RELS-EXT Datastream resides. Thus, relationships are asserted about this object only. Relationship directionality is from this object to other objects.
  2. The relationship assertions must be RDF properties associated with the <rdf:Description>. Relationship assertions can be properties defined in the default Fedora relationship ontology, or properties from other namespaces.
  3. Prior to 2.1, the objects of relationships were restricted to other Fedora digital object URIs. This has since been relaxed so that a relationship property may reference any URI or literal, with the following exception: a relationship may not be self-referential, rdf:resource attribute must not point to the URI of the digital object that is the subject of the relationship.
  4. There must be only one <rdf:Description> in the RELS-EXT Datastream. One description can have as many relationship property assertions as necessary.
  5. There must be no nesting of assertions. Specifically, there cannot be an <rdf:Description> within an <rdf:Description>. In terms of XML "depth," the RDF root is considered at the depth of zero. The must be one <rdf:Description> element that must exist at the depth of one. The relationship assertions are RDF properties of the <rdf:Description> that exist at a depth of two.
  6. Assertions of properties from certain namespaces for forbidden in RELS-EXT. There must NOT be any assertion of properties from the Dublin Core namespace or from the FOXML namespace. This is because these assertions exist elsewhere in Fedora objects and may conflict if asserted in two places. The RELS-EXT Datastream is intended to be dedicated to solely object-to-object relationships and not used to make general descriptive assertions about objects.

...