Expectations

This proposal is based on the premise that changes to DSpace metadata characteristics must be backward compatible and retain existing functionality, to ease the transition for all existing users of the platform.  So many functional areas of DSpace rely on existing metadata functionality that it is critical that any change in functionality also come with well-defined, scripted updates across releases.

Primary Objective

The primary objective of this proposal is that the DSpace metadata registry be "naturally" extended to support a richer and more expressive "Metadata Schema". The technical objectives of the proposal are to provide the following features:

  1. Capability to Define "Metadata Profiles" for specific DSpace Objects and/or types of Objects.
  2. Capability to Define DCMI "subPropertyOf" relationships outside of the legacy ns.element.qualifier approach
  3. Capability to have "immutable" DC, DCTERMS and other "well established" namespaces
  4. Capability to Validate Existing DSpace Item Metadata based on a profile that is assigned either via the parent container or directly to the DSpace Item
  5. Capability to Apply these profiles similarly to DSpace Communities, Collections, Items, Bundles and Bitstreams.

Another critical requirement of this proposal is that the new Schema model support the above features without any significant need to transform existing DSpace Item metadata or the registry itself.

Conceptual Definition of "Schema"

The DSpace MetadataSchema registry was designed based on an outdated concept of "Application Profiles" and "Qualified Dublin Core" that predated the current DCMI Abstract Model.  As a result, there are a number of significant shortcomings in the current implementation.

  1. Namespaces are not "Schema".
  2. Qualification does not effectively meet the need to use alternative namespaces while still providing clear mappings to DC for exposing metadata in OAI_DC.
  3. The Schema and Fields defined are insufficient to support validation of DSpace metadata fields in relation to Item Submission or other methods of Deposit.

The current "DSpace Schema" does not meet the requirements that a Schema is traditionally used for.  A Schema traditionally defines a scaffolding or framework of rules against which actual content can be validated. While the current MetadataSchema/Field does restrict what can be assigned to an item in DSpace, it provides no support for validating these assignments, and it does not allow us to define the encoding of metadata values or whether they are required.  At present, much of the validation, rules and encoding is instead poorly placed at the UI/Presentation level, in the DSpace Submission input-forms.xml file, and is only enforced in the Describe Step of the Submission workflow.

This proposal seeks to extend the definition of the DSpace Metadata Schema to include support for these features, previously found only in the Submission input-forms.xml, and so to formalize a strategy for metadata validation as a new core feature of DSpace.
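To make the idea of schema-level validation concrete, the following is a minimal sketch, not a proposed implementation. It assumes each field definition carries an encoding and a required flag, as the proposal suggests. The `FieldRule` class, the `ENCODINGS` table, the `validate` function and the regular expressions are all hypothetical names and simplifications introduced for illustration; Python is used only for brevity.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Simplified W3CDTF check (assumption): YYYY, YYYY-MM, YYYY-MM-DD, or a full timestamp.
W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)
# Very loose URI check (assumption): a scheme, a colon, and a non-empty remainder.
URI = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:\S+$")

# Hypothetical registry of encoding checkers, keyed by encoding name.
ENCODINGS: Dict[str, Callable[[str], bool]] = {
    "W3CDTF": lambda value: W3CDTF.match(value) is not None,
    "URI": lambda value: URI.match(value) is not None,
    "Literal": lambda value: True,
}


@dataclass(frozen=True)
class FieldRule:
    """One field definition: the field name, its encoding, and whether it is required."""
    field: str
    encoding: str = "Literal"
    required: bool = False


def validate(metadata: Dict[str, List[str]], rules: List[FieldRule]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the item is valid."""
    errors: List[str] = []
    for rule in rules:
        values = metadata.get(rule.field, [])
        if rule.required and not values:
            errors.append(f"{rule.field}: required field is missing")
        check = ENCODINGS[rule.encoding]
        for value in values:
            if not check(value):
                errors.append(f"{rule.field}: {value!r} is not valid {rule.encoding}")
    return errors


rules = [
    FieldRule("dc.title", required=True),
    FieldRule("dc.date.issued", encoding="W3CDTF", required=True),
    FieldRule("dc.identifier.uri", encoding="URI", required=True),
]
item = {"dc.title": ["A Title"], "dc.date.issued": ["15 March 2012"]}
for problem in validate(item, rules):
    print(problem)
```

Because the rules live with the field definitions rather than in a submission form, the same check could in principle run for any deposit path, not only the Describe Step.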

Repurposing of MetadataSchema and MetadataField as Custom Metadata Template

Rather than having MetadataSchema apply to the namespace of the metadata fields allowed by the repository, we recommend that this table be repurposed to embody "templates" of MetadataFields that should be used for specific types of DSpace Objects.   Typing would be based on:

The above types will be expressed through properties added to the MetadataSchemaRegistry and MetadataFieldRegistry tables, providing the facility to expand on existing Schema and to add new ones.  Some hypothetical examples of such schema would be:

The above profiles could be applied heterogeneously through metadata attached to any level of the DSpace object hierarchy.
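One way to read "assigned via the parent container or directly to the DSpace Item" is that an object uses its own profile when it has one and otherwise inherits the nearest profile found on a parent. The sketch below illustrates that reading only. The `DSpaceObject` class, the `effective_profile` function and the profile names ("default", "thesis", "dataset") are hypothetical and are not part of the DSpace API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DSpaceObject:
    """Stand-in for a Community, Collection, Item, Bundle or Bitstream (hypothetical)."""
    name: str
    parent: Optional["DSpaceObject"] = None
    profile: Optional[str] = None  # name of a profile assigned directly to this object


def effective_profile(obj: DSpaceObject) -> Optional[str]:
    """Walk up the hierarchy and return the nearest assigned profile, if any."""
    node: Optional[DSpaceObject] = obj
    while node is not None:
        if node.profile is not None:
            return node.profile
        node = node.parent
    return None


community = DSpaceObject("Library", profile="default")
collection = DSpaceObject("Theses", parent=community, profile="thesis")
inherited_item = DSpaceObject("Item 1", parent=collection)
overriding_item = DSpaceObject("Item 2", parent=collection, profile="dataset")

print(effective_profile(inherited_item))   # profile comes from the collection
print(effective_profile(overriding_item))  # profile assigned directly to the item
```

Under this assumption, a collection can hold items with different profiles, which is what applying profiles heterogeneously would require.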

Metadata Field Inheritance

Individual Metadata Fields, like DCMI metadata properties, will support subtyping or inheritance. For example, the DCMI website gives the following:

http://dublincore.org/documents/dcmi-terms/#terms-title

Term Name: title
URI: http://purl.org/dc/terms/title
Label: Title
Definition: A name given to the resource.
Type of Term: Property
Refines: http://purl.org/dc/elements/1.1/title
Version: http://dublincore.org/usage/terms/history/#titleT-002
Has Range: http://www.w3.org/2000/01/rdf-schema#Literal

A similar level of refinement for DSpace Metadata can be supported through the addition of new MetadataFieldRegistry properties capable of storing this relationship.
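As an illustration of how a stored "refines" relationship could be used, the sketch below follows the chain from a refined field up to its most general ancestor, which is the kind of lookup needed to expose a field as simple DC in OAI_DC. The `REFINES` mapping and the function names are hypothetical; the sample relationships follow the DCMI term definitions, where dcterms:title refines dc:title and dcterms:alternative refines dcterms:title.

```python
from typing import Dict, List, Optional

# Hypothetical "refines" property: each field maps to the field it refines,
# or to None when it refines nothing.
REFINES: Dict[str, Optional[str]] = {
    "dc:title": None,
    "dcterms:title": "dc:title",
    "dcterms:alternative": "dcterms:title",
}


def ancestors(field: str, refines: Dict[str, Optional[str]]) -> List[str]:
    """Return the fields that `field` refines, nearest first."""
    chain: List[str] = []
    seen = {field}
    parent = refines.get(field)
    while parent is not None:
        if parent in seen:
            # Guard against a misconfigured registry that refines itself.
            raise ValueError(f"cycle in refines chain at {parent}")
        chain.append(parent)
        seen.add(parent)
        parent = refines.get(parent)
    return chain


def most_general(field: str, refines: Dict[str, Optional[str]]) -> str:
    """Return the top of the refines chain, or the field itself if it refines nothing."""
    chain = ancestors(field, refines)
    return chain[-1] if chain else field


print(ancestors("dcterms:alternative", REFINES))
print(most_general("dcterms:alternative", REFINES))
```

Storing the relationship as data means the mapping to DC no longer depends on how the field name happens to be spelled in ns.element.qualifier form.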


The following are some basic  features of the proposal:

In the case of DSpace

 

ID | Field                | refines       | encoding | default | required | Scope Note
---|----------------------|---------------|----------|---------|----------|-----------
15 | dc.date.issued       | dc:date       | W3CDTF   | ${now}  | true     | Date of publication or distribution.
10 | dc.date              | dc:date       | W3CDTF   | ${now}  |          | Use qualified form if possible.
25 | dc.identifier.uri    | dc:identifier | URI      |         | true     | Uniform Resource Identifier
17 | dc.identifier        | dc:identifier | Literal  |         |          | Catch-all for unambiguous identifiers not defined by qualified form; use identifier.other for a known identifier common to a local collection instead of unqualified form.
38 | dc.language.iso      | dc:language   | RFC5646  | en      |          | Current ISO standard for language of intellectual content, including country codes (e.g. "en_US").
37 | dc.language          | dc:language   | RFC5646  | en      |          | Catch-all for non-ISO forms of the language of the item, accommodating harvested values.
44 | dc.relation.haspart  | dc:relation   | URI      |         |          | References physically or logically contained item.
40 | dc.relation          | dc:relation   | URI      |         |          | Catch-all for references to other related items.
62 | dc.subject.mesh      | dc:subject    | URI      |         |          | MEdical Subject Headings
63 | dc.subject.other     | dc:subject    | Literal  |         |          | Local controlled vocabulary; global vocabularies will receive specific qualifier.
57 | dc.subject           | dc:subject    | Literal  |         |          | Uncontrolled index term.
65 | dc.title.alternative | dc:title      | TEXT     |         |          | Varying (or substitute) form of title proper appearing in item, e.g. abbreviation or translation
64 | dc.title             | dc:title      | TEXT     |         | true     | Title statement/title proper.
66 | dc.type              | dc:type       | Class    |         |          | Nature or genre of content.
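The default column in the table above could be applied when an item lacks a value for a field in its profile. The sketch below is one possible interpretation, not defined behaviour: it assumes that `${now}` expands to the current date in W3CDTF form, and that defaults never overwrite existing values. The `DEFAULTS` mapping copies the table's default column; the function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional

# Defaults copied from the "default" column of the table above.
DEFAULTS: Dict[str, str] = {
    "dc.date.issued": "${now}",
    "dc.date": "${now}",
    "dc.language.iso": "en",
    "dc.language": "en",
}


def expand(template: str, now: datetime) -> str:
    """Expand the ${now} placeholder (assumed here to mean today's date, YYYY-MM-DD)."""
    return template.replace("${now}", now.strftime("%Y-%m-%d"))


def apply_defaults(
    metadata: Dict[str, List[str]],
    profile_fields: Iterable[str],
    now: Optional[datetime] = None,
) -> Dict[str, List[str]]:
    """Return a copy of `metadata` with defaults filled in for empty profile fields."""
    moment = now or datetime.now(timezone.utc)
    result = {field: list(values) for field, values in metadata.items()}
    for field in profile_fields:
        # Only fill fields that have a default and no existing value.
        if field in DEFAULTS and not result.get(field):
            result[field] = [expand(DEFAULTS[field], moment)]
    return result


item = {"dc.title": ["A Title"], "dc.language.iso": ["fr"]}
filled = apply_defaults(item, ["dc.date.issued", "dc.language.iso"])
print(filled["dc.date.issued"])   # today's date, filled from ${now}
print(filled["dc.language.iso"])  # existing value is kept
```

Passing the profile's field list explicitly keeps the sketch from filling both dc.date and dc.date.issued on every item; which fields belong to a profile is left open here.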