---   THIS PAGE IS IN DRAFT ----

Duke University - Trident Project

At Duke, we have implemented a Fedora-based repository and a web-based editor interface to manage it.  The design is modular and flexible: it can be extended to fit different data models, and it can be configured to work with different descriptive metadata schemas.  It can also be configured to have different metadata requirements per "collection" (e.g., Topical Subject is required for items in collection A but optional for items in collection B).  To make the editor work with different metadata schemas, we decided to make it completely dynamic: it is driven by data from the repository that instructs the editor how to create the metadata editing forms.  Below, I briefly explain what led us to this decision and how it works.  More details are available on our internal project wiki, http://library.duke.edu/trac/dc/wiki/Trident/MetadataApplicationProfile .
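To make the per-collection idea concrete, here is a minimal sketch of how such requirements could be represented and checked.  It is an illustration only, not Trident's actual code: the PROFILES dictionary, the collection and field names, and the missing_required function are all hypothetical, and in the real system this information comes from the repository rather than from a hard-coded table.

```python
# Hypothetical per-collection requirements; a stand-in for data that
# would really be supplied by the repository.
PROFILES = {
    "collection_a": {"topical_subject": {"required": True}},
    "collection_b": {"topical_subject": {"required": False}},
}


def missing_required(collection, metadata):
    """Return the names of required fields that are absent or empty."""
    profile = PROFILES[collection]
    return [
        name
        for name, rules in profile.items()
        if rules["required"] and not metadata.get(name)
    ]


# The same empty record fails in collection A but passes in collection B.
print(missing_required("collection_a", {}))
print(missing_required("collection_b", {}))
```

The point of the sketch is only that the same record can be valid in one collection and incomplete in another, because the rules are looked up per collection instead of being built into the editor.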

Editing XML Metadata in HTML Forms

The first challenge that we face when dealing with XML metadata is making it simple to edit.  XML is certainly readable, and there are editors available that simplify creating well-formed XML.  But in libraries we often deal with complex schemas: XML nodes have attributes, nodes are sometimes nested inside other nodes, some nodes are repeatable and others are not, and some fields' values should be restricted to a defined authority list.  Editing the XML directly leaves a lot open to the interpretation of the person doing the editing.  In my experience, direct editing of the XML requires an intimate knowledge of the schema.

Also, direct editing of the XML is sometimes challenging when the XML is already integrated into a repository system, such as Fedora.  We don't want our catalogers to have to navigate the Fedora Administrative Client in order to edit descriptive metadata. 

Rather, we opt for web interfaces for editing the metadata, and we want our catalogers to use HTML forms to create and update the descriptive metadata.  With web forms, we can simplify metadata entry, provide contextual help for each metadata field, offer select lists or lookups against authority lists, and provide mechanisms for error checking.  So we decided to use HTML forms.

Mapping XML to HTML Forms

Next, we needed to figure out how to map the XML metadata into HTML forms.  HTML forms are very flat.  XML is not necessarily so: nodes can be repeatable, nodes can have attributes, nodes can be optional or required, and nodes can be nested inside other nodes.  All of these challenges had to be considered.

The first step was to map the XML into a relatively flat format so that it could be turned into an HTML form.  So, we decided that we would transform the native XML into a flat XML schema, which we call the Metadata Form schema, so that the Metadata Form could be consumed by the editor interface and turned into an HTML form.  In the Metadata Form, we conceived of fields and field_groups.  A field corresponds to a top-level element in the native XML schema, for instance, dc:title in Dublin Core.  Since top-level elements can have defined attributes as well as further nested elements, we conceived of elements in the Metadata Form that are children of a field.

Let's say for the sake of a simple example, that we are using Dublin Core, and we want to map a dc:title element.

<dc:title type='main'>The main title</dc:title>

In our Metadata Form, this would map to:

<field name='title'>
  <element name='type'>main</element>
  <element name='title'>The main title</element>
</field>
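
The mapping above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree.  This is a minimal illustration under our own assumptions, not the transformation Trident actually uses: the element_to_field function is hypothetical, and it assumes each attribute becomes a child element named after the attribute while the text value becomes a child element named after the field.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def element_to_field(native):
    """Flatten one native XML element into a Metadata Form <field>."""
    # ElementTree stores tags as "{namespace}local"; keep the local part.
    local_name = native.tag.split("}")[-1]
    field = ET.Element("field", {"name": local_name})
    # Each attribute becomes a child <element> named after the attribute.
    for attr_name, attr_value in native.attrib.items():
        child = ET.SubElement(field, "element", {"name": attr_name})
        child.text = attr_value
    # The text value becomes a child <element> named after the field.
    value = ET.SubElement(field, "element", {"name": local_name})
    value.text = native.text
    return field


native = ET.fromstring(
    f"<dc:title xmlns:dc='{DC_NS}' type='main'>The main title</dc:title>"
)
field = element_to_field(native)
print(ET.tostring(field, encoding="unicode"))
```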

We further defined field groups within our Metadata Form to handle repeatable fields.  So adding on to the previous example, let's say we add an alternate title:

<dc:title type='main'>The main title</dc:title>
<dc:title type='alternate'>An alternate title</dc:title>

Using the same Metadata Form mapping:

<field_group name='title'>
  <field>
    <element name='type'>main</element>
    <element name='title'>The main title</element>
  </field>
  <field>
    <element name='type'>alternate</element>
    <element name='title'>An alternate title</element>
  </field>
</field_group>
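
Grouping repeated elements can be sketched as follows.  Again, this is an illustration and not Trident's code: the to_field_groups function and the wrapping record element are hypothetical, and the sketch simply collects sibling elements that share a name into one field_group, in document order.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def to_field_groups(record):
    """Collect same-named child elements into <field_group> elements."""
    groups = {}  # group name -> field_group; dicts keep insertion order
    for native in record:
        local_name = native.tag.split("}")[-1]
        if local_name not in groups:
            groups[local_name] = ET.Element("field_group", {"name": local_name})
        # One <field> per occurrence of the native element.
        field = ET.SubElement(groups[local_name], "field")
        for attr_name, attr_value in native.attrib.items():
            ET.SubElement(field, "element", {"name": attr_name}).text = attr_value
        ET.SubElement(field, "element", {"name": local_name}).text = native.text
    return list(groups.values())


# The <record> wrapper is only there so the sample has a single root.
record = ET.fromstring(
    f"<record xmlns:dc='{DC_NS}'>"
    "<dc:title type='main'>The main title</dc:title>"
    "<dc:title type='alternate'>An alternate title</dc:title>"
    "</record>"
)
for group in to_field_groups(record):
    print(ET.tostring(group, encoding="unicode"))
```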

As we will see later in this discussion, we may want to treat main and alternate titles separately.  It may be the case that the main title is required and not repeatable, while the alternate title is optional and repeatable.  So, this example might be modified to produce a slightly different Metadata Form:

<field_group name='main_title'>
  <field>
    <element name='title'>The main title</element>
  </field>
</field_group>
<field_group name='alternate_title'>
  <field>
    <element name='title'>An alternate title</element>
  </field>
  <field>
    <element name='title'>Another alternate title</element>
  </field>
</field_group>
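
One way to get this split is to route each native element to a group based on its type attribute.  The sketch below assumes a small routing table, GROUP_FOR, and a split_by_type function; both are hypothetical names introduced for illustration, and the real mapping rules would come from the repository's configuration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical routing table: (element name, type attribute) -> group name.
GROUP_FOR = {
    ("title", "main"): "main_title",
    ("title", "alternate"): "alternate_title",
}


def split_by_type(record):
    """Build field_groups, routing elements by their 'type' attribute."""
    groups = {}
    for native in record:
        local_name = native.tag.split("}")[-1]
        routed = GROUP_FOR.get((local_name, native.get("type")))
        group_name = routed or local_name
        if group_name not in groups:
            groups[group_name] = ET.Element("field_group", {"name": group_name})
        field = ET.SubElement(groups[group_name], "field")
        if routed is None:
            # No routing rule applies, so keep attributes as child elements.
            for attr_name, attr_value in native.attrib.items():
                ET.SubElement(field, "element", {"name": attr_name}).text = attr_value
        # When routed, the group name already carries the type.
        ET.SubElement(field, "element", {"name": local_name}).text = native.text
    return list(groups.values())


record = ET.fromstring(
    f"<record xmlns:dc='{DC_NS}'>"
    "<dc:title type='main'>The main title</dc:title>"
    "<dc:title type='alternate'>An alternate title</dc:title>"
    "<dc:title type='alternate'>Another alternate title</dc:title>"
    "</record>"
)
for group in split_by_type(record):
    print(ET.tostring(group, encoding="unicode"))
```

Because the type is folded into the group name, each group can then carry its own rules (for example, required and not repeatable for main_title, optional and repeatable for alternate_title).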

We also don't want to create static HTML forms.  It would be short-sighted to think that our metadata schema will never change, or that we will always have just one descriptive metadata schema within our repository.  And if someone from another university is interested in using our editor tool, we don't want it tied to our specific metadata schema.  So we want the HTML forms to be dynamic, meaning that they are generated on the fly based on the metadata coming from the repository as well as some instruction from the repository on how the metadata
