Comment: Migrated to Confluence 5.3

...

*\[set\]*
- <c> file (10 of 10) -- unittitle "Born Digital" *\[set\]*
- - <c> item -- unitid: CM01 *\[item\]*
- - <c> item -- unitid: CM02 *\[item\]*
- - <c> item -- unitid: CM03 *\[item\]*

*\[set\]*
- <c> series -- unittitle "Series 6: Born Digital Materials" *\[set\]*

*\[set\]*
- <c01> series (3 of 7) -- unittitle "Accession 2004-M-088" *\[set\]*
- - <c02> file (28 of 29) -- unittitle "Computer diskettes [3.5 inch]" *\[set\]*
- - - <c03> file -- unitid: 2004-M-088.0001 *\[item\]*
           :
- - - <c03> file -- unitid: 2004-M-088.0027 *\[item\]*

Site

Collection

EAD structure / location of born digital materials [type of hydra object]

notes

Hull

Gallagher

collection [set]

 

Hull

Socialist Health Assoc.

 

 

Stanford

Xanadu

collection


1. Target "born digital" sub-level identified by <unittitle>
2. Collection only described to the container level (hard drives).
3. EAD "item" level  corresponds to target Fedora "item".
4. Item <unitid> is used as a filename stem to bind content files to Hypatia objects.

Stanford

Gould

collection -- unittitle "Stephen Jay Gould papers" unitid: M1437


EAD only goes down to the single "Born digital" series description, with no details expressed at lower levels. A rationalized directory structure and FTK output are intended to support a direct translation into Hypatia objects for both unprocessed and processed views without an intermediary EAD.

Virginia

Cheuse

 

 

Yale

Conn. Oral Histories

 

 

Yale

Love Makes a Family

 

 

Yale

Pelli

 

 

Yale

Tobin

collection

*\[set\]*

1. Target sub-level identified by <unittitle>
2. Collection only described to the container level (diskettes).
3. EAD "item" level  corresponds to target Fedora "item".
4. Item <unitid> is used as a filename stem to bind content files to Hypatia objects.

Yale

Turner

 

 

Yale

Welch

 

 

...

  • They can be used as a form of entity markup, strongly typing references within a longer block of text:
     

    example

    as rendered in browser               

    from

    issue      

    action

    <titleproper>Stephen J. Gould papers
         <num>M1437</num>
    </titleproper>

    Stephen J. Gould papers M1437

    Stanford/Gould

    entity markup disappears for display; would be visible and viable for editing?

    Strip embedded markup

    <langmaterial label="Language(s):">Chiefly in <language langcode="eng" scriptcode="Latn">English</language>; some materials in
    <language langcode="fre" scriptcode="Latn">French</language>.</langmaterial>

    Chiefly in English; some materials in French.

    Yale/Welch

    ibid

    Strip embedded markup

    <unittitle>
         <title render="italic">The Panda's Thumb</title>, galley proof, Chapters 22-31
    </unittitle>

    , galley proof, Chapters 22-31

    Stanford/Gould

    <title> tag sets browser window title; is ignored as part of overall text

    Strip out embedded <title> markup

  • Complex elements in EADs can also be used for display markup:
     

    tag

    example                                                                                                                                     

    as rendered in browser

    found in

    issue

    action

    <p>

    <scopecontent><p>Original series of 4 episodes ...</p>
    <p>SG was series creator and writer ...</p>
    <p>Feature-length pilot and series opener ...</p></scopecontent>

    Original series of 4 episodes ...

    SG was series creator and writer ...

    Feature-length pilot and series opener ...

    everywhere

    Works great, but embedded markup is not desirable

    Drop initial <p> and trailing </p>; otherwise retain <p> markup for short term convenience? It would have to be encoded (e.g., &lt;) and reinterpreted on output. 

    <head>

    <bioghist id="ref141">
        <head>Biography</head>
        <p>When five-year-old Stephen Jay Gould ....</p>

    Biography

    When five-year-old Stephen Jay Gould ...

    everywhere

    Heading displayed with text; treating them as labels is preferred

    Turn <head> into a displayLabel attribute in the corresponding MODS fields where possible.

    <blockquote>

     

     

    none so far

     

     

    <emph>

    <unittitle>Yale University
          <emph render="smcaps">(restricted until January 1, 2024)</emph>
    </unittitle>

    Yale University (restricted until January 1, 2024)

    Yale (numerous)

    non-html markup, ignored/lost

    strip out?

    <list>

    <arrangement id="ref7">
            :
       <list type="ordered">
          <item>
             <ref target="ref11" ns2:type="simple" ns2:actuate="onRequest" ns2:show="replace">Inventory</ref>
          </item>
          <item>
             <ref target="ref92" ns2:type="simple" ns2:actuate="onRequest" ns2:show="replace">Accession 2003-M-005</ref>
          </item>
         <item>
              <ref target="ref123" ns2:type="simple" ns2:actuate="onRequest" ns2:show="replace">Accession 2004-M-088</ref>
         </item>
       </list>
    </arrangement>

    Inventory Accession 2003-M-005 Accession 2004-M-088

    Virginia:Cheuse
       <frontmatter>
    Yale:Tobin
       <archdesc>

    non-html markup, ignored/lost

    Convert data to comma separated list

    <table>

    <table frame="none">
        <tgroup cols="3">
            <colspec colnum="1" colname="1" align="left" colwidth="50pt"/>
            <colspec colnum="2" colname="2" align="left" colwidth="50pt"/>
            <thead>
                <row>
                    <entry colname="1">Family Member</entry>
                    <entry colname="2">Spouse</entry>
                </row>
            </thead>
            <tbody>
                <row>
                    <entry colname="1">John Albee</entry>
                    <entry colname="2">Mary Delaney</entry>
                </row>
            </tbody>
        </tgroup>
    </table>

    Family Member Spouse John Albee Mary Delaney

    none (example from EAD site)

    non-html markup, ignored/lost

    convert to html <table>?
    (defer until encountered?)
    See EAD specs for tabular display

    <address>

    <repository label="Repository:">
        <corpname>Manuscripts and Archives</corpname>
        <address>
             <addressline>Sterling Memorial Library</addressline>
             <addressline>128 Wall Street</addressline>
             <addressline>P.O. Box 208240</addressline>
             <addressline>New Haven, CT 06520</addressline>
             <addressline altrender="email">Email: mssa.faq@yale.edu</addressline>
             <addressline altrender="phone">Phone: (203) 432-1735</addressline>
             <addressline altrender="fax">Fax: (203) 432-7441</addressline>
        </address>
    </repository>

      Manuscripts and Archives Sterling Memorial Library 128 Wall Street P.O. Box 208240 New Haven, CT 06520 Email: mssa.faq@yale.edu Phone: (203) 432-1735 Fax: (203) 432-7441

    Stanford
    (frontmatter)
    Yale
    (archdesc)

     

    ignore <address> in initial conversion

    <bibref>

    <bibliography encodinganalog="3.5.4">
          <bibref>HH Eckstein, The English health service (Harvard, 1959)
    JE Pater, The making of the National Health Service (London, 1981)
    John Stewart (1878-1967), Oxford Dictionary of Biography, Oxford, 2004</bibref>
    </bibliography>

    HH Eckstein, The English health service (Harvard, 1959) JE Pater, The making of the National Health Service (London, 1981) John Stewart (1878-1967), Oxford Dictionary of Biography, Oxford, 2004

    Hull:Socialist
       <frontmatter>

    Implied line breaks are ignored/lost

    Defer; not in converted data

    <title>

    <unittitle>
         <title render="italic">The Panda's Thumb</title>, galley proof, Chapters 22-31
    </unittitle>

    , galley proof, Chapters 22-31

    Stanford:Gould
       (numerous)
    Virginia:Cheuse
       (numerous)
    Yale:(several)
       (numerous)

    <title> tag sets browser window title; is ignored as part of overall text

    Strip out embedded <title> markup
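
The "strip embedded markup" actions in the tables above can be sketched as follows. This is a minimal illustration, not the project's actual converter: it assumes Python's standard `xml.etree.ElementTree`, and `flatten_text` is a hypothetical helper name. It keeps the text of every embedded element (so a `<title>` inside a `<unittitle>` is no longer lost) and collapses whitespace left over from the EAD indentation.

```python
import re
import xml.etree.ElementTree as ET


def flatten_text(xml_fragment: str) -> str:
    """Return the text of an EAD element with all embedded markup removed.

    Hypothetical helper: the text of child elements such as <title>,
    <emph>, <language> or <num> is kept; only the tags are dropped.
    """
    root = ET.fromstring(xml_fragment)
    # itertext() yields the text of the element and all descendants
    # in document order, including the text that follows child elements.
    text = "".join(root.itertext())
    # Collapse line breaks and indentation into single spaces.
    return re.sub(r"\s+", " ", text).strip()


unittitle = """<unittitle>
     <title render="italic">The Panda's Thumb</title>, galley proof, Chapters 22-31
</unittitle>"""
print(flatten_text(unittitle))
```

Because the tags are dropped rather than re-encoded, any rendering hints (for example `render="italic"`) are lost; retaining them would need the encode-and-reinterpret approach raised for `<p>` above.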

...

Issue: Tags that have no mapping into MODS

With one exception, we will map these into Notes, using displayLabel to let them appear with specific labels in the Hypatia display.

  • <scopecontent> -- map to MODS <abstract> per DLF Guidelines.
  • <bioghist> -- map to MODS <note>
  • <custodhist> -- map to MODS <note>
  • <relatedmaterial> -- map to MODS <note>
  • <otherfindaid> -- map to MODS <note>
  • <bibliography> -- map to MODS <note>
  • <processinfo> -- map to MODS <note>

Conversion rule (Stanford): Use of <head> at the beginning of text fields as a labeling convention ...

...

Issue: Stanford <container> conventions and mapping into a MODS "Location" note (revised 10/24/11 to split out Collection title in item record and nest this information in a relatedItem):

We will create a concise representation of the physical/logical location (as appropriate) of the materials in the context of the collection and its hierarchy, recorded in MODS <relatedItem><location><physicalLocation type="location">. It will be a concatenation of the following information:

  • Series and intermediate series, subseries names etc. if present -- e.g., Series 6: Born Digital Materials
  • The container type (box, map case, etc.) and ID -- e.g., Box 11
  • The container label, if present
  • A sub-container type + value, down to the level of the item -- e.g., Folder 3

The Collection name is no longer part of this string; it is carried in a separate <relatedItem>.

Assembles as "Series 6: Born Digital Materials - Box 11 - Folder 3"

Is this generalizable across Stanford collections? Across institutions?

Examples:

Collection

EAD

MODS

Gould

<c id="ref432" level="file">
    <did>
       <unittitle>Gardner, Howard</unittitle>
       <container id="cid57883022" type="Box" label="Mixed materials">4</container>
       <container parent="cid57883022" type="Folder">27</container>

<mods:relatedItem type="host">
   <mods:titleInfo>
      <mods:title>Stephen Jay Gould papers</mods:title>
   </mods:titleInfo>
   <mods:typeOfResource collection="yes"/>
</mods:relatedItem>
<mods:relatedItem type="host">
   <mods:location>
      <mods:physicalLocation type="location">Series 1: Correspondence - Box 4: Mixed materials - Folder 27</mods:physicalLocation>
   </mods:location>
</mods:relatedItem>

was:
<mods:location>
   <physicalLocation displayLabel="Located in">Stephen Jay Gould papers - Series 1: Correspondence - Mixed materials - Box 4</physicalLocation>
</mods:location>

Henson

<c id="ref50" level="item">
     <did>
         <unittitle>CM01</unittitle>
         <unitid>CM01</unitid>
         <container id="cid59523001" type="Carton" label="Computer disks / tapes">11</container>

<mods:relatedItem type="host">
   <mods:titleInfo>
      <mods:title>Keith Henson. Papers relating to Project Xanadu, XOC and Eric Drexler</mods:title>
   </mods:titleInfo>
   <mods:typeOfResource collection="yes"/>
</mods:relatedItem>
<mods:relatedItem type="host">
   <mods:location>
      <mods:physicalLocation type="location">Series 6: Born-Digital Materials - Carton 11: Computer disks / tapes</mods:physicalLocation>
   </mods:location>
</mods:relatedItem>

was:
<mods:location>
   <physicalLocation displayLabel="Located in">Keith Henson. Papers relating to Project Xanadu, XOC and Eric Drexler - Born-Digital Materials - Computer disks / tapes - Carton 11</physicalLocation>
</mods:location>
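
The assembly of the location string in the examples above could be sketched roughly as follows. This is only an illustration under stated assumptions: `location_string` is a hypothetical helper, the series titles are assumed to have been collected already from the enclosing `<c>` levels, and the "Type ID: label" form follows the revised examples rather than any confirmed rule.

```python
import xml.etree.ElementTree as ET


def location_string(series_titles, did_xml):
    """Build a physicalLocation string from series titles and <container>s.

    series_titles: titles of the enclosing series/subseries, outermost first
                   (assumed to be gathered by the caller).
    did_xml:       a well-formed <did> fragment holding <container> elements.
    """
    did = ET.fromstring(did_xml)
    parts = list(series_titles)
    for container in did.findall("container"):
        # Container type plus value, e.g. "Box 4" or "Folder 27".
        part = f"{container.get('type')} {(container.text or '').strip()}"
        # Append the label when present, as in "Box 4: Mixed materials".
        label = container.get("label")
        if label:
            part += f": {label}"
        parts.append(part)
    return " - ".join(parts)


gould_did = """<did>
   <unittitle>Gardner, Howard</unittitle>
   <container id="cid57883022" type="Box" label="Mixed materials">4</container>
   <container parent="cid57883022" type="Folder">27</container>
</did>"""
print(location_string(["Series 1: Correspondence"], gould_did))
```

The sketch relies on document order to place a sub-container after its parent; a fuller version would follow the `parent` attribute explicitly.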

Issue: Derived <mods:location> information

Where all item objects are derived from FTK information about files in a directory, how is this logical/physical location information assembled and presented?

Collection

FTK

MODS

Gould

 

<mods:relatedItem type="host">
   <mods:titleInfo>
      <mods:title>Stephen Jay Gould papers</mods:title>
   </mods:titleInfo>
   <mods:typeOfResource collection="yes"/>
</mods:relatedItem>
<mods:relatedItem type="host">
   <mods:location>
      <mods:physicalLocation type="location">Series 6: Born Digital Materials - [directoryname]</mods:physicalLocation>
   </mods:location>
</mods:relatedItem>

was:
<mods:location>
   <physicalLocation displayLabel="Located in">Stephen Jay Gould papers - Series 6: Born Digital Materials - [directoryname]</physicalLocation>
</mods:location>

Issue: Recursively nested <descgrp>

...

EAD element

MODS element

Notes

Example                                                                                                  

<unittitle>

<titleInfo>
   <title>

• Requires embedded element conversion

 

<origination>
   <persname>
or
   <corpname>
    

<name type="...">
   <namePart>

• EAD/persname maps to MODS <name type="personal">
• EAD/corpname maps to MODS <name type="corporate">
• EAD/origination label=creator (case insensitive) maps to MODS/name <role> sub-element (otherwise ignore; no other values occur in the sample data)
• EAD/origination source attribute maps to MODS/name authority attribute

<origination label="creator">
    <persname rules="aacr" source="ingest">Gould, Stephen Jay</persname>
</origination>

<mods:name type="personal" authority="ingest">
     <mods:namePart>Gould, Stephen Jay</mods:namePart>
     <mods:role>
         <mods:roleTerm authority="marcrelator" type="text">creator</mods:roleTerm>
     </mods:role>
</mods:name>

<repository>

<location>
   <physicalLocation>

Map <repository><corpname> to <location><physicalLocation type="repository">. (Previously mapped to a corporate name with role=repository.)

Ignore other embedded elements, e.g., <address>

<repository>
      <corpname>Stanford University. Department of Special Collections and University Archives</corpname>
</repository>

<mods:location>
    <mods:physicalLocation type="repository">Stanford University. Department of Special Collections and University Archives</mods:physicalLocation>
</mods:location>

was (revised 10/24/11):
<mods:name type="corporate">
      <mods:namePart>Stanford University. Department of Special Collections and University Archives</mods:namePart>
      <mods:role>
         <mods:roleTerm authority="local" type="text">repository</mods:roleTerm>
      </mods:role>
   </mods:name>

No corresponding EAD element

<typeOfResource>

For any Hypatia set created, create an entry indicating a collection.
(Mark top collection only; intermediate series are sets but not collections)

The following values are applicable to born digital materials: "sound recording", "still image", "moving image", "software, multimedia".  Attempt to generate based on format?

<mods:typeOfResource collection="yes"/>

<controlaccess>
   <genreform>

<genre>

• EAD genreform source attribute maps to MODS/genre authority attribute
• DLF guidelines suggest creating separate MODS <genre> elements from this controlaccess subelement -- one at document level and one as a subject entry.

<controlaccess>
    <genreform source="aat">Videorecordings.</genreform>

<mods:genre authority="aat">Videorecordings</mods:genre>

<unitdate>

<originInfo>
   <dateCreated>

If only one <unitdate> is present for a <did>, add attribute keyDate="yes". If more than one <unitdate>, only add keyDate="yes" if EAD type="inclusive".

<mods:originInfo>
     <mods:dateCreated keyDate="yes">1977-1997</mods:dateCreated>
</mods:originInfo>

<langmaterial>
   <language>

<language>
   <languageTerm>

For <langmaterial>
• If no <language> sub-element, map <langmaterial>#PCDATA to <languageTerm type="text">#PCDATA
• else map each <language> sub-element to a separate MODS <languageTerm> element and ignore any #PCDATA

For <language>
• If langcode attribute and no #PCDATA, create
        <languageTerm type="code" authority="iso639-2b"> (Stanford: prefer converting this to text)
• If #PCDATA and no langcode attribute, create
        <languageTerm type="text">
• If both, create only the text form
• Ignore scriptcode

<langmaterial label="Language(s):">The materials are in <language langcode="eng" scriptcode="Latn">English</language>.</langmaterial>

<mods:language>
    <mods:languageTerm type="text">English</mods:languageTerm>
</mods:language>
-----
<langmaterial>
    <language langcode="eng"/>
</langmaterial>

<mods:language>
    <mods:languageTerm authority="iso639-2b" type="code">eng</mods:languageTerm>
</mods:language>

No corresponding EAD element

<physicalDescription>
   <digitalOrigin>

Add a "born digital" indication only for the born digital items in the collection, else omit.

<mods:physicalDescription>
      <mods:digitalOrigin>born digital</mods:digitalOrigin>
</mods:physicalDescription>

<physdesc>
   <extent>

<physicalDescription>
   <extent>

• Each EAD <extent> subelement will become a MODS/extent element
• If EAD <physdesc> has no sub-elements, map its #PCDATA into MODS/extent

<physdesc>
      <extent>1.0 computer media</extent>
      <extent>hard drive</extent>
</physdesc>

<mods:physicalDescription>
      <mods:extent>1.0 computer media</mods:extent>
      <mods:extent>hard drive</mods:extent>
</mods:physicalDescription>
-----
<physdesc label="Physical Characteristics">This collection consists of ca. 3,200 items</physdesc>

<mods:physicalDescription>
      <mods:extent>This collection consists of ca. 3,200 items</mods:extent>
</mods:physicalDescription>

<abstract> or <scopecontent>

<abstract>

Map EAD label attribute to MODS displayLabel attribute

Note: the DLF guidelines suggest that the first paragraph of <scopecontent> could be used as an abstract, but they do not otherwise map <scopecontent>. We recommend a simple, clean mapping of each as described here.

<abstract label="Summary:">The papers consist of correspondence, subject files, and writings, primarily documenting the professional career and personal life of James Tobin as an economist and educator.</abstract>

<mods:abstract displayLabel="Summary:">The papers consist of correspondence, subject files, and writings, primarily documenting the professional career and personal life of James Tobin as an economist and educator.</mods:abstract>

<descgrp>
<bioghist>
<acqinfo>
<prefercite>
<userestrict>
<processinfo>
<note>

<note>

• Requires embedded element conversion
• Ignore a wrapping <descgrp type="admininfo">
• They should be converted to notes in the order encountered in the EAD.
• A leading <head> value should map to the MODS displayLabel attribute, else provide a default displayLabel as follows:

  • <bioghist> = "Biography"
  • <acqinfo> = "Acquisition Information"
  • <prefercite> = "Preferred Citation"
  • <userestrict> = "Use restrictions"
  • <processinfo> = "Processing information"
  • <note> = "Note"

<prefercite id="ref6">
    <head>Cite As</head>
    <p>James Tobin Papers. Manuscripts and Archives, Yale University Library.</p>
</prefercite>

<mods:note displayLabel="Cite As">James Tobin Papers. Manuscripts and Archives, Yale University Library.</mods:note>

<arrangement>

<tableOfContents>

Mapping per DLF guidelines, with default displayLabel of "Arrangement".

<arrangement id="ref206">
     <head>Arrangement note</head>
     <p>The records are arranged in three series: I. Administrative Records,1991-2010. II. Audiovisual Recordings ...</p>
</arrangement>

<tableOfContents displayLabel="Arrangement note">The records are arranged in three series: I. Administrative Records,1991-2010. II. Audiovisual Recordings</tableOfContents>

No corresponding EAD element

<targetAudience>

mapping not applied to sample EADs

 

<odd>

<note>

not found in sample EADs

 

<controlaccess> with
   <corpname>
   <famname>
   <function>
   <genreform>
   <geogname>
   <name>
   <occupation>
   <persname>
   <subject>,
   <title>

<subject> with
   <topic>
   <geographic>
   <temporal>
   <titleInfo>
   <name>
   <genre>
   <hierarchicalGeographic>
   <cartographics>
   <geographicCode>
   <occupation>

Mappings of EAD <controlaccess> subelements to MODS's <subject> subelements:
• EAD <corpname> = MODS <name type="corporate">;
• EAD <famname> = MODS <name type="personal">;
• EAD <function> = MODS <topic> with no @authority attribute on <subject>;
• EAD <genreform> = MODS <genre>;
• EAD <geogname> = MODS <geographic>;
• EAD <name> = MODS <name>;
• EAD <occupation> = MODS <occupation>;
• EAD <persname> = MODS <name type="personal">;
• EAD <subject> = MODS <topic>; and
• EAD <title> = MODS <titleInfo>.

Map EAD source attribute for any <controlaccess> subelement to MODS authority attribute on <subject>.

<controlaccess>
   <persname rules="aacr">Lucas, Arel</persname>
   <corpname rules="dacs" source="ingest">Xanadu Operating Company (XOC)</corpname>
    <subject source="lcsh">Electronic publishing.</subject>
    <genreform source="aat">Videorecordings.</genreform>
    <subject source="lcsh">Word processing.</subject>
</controlaccess>

<mods:subject>
    <mods:name type="personal">
        <mods:namePart>Lucas, Arel</mods:namePart>
    </mods:name>
</mods:subject>
<mods:subject authority="ingest">
    <mods:name type="corporate">
        <mods:namePart>Xanadu Operating Company (XOC)</mods:namePart>
    </mods:name>
</mods:subject>
<mods:subject authority="lcsh">
    <mods:topic>Electronic publishing.</mods:topic>
</mods:subject>
<mods:subject authority="aat">
    <mods:genre>Videorecordings.</mods:genre>
</mods:subject>
<mods:subject authority="lcsh">
    <mods:topic>Word processing.</mods:topic>
</mods:subject>

No corresponding EAD element

<classification>

No mapping in samples

 

No corresponding EAD element

<relatedItem>

No mapping in samples

 

<unitid>

<identifier>

• All mapped to identifier of type=unitid
• EAD label attribute mapped to MODS displayLabel attribute

<unitid>M1437</unitid>

<identifier type="unitid">M1437</identifier>
-----
<unitid label="Call Number:" countrycode="US" repositorycode="US-CtY">MS 1746</unitid>

<identifier type="unitid" displayLabel="Call Number:">MS 1746</identifier>

No corresponding EAD element

<location><url>

No candidate sample data, though conversions could provide useful additions for born digital materials

 

<accessrestrict>

<accessCondition>

• Requires embedded element conversion
• Map <head> subelement to MODS attribute displayLabel
• Apply attribute type="restrictionOnAccess"

<accessrestrict id="ref5713">
    <head>Access to Collection</head>
    <p>Open for research. Audio-visual materials are not available in original format...</p>
</accessrestrict>

<accessCondition type="restrictionOnAccess" displayLabel="Access to Collection">Open for research. Audio-visual materials are not available in original format...</accessCondition>

<userestrict>

<accessCondition>

• Requires embedded element conversion
• Map <head> subelement to MODS attribute displayLabel
• Apply attribute type="useAndReproduction"

<userestrict id="ref5">
    <head>Publication Rights</head>
    <p>Property rights reside with the repository. Literary rights reside with the creators of the document....</p>
</userestrict>

<accessCondition type="useAndReproduction" displayLabel="Publication Rights">Property rights reside with the repository. Literary rights reside with the creators of the document....</accessCondition>
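
The <head>-to-displayLabel rule used by the note rows in the table above might be sketched like this. It is a minimal illustration only: `ead_to_mods_note` is a hypothetical name, the default labels are the ones listed in the table, the fallback label "Note" for unlisted elements is an assumption, and joining multiple `<p>` elements with a space is one possible choice among several.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Default displayLabel values, taken from the mapping table above.
DEFAULT_LABELS = {
    "bioghist": "Biography",
    "acqinfo": "Acquisition Information",
    "prefercite": "Preferred Citation",
    "userestrict": "Use restrictions",
    "processinfo": "Processing information",
    "note": "Note",
}


def ead_to_mods_note(ead_xml: str) -> ET.Element:
    """Convert one EAD note-like element into a <mods:note> element."""
    src = ET.fromstring(ead_xml)
    head = src.find("head")
    label = "".join(head.itertext()).strip() if head is not None else ""
    if not label:
        # Assumption: unlisted elements fall back to the generic "Note".
        label = DEFAULT_LABELS.get(src.tag, "Note")
    # Flatten each <p>, dropping embedded markup and extra whitespace.
    paragraphs = [" ".join("".join(p.itertext()).split())
                  for p in src.findall("p")]
    note = ET.Element(f"{{{MODS_NS}}}note", {"displayLabel": label})
    note.text = " ".join(paragraphs)
    return note


prefercite = """<prefercite id="ref6">
    <head>Cite As</head>
    <p>James Tobin Papers. Manuscripts and Archives, Yale University Library.</p>
</prefercite>"""
print(ET.tostring(ead_to_mods_note(prefercite), encoding="unicode"))
```

The same head-handling step would apply to <accessrestrict> and <userestrict> when they are mapped to <accessCondition>; only the target element and its type attribute differ.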