Comment: Migrated to Confluence 4.0

...

[set]
- <c> file (10 of 10) -- unittitle "Born Digital" [set]
- - <c> item -- unitid: CM01 [item]
- - <c> item -- unitid: CM02 [item]
- - <c> item -- unitid: CM03 [item]

[set]
- <c> series -- unittitle "Series 6: Born Digital Materials" [set]


[set]
<ac:structured - macro ac:name="unmigrated - wiki-markup" ac:schema-version="1" ac:macro-id="0e70032d-484b-4c44-b0a6-d30c25732c0a"><ac:plain-text-body><![CDATA[- - <c02> file (28 of 29) -- unittitle "Computer diskettes [3.5 inch]"

Site

Collection

EAD structure / location of born digital materials [type of hydra object]

notes

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="311bd8b4-15cf-4152-8fa5-3ac21772d4c8"><ac:plain-text-body><![CDATA[

Hull

Gallagher

collection [set]
]]></ac:plain-text-body></ac:structured-macro>

 

Hull

Socialist Health Assoc.

 

 

Stanford

Xanadu

collection


1. Target "born digital" sub-level identified by <unittitle>
2. Collection only described to the container level (hard drives).
3. EAD "item" level corresponds to target Fedora "item".
4. Item <unitid> is used as a filename stem to bind content files to Hypatia objects.
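
The filename-stem binding in point 4 can be sketched as follows. This is a minimal illustration, not the Hypatia implementation: the function name bind_files_to_items and the sample filenames are hypothetical, and it assumes a content file belongs to an item when its name is the unitid itself or the unitid followed by a dot and a suffix.

```python
from collections import defaultdict


def bind_files_to_items(unitids, filenames):
    """Group content filenames under the unitid that serves as their stem.

    Assumption: a file belongs to an item when its name equals the unitid
    or starts with the unitid followed by a dot (e.g. "CM01.dd").
    """
    bound = defaultdict(list)
    # Try longer unitids first, so a unitid that itself contains a dot
    # (e.g. "2004-M-088.0001") is preferred over a shorter prefix of it.
    ordered = sorted(unitids, key=len, reverse=True)
    for name in filenames:
        for unitid in ordered:
            if name == unitid or name.startswith(unitid + "."):
                bound[unitid].append(name)
                break
    return dict(bound)


# Hypothetical content files; "notes.txt" matches no unitid and is left unbound.
files = ["CM01.dd", "CM01.txt", "CM02.dd", "notes.txt"]
result = bind_files_to_items(["CM01", "CM02", "CM03"], files)
```

Files that match no unitid are simply omitted from the result, which makes unbound content easy to report separately.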

Stanford

Gould

collection -- unittitle "Stephen Jay Gould papers" unitid: M1437


EAD only goes down to the single "Born digital" series description, with no details expressed at lower levels. A rationalized directory structure and FTK output are intended to support a direct translation into Hypatia objects for both unprocessed and processed views without an intermediary EAD.

Virginia

Cheuse

 

 

Yale

Conn. Oral Histories

 

 

Yale

Love Makes a Family

 

 

Yale

Pelli

 

 

Yale

Tobin
collection

collection [set]
- <c01> series (3 of 7) -- unittitle "Accession 2004-M-088"

[set]
]]></ac:plain-text-body></ac:structured-macro> [set]
- - - <c03> file -- unitid: 2004-M-088.0001 [item]
           :
- - - <c03> file -- unitid: 2004-M-088.0027 [item]

1. Target sub-level identified by <unittitle>
2. Collection only described to the container level (diskettes).
3. EAD "item" level corresponds to target Fedora "item".
4. Item <unitid> is used as a filename stem to bind content files to Hypatia objects.

Yale

Turner

 

 

Yale

Welch

 

 

...

  • They can be used as a form of entity markup, strongly typing references within a longer block of text:
     

    example

    as rendered in browser               

    from

    issue      

    action

    <titleproper>Stephen J. Gould papers
         <num>M1437</num>
    </titleproper>

    Stephen J. Gould papers M1437

    Stanford/Gould

    entity markup disappears for display; would be visible and viable for editing?

    Strip embedded markup

    <langmaterial label="Language(s):">Chiefly in <language langcode="eng" scriptcode="Latn">English</language>; some materials in
    <language langcode="fre" scriptcode="Latn">French</language>.</langmaterial>

    Chiefly in English; some materials in French.

    Yale/Welch

    ibid

    Strip embedded markup

    <unittitle>
         <title render="italic">The Panda's Thumb</title>, galley proof, Chapters 22-31
    </unittitle>

    , galley proof, Chapters 22-31

    Stanford/Gould

    <title> tag sets browser window title; is ignored as part of overall text

    Strip out embedded <title> markup

  • Complex elements in EADs can also be used for display markup:
     

    tag

    example                                                                                                                                     

    as rendered in browser

    found in

    issue

    action

    <p>

    <scopecontent><p>Original series of 4 episodes ...</p>
    <p>SG was series creator and writer ...</p>
    <p>Feature-length pilot and series opener ...</p></scopecontent>

    Original series of 4 episodes ...

    SG was series creator and writer ...

    Feature-length pilot and series opener ...

    everywhere

    Works great, but embedded markup is not desirable

    Drop initial <p> and trailing </p>; otherwise retain <p> markup for short term convenience? It would have to be encoded (e.g., &lt;) and reinterpreted on output. 

    <head>

    <bioghist id="ref141">
        <head>Biography</head>
        <p>When five-year-old Stephen Jay Gould ....</p>

    Biography

    When five-year-old Stephen Jay Gould ...

    everywhere

    Heading displayed with text; treating them as labels is preferred

    Turn <head> into a displayLabel attribute in the corresponding MODS fields where possible.

    <blockquote>

     

     

    none so far

     

     

    <emph>

    <unittitle>Yale University
          <emph render="smcaps">(restricted until January 1, 2024)</emph>
    </unittitle>

    Yale University (restricted until January 1, 2024)

    Yale (numerous)

    non-html markup, ignored/lost

    strip out?

    <list>

    <arrangement id="ref7">
            :
       <list type="ordered">
          <item>
             <ref target="ref11" ns2:type="simple" ns2:actuate="onRequest" ns2:show="replace">Inventory</ref>
          </item>
          <item>
             <ref target="ref92" ns2:type="simple" ns2:actuate="onRequest" ns2:show="replace">Accession 2003-M-005</ref>
          </item>
         <item>
              <ref target="ref123" ns2:type="simple" ns2:actuate="onRequest" ns2:show="replace">Accession 2004-M-088</ref>
         </item>
       </list>
    </arrangement>

    Inventory Accession 2003-M-005 Accession 2004-M-088

    Virginia:Cheuse
       <frontmatter>
    Yale:Tobin
       <archdesc>

    non-html markup, ignored/lost

    Convert data to comma separated list

    <table>

    <table frame="none">
        <tgroup cols="3">
            <colspec colnum="1" colname="1" align="left" colwidth="50pt"/>
            <colspec colnum="2" colname="2" align="left" colwidth="50pt"/>
            <thead>
                <row>
                    <entry colname="1">Family Member</entry>
                    <entry colname="2">Spouse</entry>
                </row>
            </thead>
            <tbody>
                <row>
                    <entry colname="1">John Albee</entry>
                    <entry colname="2">Mary Delaney</entry>
                </row>
            </tbody>
        </tgroup>
    </table>

    Family Member Spouse John Albee Mary Delaney

    none (example from EAD site)

    non-html markup, ignored/lost

    convert to html <table>?
    (defer until encountered?)
    See EAD specs for tabular display

    <address>

    <repository label="Repository:">
        <corpname>Manuscripts and Archives</corpname>
        <address>
             <addressline>Sterling Memorial Library</addressline>
             <addressline>128 Wall Street</addressline>
             <addressline>P.O. Box 208240</addressline>
             <addressline>New Haven, CT 06520</addressline>
             <addressline altrender="email">Email: mssa.faq@yale.edu</addressline>
             <addressline altrender="phone">Phone: (203) 432-1735</addressline>
             <addressline altrender="fax">Fax: (203) 432-7441</addressline>
        </address>
    </repository>

    Manuscripts and Archives Sterling Memorial Library 128 Wall Street P.O. Box 208240 New Haven, CT 06520 Email: mssa.faq@yale.edu Phone: (203) 432-1735 Fax: (203) 432-7441

    Stanford
    (frontmatter)
    Yale
    (archdesc)

     

    ignore <address> in initial conversion

    <bibref>

    <bibliography encodinganalog="3.5.4">
          <bibref>HH Eckstein, The English health service (Harvard, 1959)
    JE Pater, The making of the National Health Service (London, 1981)
    John Stewart (1878-1967), Oxford Dictionary of Biography, Oxford, 2004</bibref>
    </bibliography>

    HH Eckstein, The English health service (Harvard, 1959) JE Pater, The making of the National Health Service (London, 1981) John Stewart (1878-1967), Oxford Dictionary of Biography, Oxford, 2004

    Hull:Socialist
       <frontmatter>

    Implied line breaks are ignored/lost

    Defer; not in converted data

    <title>

    <unittitle>
         <title render="italic">The Panda's Thumb</title>, galley proof, Chapters 22-31
    </unittitle>

    , galley proof, Chapters 22-31

    Stanford:Gould
       (numerous)
    Virginia:Cheuse
       (numerous)
    Yale:(several)
       (numerous)

    <title> tag sets browser window title; is ignored as part of overall text

    Strip out embedded <title> markup
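
The "strip embedded markup" and "convert data to comma separated list" actions in the tables above could be sketched as below. This is only an illustration under stated assumptions: flatten_text and list_to_csv are hypothetical helper names, and the fragments are assumed to carry no XML namespace (the ns2: attributes in the <list> example are left out here).

```python
import xml.etree.ElementTree as ET


def _squash(text):
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return " ".join(text.split())


def flatten_text(xml_fragment):
    """Return the element's text with all embedded tags removed."""
    element = ET.fromstring(xml_fragment)
    # itertext() yields the text of the element and its descendants in document order.
    return _squash("".join(element.itertext()))


def list_to_csv(xml_fragment):
    """Render each <item> of an EAD <list> as one entry of a comma separated string."""
    element = ET.fromstring(xml_fragment)
    return ", ".join(_squash("".join(item.itertext())) for item in element.iter("item"))


titleproper = """<titleproper>Stephen J. Gould papers
     <num>M1437</num>
</titleproper>"""

unittitle = """<unittitle>
     <title render="italic">The Panda's Thumb</title>, galley proof, Chapters 22-31
</unittitle>"""

ead_list = """<list type="ordered">
   <item><ref target="ref11">Inventory</ref></item>
   <item><ref target="ref92">Accession 2003-M-005</ref></item>
   <item><ref target="ref123">Accession 2004-M-088</ref></item>
</list>"""

flat_title = flatten_text(unittitle)
csv_items = list_to_csv(ead_list)
```

Unlike the browser rendering, flattening the <unittitle> this way keeps the text of the embedded <title> rather than dropping it.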

...

Issue: Stanford <container> conventions and mapping into a MODS "Location" note (revised 10/24/11 to split out Collection title in item record and nest this information in a relatedItem):

We will create a concise representation of the physical or logical location (as appropriate) of the materials in the context of the collection and its hierarchy. It will be recorded as a MODS <relatedItem><location><physicalLocation type="location"> and will be a concatenation of the following information:

...

Assembles as "Series 6: Born Digital Materials - Box 11 - Folder 3"

Is this generalizable, across Stanford collections? across institutions?

Examples:

Collection

EAD

MODS

Gould

<c id="ref432" level="file">
    <did>
       <unittitle>Gardner, Howard</unittitle>
       <container id="cid57883022" type="Box" label="Mixed materials">4</container>
       <container parent="cid57883022" type="Folder">27</container>

<mods:relatedItem type="host">
   <mods:titleInfo>
      <mods:title>Stephen Jay Gould papers</mods:title>
   </mods:titleInfo>
   <mods:typeOfResource collection="yes"/>
</mods:relatedItem>
<mods:relatedItem type="host">
   <mods:location>
      <mods:physicalLocation type="location">Series 1: Correspondence - Box 4: Mixed materials - Folder 27</mods:physicalLocation>
   </mods:location>
</mods:relatedItem>

was:
<mods:location>
   <physicalLocation displayLabel="Located in">Stephen Jay Gould papers - Series 1: Correspondence - Mixed materials - Box 4</physicalLocation>
</mods:location>

Henson

<c id="ref50" level="item">
     <did>
         <unittitle>CM01</unittitle>
         <unitid>CM01</unitid>
         <container id="cid59523001" type="Carton" label="Computer disks / tapes">11</container>

<mods:relatedItem type="host">
   <mods:titleInfo>
      <mods:title>Keith Henson. Papers relating to Project Xanadu, XOC and Eric Drexler</mods:title>
   </mods:titleInfo>
   <mods:typeOfResource collection="yes"/>
</mods:relatedItem>
<mods:relatedItem type="host">
   <mods:location>
      <mods:physicalLocation type="location">Series 6: Born-Digital Materials - Carton 11: Computer disks / tapes</mods:physicalLocation>
   </mods:location>
</mods:relatedItem>

was:
<mods:location>
   <physicalLocation displayLabel="Located in">Keith Henson. Papers relating to Project Xanadu, XOC and Eric Drexler - Born-Digital Materials - Computer disks / tapes - Carton 11</physicalLocation>
</mods:location>
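
The concatenation shown in the two examples above could be sketched like this. The exact rules are elided in this section, so the format used here ("type value: label", joined with " - ") is inferred from the Gould and Henson examples only. The function name location_string is hypothetical, and the series title is passed in by the caller because it comes from an ancestor <c> element rather than from the <did> itself.

```python
import xml.etree.ElementTree as ET


def location_string(series_title, did_xml):
    """Concatenate a series title and the <container> values of a <did>.

    Assumption: each container contributes "<type> <value>", followed by
    ": <label>" when a label attribute is present.
    """
    did = ET.fromstring(did_xml)
    parts = [series_title]
    for container in did.findall("container"):
        piece = f"{container.get('type')} {(container.text or '').strip()}"
        if container.get("label"):
            piece += f": {container.get('label')}"
        parts.append(piece)
    return " - ".join(parts)


gould_did = """<did>
   <unittitle>Gardner, Howard</unittitle>
   <container id="cid57883022" type="Box" label="Mixed materials">4</container>
   <container parent="cid57883022" type="Folder">27</container>
</did>"""

henson_did = """<did>
   <unittitle>CM01</unittitle>
   <unitid>CM01</unitid>
   <container id="cid59523001" type="Carton" label="Computer disks / tapes">11</container>
</did>"""

gould_location = location_string("Series 1: Correspondence", gould_did)
henson_location = location_string("Series 6: Born-Digital Materials", henson_did)
```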

Issue: Derived <mods:location> information

...

Series 6: Born Digital Materials - [directoryname]</physicalLocation>
    </mods:location>
</mods:relatedItem>

was
<mods:location>
   <physicalLocation displayLabel="Located in">Stephen Jay Gould papers - Series 6: Born Digital Materials - [directoryname]</physicalLocation>
</mods:location>

Collection

FTK

MODS

Gould

 

<mods:relatedItem type="host">
   <mods:titleInfo>
      <mods:title>Stephen Jay Gould papers</mods:title>
   </mods:titleInfo>
   <mods:typeOfResource collection="yes"/>
</mods:relatedItem>
<mods:relatedItem type="host">
   <mods:location>
      <mods:physicalLocation type="location">


Issue: Recursively nested <descgrp>

...

EAD element

MODS element

Notes

Example                                                                                                  

<unittitle>

<titleInfo>
   <title>

• Requires embedded element conversion

 

<origination>
   <persname>
or
   <corpname>
    

<name type="...">
   <namePart>

• EAD/persname maps to MODS <name type="personal">
• EAD/corpname maps to MODS <name type="corporate">
• EAD/origination label="creator" (case insensitive) maps to the MODS/name <role> sub-element (otherwise ignore the label; no other values occur in the sample data)
• EAD/origination source attribute maps to MODS/name authority attribute

<origination label="creator">
    <persname rules="aacr" source="ingest">Gould, Stephen Jay</persname>
</origination>

<mods:name type="personal" authority="ingest">
     <mods:namePart>Gould, Stephen Jay</mods:namePart>
     <mods:role>
         <mods:roleTerm authority="marcrelator" type="text">creator</mods:roleTerm>
     </mods:role>
</mods:name>
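
A minimal sketch of the <origination> rules above, using only the Python standard library. The function name origination_to_names is hypothetical. The source attribute is read from the name element first and from <origination> second, since the notes place it on <origination> while the example carries it on <persname>.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

NAME_TYPES = {"persname": "personal", "corpname": "corporate"}


def mods(tag):
    """Return the namespace-qualified form of a MODS tag name."""
    return f"{{{MODS_NS}}}{tag}"


def origination_to_names(origination_xml):
    """Build one MODS <name> element per <persname>/<corpname> child."""
    origination = ET.fromstring(origination_xml)
    is_creator = (origination.get("label") or "").lower() == "creator"
    names = []
    for child in origination:
        if child.tag not in NAME_TYPES:
            continue
        name = ET.Element(mods("name"), {"type": NAME_TYPES[child.tag]})
        source = child.get("source") or origination.get("source")
        if source:
            name.set("authority", source)
        ET.SubElement(name, mods("namePart")).text = (child.text or "").strip()
        if is_creator:
            role = ET.SubElement(name, mods("role"))
            term = ET.SubElement(
                role, mods("roleTerm"), {"authority": "marcrelator", "type": "text"}
            )
            term.text = "creator"
        names.append(name)
    return names


names = origination_to_names(
    '<origination label="creator">'
    '<persname rules="aacr" source="ingest">Gould, Stephen Jay</persname>'
    "</origination>"
)
serialized = ET.tostring(names[0], encoding="unicode")
```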

<repository>

<name>

Map <repository><corpname> to
<location><physicalLocation type="repository">

Ignore other embedded elements, e.g., <address>

<repository>
      <corpname>Stanford University. Department of Special Collections and University Archives</corpname>
</repository>

<mods:location>
    <mods:physicalLocation type="repository">Stanford University. Department of Special Collections and University Archives</mods:physicalLocation>
</mods:location>

was (revised 10/24/11):
<mods:name type="corporate">
      <mods:namePart>Stanford University. Department of Special Collections and University Archives</mods:namePart>
      <mods:role>
         <mods:roleTerm authority="local" type="text">repository</mods:roleTerm>
      </mods:role>
   </mods:name>

No corresponding EAD element

<typeOfResource>

For any Hypatia set created, create an entry indicating a collection.
(Mark top collection only; intermediate series are sets but not collections)

The following values are applicable to born digital materials: "sound recording", "still image", "moving image", "software, multimedia".  Attempt to generate based on format?

<mods:typeOfResource collection="yes"/>

<controlaccess>
   <genreform>

<genre>

• EAD genreform source attribute maps to MODS/genre authority attribute
• DLF guidelines suggest creating separate MODS <genre> elements from this controlaccess subelement -- one at document level and one as a subject entry.

<controlaccess>
    <genreform source="aat">Videorecordings.</genreform>

<mods:genre authority="aat">Videorecordings</mods:genre>

<unitdate>

<originInfo>
   <dateCreated>

If only one <unitdate> is present for a <did>, add attribute keyDate="yes". If there is more than one <unitdate>, add keyDate="yes" only where the EAD type="inclusive".

<mods:originInfo>
     <mods:dateCreated keyDate="yes">1977-1997</mods:dateCreated>
</mods:originInfo>
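
The key-date rule for <unitdate> can be sketched as follows. The function name dates_with_keydate is hypothetical, and the bulk date in the second sample is invented for illustration.

```python
import xml.etree.ElementTree as ET


def dates_with_keydate(did_xml):
    """Return (date_text, is_key_date) pairs for every <unitdate> in a <did>."""
    did = ET.fromstring(did_xml)
    unitdates = did.findall("unitdate")
    only_one = len(unitdates) == 1
    return [
        # A sole <unitdate> is always the key date; otherwise only inclusive dates are.
        ((ud.text or "").strip(), only_one or ud.get("type") == "inclusive")
        for ud in unitdates
    ]


single = dates_with_keydate("<did><unitdate>1977-1997</unitdate></did>")
several = dates_with_keydate(
    '<did><unitdate type="inclusive">1977-1997</unitdate>'
    '<unitdate type="bulk">1980-1990</unitdate></did>'
)
```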

<langmaterial>
   <language>

<language>
   <languageTerm>

For <langmaterial>
• If no <language> sub-element, map <langmaterial>#PCDATA to <languageTerm type="text">#PCDATA
• else map each <language> sub-element to a separate MODS <languageTerm> element and ignore any #PCDATA

For <language>
• If langcode attribute and no #PCDATA, create
        <languageTerm type="code" authority="iso639-2b"> (Stanford: prefer converting this to text)
• If #PCDATA and no langcode attribute, create
        <languageTerm type="text">
• If both, create only the text form
• Ignore scriptcode

<langmaterial label="Language(s):">The materials are in <language langcode="eng" scriptcode="Latn">English</language>.</langmaterial>

<mods:language>
    <mods:languageTerm type="text">English</mods:languageTerm>
</mods:language>
-----
<langmaterial>
    <language langcode="eng"/>
</langmaterial>

<mods:language>
    <mods:languageTerm authority="iso639-2b" type="code">eng</mods:languageTerm>
</mods:language>
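
The <langmaterial> and <language> rules above could be sketched as a function that returns the (type, value) pairs from which MODS <languageTerm> elements would be built. The function name language_terms is hypothetical, and the Stanford preference for converting codes to text is not implemented here.

```python
import xml.etree.ElementTree as ET


def language_terms(langmaterial_xml):
    """Return (type, value) pairs, one per MODS <languageTerm> to create."""
    langmaterial = ET.fromstring(langmaterial_xml)
    languages = langmaterial.findall("language")
    if not languages:
        # No <language> sub-element: the element's own text becomes a text term.
        return [("text", " ".join((langmaterial.text or "").split()))]
    terms = []
    for language in languages:
        text = (language.text or "").strip()
        code = language.get("langcode")
        if text:
            terms.append(("text", text))  # text wins when both are present
        elif code:
            terms.append(("code", code))  # authority would be "iso639-2b"
        # scriptcode is ignored in both cases
    return terms


with_text = language_terms(
    '<langmaterial label="Language(s):">The materials are in '
    '<language langcode="eng" scriptcode="Latn">English</language>.</langmaterial>'
)
code_only = language_terms('<langmaterial><language langcode="eng"/></langmaterial>')
```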

No corresponding EAD element

<physicalDescription>
   <digitalOrigin>

Add a "born digital" indication only for the born digital items in the collection, else omit.

<mods:physicalDescription>
      <mods:digitalOrigin>born digital</mods:digitalOrigin>
</mods:physicalDescription>

<physdesc>
   <extent>

<physicalDescription>
   <extent>

• Each EAD <extent> subelement will become a MODS/extent element
• If EAD <physdesc> has no sub-elements, map its #PCDATA into MODS/extent

<physdesc>
      <extent>1.0 computer media</extent>
      <extent>hard drive</extent>
</physdesc>

<mods:physicalDescription>
      <mods:extent>1.0 computer media</mods:extent>
      <mods:extent>hard drive</mods:extent>
</mods:physicalDescription>
-----
<physdesc label="Physical Characteristics">This collection consists of ca. 3,200 items</physdesc>

<mods:physicalDescription>
      <mods:extent>This collection consists of ca. 3,200 items</mods:extent>
</mods:physicalDescription>
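
A short sketch of the two <physdesc> cases above. The function name extents is hypothetical; it returns the strings that would each become a MODS <extent>.

```python
import xml.etree.ElementTree as ET


def extents(physdesc_xml):
    """Return one string per MODS <extent> to create from an EAD <physdesc>."""
    physdesc = ET.fromstring(physdesc_xml)
    if len(physdesc) == 0:
        # No sub-elements: the element's own text becomes the single extent.
        return [" ".join((physdesc.text or "").split())]
    return [" ".join((e.text or "").split()) for e in physdesc.findall("extent")]


from_children = extents(
    "<physdesc><extent>1.0 computer media</extent><extent>hard drive</extent></physdesc>"
)
from_text = extents(
    '<physdesc label="Physical Characteristics">'
    "This collection consists of ca. 3,200 items</physdesc>"
)
```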

<abstract> or <scopecontent>

<abstract>

Map EAD label attribute to MODS displayLabel attribute

Note: the DLF guidelines suggest that the first paragraph of <scopecontent> could be used as an abstract, but they do not otherwise map <scopecontent>. We recommend a simple, clean mapping of each as described here.

<abstract label="Summary:">The papers consist of correspondence, subject files, and writings, primarily documenting the professional career and personal life of James Tobin as an economist and educator.</abstract>

<mods:abstract displayLabel="Summary:">The papers consist of correspondence, subject files, and writings, primarily documenting the professional career and personal life of James Tobin as an economist and educator.</mods:abstract>

<descgrp>
<bioghist>
<acqinfo>
<prefercite>
<userestrict>
<processinfo>
<note>

<note>

• Requires embedded element conversion
• Ignore a wrapping <descgrp type="admininfo">
• They should be converted to notes in the order encountered in the EAD.
• A leading <head> value should map to the MODS displayLabel attribute, else provide a default displayLabel as follows:

  • <bioghist> = "Biography"
  • <acqinfo> = "Acquisition Information"
  • <prefercite> = "Preferred Citation"
  • <userestrict> = "Use restrictions"
  • <processinfo> = "Processing information"
  • <note> = "Note"

<prefercite id="ref6">
    <head>Cite As</head>
    <p>James Tobin Papers. Manuscripts and Archives, Yale University Library.</p>
</prefercite>

<mods:note displayLabel="Cite As">James Tobin Papers. Manuscripts and Archives, Yale University Library.</mods:note>
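
The displayLabel rule for notes could be sketched as below. The function name note_from_ead is hypothetical. The sketch takes the first <head> child as the leading head, flattens any embedded markup in the remaining content, and assumes pretty-printed input (adjacent <p> elements separated by whitespace).

```python
import xml.etree.ElementTree as ET

DEFAULT_LABELS = {
    "bioghist": "Biography",
    "acqinfo": "Acquisition Information",
    "prefercite": "Preferred Citation",
    "userestrict": "Use restrictions",
    "processinfo": "Processing information",
    "note": "Note",
}


def note_from_ead(element_xml):
    """Return (displayLabel, note_text) for one EAD note-like element."""
    element = ET.fromstring(element_xml)
    head = element.find("head")
    if head is not None and (head.text or "").strip():
        label = head.text.strip()
    else:
        label = DEFAULT_LABELS[element.tag]
    # Gather all text except the <head>, keeping document order.
    pieces = [element.text or ""]
    for child in element:
        if child.tag != "head":
            pieces.append("".join(child.itertext()))
        pieces.append(child.tail or "")
    return label, " ".join("".join(pieces).split())


cite = note_from_ead("""<prefercite id="ref6">
    <head>Cite As</head>
    <p>James Tobin Papers. Manuscripts and Archives, Yale University Library.</p>
</prefercite>""")
bio = note_from_ead("<bioghist><p>When five-year-old Stephen Jay Gould ...</p></bioghist>")
```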

<arrangement>

<tableOfContents>

Mapping per DLF guidelines, with default displayLabel of "Arrangement".

<arrangement id="ref206">
     <head>Arrangement note</head>
     <p>The records are arranged in three series: I. Administrative Records,1991-2010. II. Audiovisual Recordings ...</p>
</arrangement>

<tableOfContents displayLabel="Arrangement note">The records are arranged in three series: I. Administrative Records,1991-2010. II. Audiovisual Recordings</tableOfContents>

No corresponding EAD element

<targetAudience>

mapping not applied to sample EADs

 

<odd>

<note>

not found in sample EADs

 

<controlaccess> with
   <corpname>
   <famname>
   <function>
   <genreform>
   <geogname>
   <name>
   <occupation>
   <persname>
   <subject>,
   <title>

<subject> with
   <topic>
   <geographic>
   <temporal>
   <titleInfo>
   <name>
   <genre>
   <hierarchicalGeographic>
   <cartographics>
   <geographicCode>
   <occupation>

Mappings of EAD <controlaccess> subelements to MODS's <subject> subelements:
• EAD <corpname> = MODS <name type="corporate">;
• EAD <famname> = MODS <name type="personal">;
• EAD <function> = MODS <topic> with no @authority attribute on <subject>;
• EAD <genreform> = MODS <genre>;
• EAD <geogname> = MODS <geographic>;
• EAD <name> = MODS <name>;
• EAD <occupation> = MODS <occupation>;
• EAD <persname> = MODS <name type="personal">;
• EAD <subject> = MODS <topic>; and
• EAD <title> = MODS <titleInfo>.

Map EAD source attribute for any <controlaccess> subelement to MODS authority attribute on <subject>.

<controlaccess>
   <persname rules="aacr">Lucas, Arel</persname>
   <corpname rules="dacs" source="ingest">Xanadu Operating Company (XOC)</corpname>
    <subject source="lcsh">Electronic publishing.</subject>
    <genreform source="aat">Videorecordings.</genreform>
    <subject source="lcsh">Word processing.</subject>
</controlaccess>

<mods:subject>
    <mods:name type="personal">
        <mods:namePart>Lucas, Arel</mods:namePart>
    </mods:name>
</mods:subject>
<mods:subject authority="ingest">
    <mods:name type="corporate">
        <mods:namePart>Xanadu Operating Company (XOC)</mods:namePart>
    </mods:name>
</mods:subject>
<mods:subject authority="lcsh">
    <mods:topic>Electronic publishing.</mods:topic>
</mods:subject>
<mods:subject authority="aat">
    <mods:genre>Videorecordings.</mods:genre>
</mods:subject>
<mods:subject authority="lcsh">
    <mods:topic>Word processing.</mods:topic>
</mods:subject>
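
The <controlaccess> mapping table above lends itself to a lookup. The sketch below returns plain tuples rather than building MODS XML; the function name subjects and the tuple layout are hypothetical, and the <function> line in the sample is invented to show that its source is not carried over.

```python
import xml.etree.ElementTree as ET

# EAD <controlaccess> child -> (MODS <subject> child, type attribute or None)
SUBJECT_MAP = {
    "corpname": ("name", "corporate"),
    "famname": ("name", "personal"),
    "function": ("topic", None),
    "genreform": ("genre", None),
    "geogname": ("geographic", None),
    "name": ("name", None),
    "occupation": ("occupation", None),
    "persname": ("name", "personal"),
    "subject": ("topic", None),
    "title": ("titleInfo", None),
}


def subjects(controlaccess_xml):
    """Return (authority, mods_element, type, value) per MODS <subject>, in order."""
    result = []
    for child in ET.fromstring(controlaccess_xml):
        if child.tag not in SUBJECT_MAP:
            continue
        mods_tag, name_type = SUBJECT_MAP[child.tag]
        # Per the mapping notes, <function> never sets an authority on <subject>.
        authority = None if child.tag == "function" else child.get("source")
        result.append((authority, mods_tag, name_type, (child.text or "").strip()))
    return result


mapped = subjects("""<controlaccess>
   <persname rules="aacr">Lucas, Arel</persname>
   <corpname rules="dacs" source="ingest">Xanadu Operating Company (XOC)</corpname>
   <subject source="lcsh">Electronic publishing.</subject>
   <genreform source="aat">Videorecordings.</genreform>
   <function source="local">Publishing</function>
</controlaccess>""")
```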

No corresponding EAD element

<classification>

No mapping in samples

 

No corresponding EAD element

<relatedItem>

No mapping in samples

 

<unitid>

<identifier>

• All mapped to identifier of type=unitid
• EAD label attribute mapped to MODS displayLabel attribute

<unitid>M1437</unitid>

<identifier type="unitid">M1437</identifier>
-----
<unitid label="Call Number:" countrycode="US" repositorycode="US-CtY">MS 1746</unitid>

<identifier type="unitid" displayLabel="Call Number:">MS 1746</identifier>

No corresponding EAD element

<location><url>

No candidate sample data, though conversions could provide useful additions for born digital materials

 

<accessrestrict>

<accessCondition>

• Requires embedded element conversion
• Map <head> subelement to MODS attribute displayLabel
• Apply attribute type="restrictionOnAccess"

<accessrestrict id="ref5713">
    <head>Access to Collection</head>
    <p>Open for research. Audio-visual materials are not available in original format...</p>
</accessrestrict>

<accessCondition type="restrictionOnAccess" displayLabel="Access to Collection">Open for research. Audio-visual materials are not available in original format...</accessCondition>

<userestrict>

<accessCondition>

• Requires embedded element conversion
• Map <head> subelement to MODS attribute displayLabel
• Apply attribute type="useAndReproduction"

<userestrict id="ref5">
    <head>Publication Rights</head>
    <p>Property rights reside with the repository. Literary rights reside with the creators of the document....</p>
</userestrict>

<accessCondition type="useAndReproduction" displayLabel="Publication Rights">Property rights reside with the repository. Literary rights reside with the creators of the document....</accessCondition>
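
The <accessrestrict> and <userestrict> rules share one shape, so a single helper could cover both. This is a sketch only: the function name access_condition is hypothetical, the type values are taken as written on this page, and only <p> children are used for the text.

```python
import xml.etree.ElementTree as ET

CONDITION_TYPES = {
    "accessrestrict": "restrictionOnAccess",
    "userestrict": "useAndReproduction",
}


def access_condition(element_xml):
    """Return (attributes, text) for the MODS <accessCondition> to create."""
    element = ET.fromstring(element_xml)
    attrs = {"type": CONDITION_TYPES[element.tag]}
    head = element.find("head")
    if head is not None and (head.text or "").strip():
        attrs["displayLabel"] = head.text.strip()
    # Flatten each paragraph (embedded element conversion) and join them.
    text = " ".join(
        " ".join("".join(p.itertext()).split()) for p in element.findall("p")
    )
    return attrs, text


access = access_condition("""<accessrestrict id="ref5713">
    <head>Access to Collection</head>
    <p>Open for research. Audio-visual materials are not available in original format...</p>
</accessrestrict>""")
```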