
This page documents a first attempt at mapping the National Library of Wales newspaper content to PCDM and IIIF. A diagram of the Newspaper in PCDM is below:



  • Title: a Newspaper Title that has a record in our MARC catalogue.
  • Phase: a physical unit of handling (for digitisation), usually a set of issues physically bound together. We use this information to manage batches for digitisation; it is not displayed to users.
  • Issue: an issue of a Newspaper, with an ISO issue date as metadata.
  • Article: a newspaper article. An article can span Pages and can occupy multiple columns on one Page, and a single Page can hold many articles.
  • Page: a physical page of a Newspaper, and also the container for the scanned image of that page.
  • Archival Copy: the archival TIFF, held in HSM near-line storage and referenced over HTTP from Fedora.
  • JP2: the reference version of the page, currently stored as a managed datastream in Fedora.
  • ALTO: OCR text, article boundaries and coordinate information generated from the TIFF.
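
As a rough illustration of the list above, the hierarchy can be sketched as PCDM objects and files linked with pcdm:hasMember and pcdm:hasFile. This is only a sketch under assumptions: the base URI and resource names are hypothetical, the nesting (Title > Issue > Page, with the Article as a member of the Issue) is one possible reading rather than a restatement of the diagram, and Phase is left out because its place in the graph is not described here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

PCDM = "http://pcdm.org/models#"
BASE = "http://example.org/newspapers/"  # hypothetical base URI


@dataclass
class Node:
    uri: str
    pcdm_class: str  # "Object" or "File"
    members: List["Node"] = field(default_factory=list)
    files: List["Node"] = field(default_factory=list)


def triples(node: Node) -> Iterator[Tuple[str, str, str]]:
    """Walk the tree and yield (subject, predicate, object) triples."""
    yield (node.uri, "rdf:type", PCDM + node.pcdm_class)
    for member in node.members:
        yield (node.uri, PCDM + "hasMember", member.uri)
        yield from triples(member)
    for file_node in node.files:
        yield (node.uri, PCDM + "hasFile", file_node.uri)
        yield from triples(file_node)


# Files attached to a Page (names are illustrative only).
tiff = Node(BASE + "page1/archival.tif", "File")
jp2 = Node(BASE + "page1/reference.jp2", "File")
alto = Node(BASE + "page1/ocr.xml", "File")

page = Node(BASE + "page1", "Object", files=[tiff, jp2, alto])
article = Node(BASE + "article1", "Object")  # assumed to be a member of the Issue
issue = Node(BASE + "issue1", "Object", members=[page, article])
title = Node(BASE + "title1", "Object", members=[issue])

graph = list(triples(title))
for s, p, o in graph:
    print(s, p, o)
```

Plain tuples are used instead of an RDF library so the sketch has no dependencies; the same triples could be loaded into any triple store.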


  • We've added IIIF classes and relations into the above diagram to specify the OCR text annotations on a page and the related article metadata.
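
To make the IIIF side more concrete, here is a minimal sketch of how an OCR text annotation on a page region and an article range might look as JSON. It assumes IIIF Presentation API 2.x conventions (oa:Annotation with an "on" target, sc:Range with a "canvases" list); all URIs, labels, coordinates and text are hypothetical and are not taken from the diagram.

```python
import json

BASE = "https://example.org/iiif/issue1"  # hypothetical base URI


def canvas_region(page: str, x: int, y: int, w: int, h: int) -> str:
    """Return a canvas URI with an xywh fragment selecting a region."""
    return f"{BASE}/canvas/{page}#xywh={x},{y},{w},{h}"


def text_annotation(anno_id: str, text: str, target: str) -> dict:
    """Build an OCR text annotation that targets a canvas region."""
    return {
        "@id": f"{BASE}/annotation/{anno_id}",
        "@type": "oa:Annotation",
        "motivation": "sc:painting",
        "resource": {
            "@type": "cnt:ContentAsText",
            "format": "text/plain",
            "chars": text,
        },
        "on": target,
    }


def article_range(range_id: str, label: str, targets: list) -> dict:
    """Build a range listing the canvas regions an article occupies."""
    return {
        "@id": f"{BASE}/range/{range_id}",
        "@type": "sc:Range",
        "label": label,
        "canvases": targets,
    }


# An article that starts on page 1 and continues on page 2.
part1 = canvas_region("p1", 0, 0, 750, 300)
part2 = canvas_region("p2", 0, 0, 750, 500)

annotation = text_annotation("a1", "First lines of the article...", part1)
article = article_range("r1", "Example article", [part1, part2])

print(json.dumps({"annotation": annotation, "range": article}, indent=2))
```

In this sketch the annotation and the range are related only by pointing at the same canvas region; nothing links the annotation to the range directly.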


  • The Portland Common Data Model specifies that the rdfs:label on a File is repeatable. We could only think of a File having a single filename; is there an example where two labels for a File might be required?
  • We haven't put it in the diagram, but suppose a Manuscript had two orders: a physical order (the order the physical material is in before scanning) and a logical order (for example, the front covers have been moved from the back to the front). Would you have one object for the Manuscript and two member objects, one for each order? The proxies could then link to the Order objects.
  • This is probably a IIIF question, but we struggled to link the text of an article to the article object. We modelled a Newspaper Article as a range (as it can cross pages), but we couldn't see how to add an annotation to a range, as an annotation seemed to be limited to a Canvas.
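
To make the two-order question concrete, the sketch below builds one proxy chain per Order object, using the ore:proxyFor, ore:proxyIn and IANA first/last/next/prev relations from the PCDM ordering extension. It illustrates the question rather than answering it: the Manuscript, Order and page URIs are hypothetical, and treating each order as a member object of the Manuscript is exactly the assumption being asked about.

```python
from typing import List, Tuple

ORE = "http://www.openarchives.org/ore/terms/"
IANA = "http://www.iana.org/assignments/relation/"
PCDM = "http://pcdm.org/models#"
BASE = "http://example.org/manuscript1"  # hypothetical Manuscript URI

Triple = Tuple[str, str, str]


def order_triples(order_uri: str, page_uris: List[str]) -> List[Triple]:
    """Build a proxy chain placing the pages in the given sequence."""
    proxies = [f"{order_uri}/proxy/{i}" for i in range(len(page_uris))]
    out: List[Triple] = []
    for proxy, page in zip(proxies, page_uris):
        out.append((proxy, ORE + "proxyFor", page))
        out.append((proxy, ORE + "proxyIn", order_uri))
    for current, following in zip(proxies, proxies[1:]):
        out.append((current, IANA + "next", following))
        out.append((following, IANA + "prev", current))
    if proxies:
        out.append((order_uri, IANA + "first", proxies[0]))
        out.append((order_uri, IANA + "last", proxies[-1]))
    return out


cover, p1, p2 = (f"{BASE}/cover", f"{BASE}/page1", f"{BASE}/page2")
physical = f"{BASE}/order/physical"
logical = f"{BASE}/order/logical"

# Assumption: each order is modelled as a member object of the Manuscript.
graph: List[Triple] = [
    (BASE, PCDM + "hasMember", physical),
    (BASE, PCDM + "hasMember", logical),
]
graph += order_triples(physical, [p1, p2, cover])  # cover bound at the back
graph += order_triples(logical, [cover, p1, p2])   # cover moved to the front

for triple in graph:
    print(triple)
```

Each page ends up with two proxies, one per order, so the same page object can sit at different positions without being duplicated.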

Comments welcome!
