Introduction
The LD4L ontology team formed very early in the project and has met weekly to discuss a wide range of topics: proposing possible use cases, reviewing the ontology aspects of use cases proposed by other teams, and working out how best to represent the data coming from our three library catalogs and from other internal and external sources.
...
SRSIS Ontology
Because no existing ontology supports the range of entities and relationships that SRSIS will encompass, we will use the Protégé ontology editor to develop a SRSIS ontology framework that reuses appropriate parts of currently available ontologies while introducing extensions and additions where necessary. The framework will be based on and remain compatible with the existing VIVO and emerging research dataset and research resource ontology work. It will be sufficiently expressive to encompass traditional catalog metadata from both Cornell and Harvard, the basic linked data elements described in the Stanford Linked Data Workshop Technology Plan, and the usage and other contextual elements from StackLife. The ontology will capture a series of basic concepts and be structured as modules that draw inspiration from and reuse existing ontology classes and properties where appropriate, such as the Semantic Publishing and Referencing ontologies, and that also support arbitrary system-wide refinement, including local extensions.
Ontology team activities to date
Local vs. global identifiers
The ability to directly link resources in our three libraries and to extend that linking arbitrarily in the future is a central premise of the LD4L project. Local resources and local authorities will continue to need stable identifiers, with the increasing expectation that these identifiers will be URIs directly dereferenceable from anywhere on the Web. These resources may be directly interlinked across institutions as special relationships are discovered, for example between members of similar special collections across two or more libraries. However, we see OCLC's linked data initiatives in general, and stable global identifiers for works in particular, as an essential enabling resource that brings together multiple manifestations of a work into one entity. When local library resources share relationships to these global work identifiers, querying these relationships will reveal many further cross-library linkages that can significantly enrich local searches and collections, either on the fly or through deeper analysis.
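As a rough illustration of how shared global work identifiers can surface cross-library linkages, the following minimal sketch groups local resource URIs by the work URI they point to. All URIs and the function name `cross_library_links` are hypothetical placeholders, not actual LD4L or OCLC identifiers, and a production system would more likely run an equivalent query against a triple store.

```python
from collections import defaultdict

# Hypothetical (local resource URI, global work URI) pairs; every URI is made up.
LINKS = [
    ("http://cornell.example.org/bib/101", "http://works.example.org/work/A"),
    ("http://harvard.example.org/bib/202", "http://works.example.org/work/A"),
    ("http://stanford.example.org/bib/303", "http://works.example.org/work/B"),
    ("http://stanford.example.org/bib/304", "http://works.example.org/work/A"),
]


def cross_library_links(links):
    """Group local resources by the global work identifier they share."""
    by_work = defaultdict(set)
    for local_uri, work_uri in links:
        by_work[work_uri].add(local_uri)
    # Keep only works that more than one local resource points to.
    return {work: sorted(uris) for work, uris in by_work.items() if len(uris) > 1}


shared = cross_library_links(LINKS)
for work, uris in shared.items():
    print(work, uris)
```

Here only the work that several local records point to is reported; the record that is alone in pointing to its work is left out.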
...
- OCLC Linked Data. OCLC Developer Network; accessed 2/8/2015.
- OCLC Releases WorldCat Works as Linked Data. News release, 28 April 2014.
Strings to things
Connecting library metadata with linked data 'in the wild' is a central goal of the LD4L project. To that end, much of the ontology team's work has focused on identifying external authorities, stable identifiers (preferably URIs), and sources and services capable of linking the people, places, organizations, events, and subject headings in library metadata to real-world entities. In some cases existing MARC and non-MARC metadata includes references to local or external authorities, but the vast majority of potentially identifiable entities are represented only as strings of characters. Some of our catalog records have been linked to Library of Congress, OCLC (including the VIAF international authority file), or ISNI identifiers through contracts or internal record enhancement projects, and an unrelated project at Harvard has focused on entity recognition within Encoded Archival Description (EAD) collections. A need to extend from authority file links or a registry of named entities to resolvable URIs compatible with linked data has motivated several LD4L investigations, some focusing on the quality of existing services and others on their efficacy.
- International Standard Name Identifier (ISNI)
- Library of Congress Linked Data Service
- Virtual International Authority File
- ORCID
- Encoded Archival Description and EAD Linking Elements
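The simplest form of the strings-to-things step described above is an exact match of a normalized heading against an authority table. The sketch below assumes a small in-memory table; the `AUTHORITY` data, its URIs, and the `normalize` and `reconcile` helpers are hypothetical and do not represent the API of any of the services listed. Real reconciliation would also need fuzzy matching and disambiguation.

```python
import unicodedata

# Hypothetical authority table: normalized heading -> URI (URIs are made up).
AUTHORITY = {
    "twain, mark, 1835-1910": "http://authority.example.org/names/n001",
    "ithaca (n.y.)": "http://authority.example.org/names/n002",
}


def normalize(heading):
    """Fold Unicode form, case, whitespace, and trailing punctuation."""
    text = unicodedata.normalize("NFKC", heading).casefold()
    text = " ".join(text.split())
    return text.rstrip(" .,;")


def reconcile(headings, authority):
    """Map each heading string to a URI, or None when no exact match exists."""
    return {h: authority.get(normalize(h)) for h in headings}


result = reconcile(
    ["Twain, Mark, 1835-1910.", "  ITHACA (N.Y.) ", "Unknown, Person"],
    AUTHORITY,
)
for heading, uri in result.items():
    print(repr(heading), "->", uri)
```

Headings with no match stay as plain strings (mapped to `None`) rather than being forced onto a guessed identifier.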
Converting MARC to RDF
For MARC metadata, the team has worked with the Library of Congress BIBFRAME converter as a central component in a workflow that may include pre-processing to address variations in local MARC cataloging practice and in most cases will also require post-processing to produce data ready for consumption and interoperability with other linked data on the Web. While the conversion to BIBFRAME of some 30 record types has been explored in concert with technical services staff at our three libraries, the ontology team has focused primarily on the availability and representation of data pertinent to the LD4L use cases rather than on analyzing converter output across the board to ascertain completeness and correctness.
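The three-stage shape of this workflow (pre-process, convert, post-process) can be sketched as a simple function pipeline. Everything below is an illustrative assumption: the `convert` function is a toy stand-in and not the Library of Congress converter, the `marc:` predicate prefix is a placeholder rather than BIBFRAME vocabulary, and the record is a plain dictionary keyed by MARC tag.

```python
def preprocess(record):
    """Hypothetical local cleanup: strip whitespace from every field value."""
    return {tag: value.strip() for tag, value in record.items()}


def convert(record):
    """Toy stand-in for a converter: one (subject, predicate, object) triple per field."""
    subject = "http://library.example.org/bib/" + record["001"]
    return [
        (subject, "marc:" + tag, value)
        for tag, value in record.items()
        if tag != "001"
    ]


def postprocess(triples):
    """Hypothetical cleanup: drop triples whose object is empty."""
    return [t for t in triples if t[2]]


def run_workflow(record):
    """Chain the three stages in order."""
    return postprocess(convert(preprocess(record)))


triples = run_workflow(
    {"001": " 12345 ", "245": " Adventures of Huckleberry Finn ", "260": "  "}
)
for triple in triples:
    print(triple)
```

Keeping the stages as separate functions lets local pre-processing rules vary per library without touching the conversion step.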
...
- Common Ground: Exploring Compatibilities Between the Linked Data Models of the Library of Congress and OCLC. Jean Godby and Ray Denenberg. This provides a high-level comparison between the LoC BIBFRAME approach and the work that OCLC has been doing on expressing bibliographic metadata in Schema.org.
- The Relationship between BIBFRAME and OCLC's Linked-Data Model of Bibliographic Description: A Working Paper. Jean Godby, Senior Research Scientist, OCLC Research, September 2013.
- Bibliographic Framework Initiative
- Technical site for the Bibliographic Framework Initiative (bibframe.org)
- BIBFRAME primer document (PDF)
- BIBFRAME master RDF file (December 10, 2014)
Addressing complexity
Several levels of complexity may legitimately exist in parallel and be utilized based on the availability of data or the goals of an application. This choice can be seen in the PROV-O ontology, where direct object properties have been paired with more complex options involving intermediate nodes that add temporal or role information. The related PAV (Provenance, Attribution, and Versioning) ontology offers a simpler set of classes and properties sufficient for many applications requiring only simple attribution. Application software can also often mask a more complex underlying data model, and in many cases it may be preferable in production contexts to separate logging and provenance information from user-facing applications entirely.
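To make the contrast concrete, the sketch below writes the same attribution twice as plain (subject, predicate, object) tuples: once with the direct PROV-O property `prov:wasAttributedTo`, and once with the qualified pattern, where an intermediate `prov:Attribution` node carries the role. The PROV-O terms are from the W3C vocabulary; the entity, agent, role, and node URIs and the two helper functions are hypothetical examples.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_attribution(entity, agent):
    """Simple form: a single triple linking the entity to the agent."""
    return [(entity, PROV + "wasAttributedTo", agent)]


def qualified_attribution(entity, agent, role, node):
    """Qualified form: an intermediate node holds the agent and the role."""
    return [
        (entity, PROV + "qualifiedAttribution", node),
        (node, RDF_TYPE, PROV + "Attribution"),
        (node, PROV + "agent", agent),
        (node, PROV + "hadRole", role),
    ]


# Hypothetical URIs for illustration only.
ENTITY = "http://library.example.org/annotation/1"
AGENT = "http://library.example.org/person/42"
ROLE = "http://library.example.org/role/reviewer"
NODE = "http://library.example.org/attribution/1"

simple = direct_attribution(ENTITY, AGENT)
detailed = qualified_attribution(ENTITY, AGENT, ROLE, NODE)
print(len(simple), "triple vs.", len(detailed), "triples")
```

An application that needs only "who" can read the direct triple, while one that needs "in what role" follows the intermediate node; both forms can be present in the same data.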
...
- David Weinberger: "A Good, Dumb Way to Learn from Libraries" from the Chronicle of Higher Education, October 7, 2014. This provides motivation and explanation for a simple usage metric that protects user privacy.
Expressing bibliographic metadata to support discovery
The LD4L Use Cases largely target ways to supplement traditional library catalog metadata, whether through linkages to external identities and resources, connecting catalog records with other digital collections, or adding usage metrics and annotations. Some use cases suggest new functionality by leveraging this new "library graph" in services that go beyond text-based matching of search terms to suggest deeper connections and even loop back from externally linked entities to additional local resources or related resources in other libraries.
...