...

Ontology team activities to date

Local vs. global identifiers

The ability to directly link resources in our three libraries, and to extend that linking arbitrarily in the future, is a central premise of the LD4L project. Local resources and local authorities will continue to need stable identifiers, with the increasing expectation that these identifiers will be URIs directly dereferenceable from anywhere on the Web. These resources may be directly interlinked across institutions as special relationships are discovered, for example between members of similar special collections across two or more libraries. However, we see OCLC's linked data initiatives in general, and stable global identifiers for works in particular, as an essential enabling resource that brings together multiple manifestations of a work into one entity. When local library resources share relationships to these global work identifiers, querying these relationships will reveal many further cross-library linkages that can significantly enrich local searches and collections, either on the fly or through deeper analysis.

Realizing these goals, however, will require scalable, publicly accessible services for discovering works identifiers from locally held WorldCat bibliographic identifiers. These services must also support mining the full content accessible through works records, including their embedded linkages to other entities maintained at OCLC or through external authorities.
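As a rough illustration of how shared work identifiers can surface cross-library linkages, the sketch below groups local resource URIs by the global work URI they point to and keeps only works held by more than one institution. Every URI, the sample data, and the function name `cross_library_links` are hypothetical placeholders; nothing here reflects an actual OCLC or LD4L service.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical (local resource URI, global work URI) pairs.
# All URIs are placeholders, not real identifiers.
links = [
    ("http://library-a.example.org/bib/101", "http://works.example.org/work/1"),
    ("http://library-b.example.org/bib/202", "http://works.example.org/work/1"),
    ("http://library-c.example.org/bib/303", "http://works.example.org/work/2"),
    ("http://library-a.example.org/bib/104", "http://works.example.org/work/2"),
    ("http://library-b.example.org/bib/205", "http://works.example.org/work/3"),
]


def cross_library_links(pairs):
    """Group local resources by work and keep works held by more than one host."""
    by_work = defaultdict(set)
    for local_uri, work_uri in pairs:
        by_work[work_uri].add(local_uri)

    shared = {}
    for work_uri, local_uris in by_work.items():
        # Treat the host name of the local URI as the holding institution.
        hosts = {urlparse(uri).netloc for uri in local_uris}
        if len(hosts) > 1:
            shared[work_uri] = sorted(local_uris)
    return shared


shared_works = cross_library_links(links)
```

In practice the pairs would come from the discovery services described above rather than from a hard-coded list, and the grouping would more likely be expressed as a query over RDF data.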

Strings to things

Connecting library metadata with linked data 'in the wild' is a central goal of the LD4L project. To that end, much of the ontology team's work has focused on identifying external authorities, stable identifiers (preferably URIs), and sources and services capable of linking the people, places, organizations, events, and subject headings in library metadata to real world entities. In some cases existing metadata, both MARC and non-MARC, includes references to local or external authorities, but the vast majority of potentially identifiable entities are represented only as strings of characters. Some of our catalog records have been linked to Library of Congress, OCLC (including the VIAF international authority file), or ISNI identifiers through contracts or internal record enhancement projects, and an unrelated project at Harvard has focused on entity recognition within Encoded Archival Description (EAD) collections. The need to move from authority file links or a registry of named entities to resolvable URIs compatible with linked data has motivated several LD4L investigations, some focusing on the quality and others on the efficacy of existing services.
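The simplest form of the 'strings to things' step can be sketched as a normalized exact-match lookup from a heading string to a URI. This is only a minimal sketch under stated assumptions: the `AUTHORITY` table, its URIs, and the `normalize` and `resolve` helpers are hypothetical, and the services investigated by the project are not modeled here.

```python
import unicodedata

# Hypothetical authority table: normalized heading -> URI.
# The URIs are placeholders, not real authority identifiers.
AUTHORITY = {
    "twain, mark": "http://authority.example.org/person/1",
    "bronte, charlotte": "http://authority.example.org/person/2",
}


def normalize(heading):
    """Lowercase, strip accents, collapse whitespace, drop trailing punctuation."""
    decomposed = unicodedata.normalize("NFKD", heading)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    collapsed = " ".join(no_accents.lower().split())
    return collapsed.rstrip(".,;: ")


def resolve(heading, authority=AUTHORITY):
    """Return the URI for a heading string, or None when there is no exact match."""
    return authority.get(normalize(heading))
```

For example, `resolve("Brontë,  Charlotte.")` finds the second entry despite the accent, extra space, and trailing period, while an unlisted heading returns `None`. Exact matching of this kind leaves the harder cases (variant name forms, appended dates, ambiguous names) unresolved, which is where external reconciliation services and quality review come in.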

...

Converting MARC to RDF

For MARC metadata, the team has worked with the Library of Congress BIBFRAME converter as a central component in a workflow that may include pre-processing to address variations in local MARC cataloging practice and that in most cases will also require post-processing to produce data ready for consumption and interoperable with other linked data on the Web. While the conversion to BIBFRAME of some 30 record types has been explored in concert with technical services staff at our three libraries, the ontology team has focused primarily on the availability and representation of data pertinent to the LD4L use cases rather than on analyzing converter output across the board to ascertain completeness and correctness.

The BIBFRAME classes and properties themselves, as well as some of their definitions, remain under active discussion on the BIBFRAME mailing list (archives) and in other venues. With our project's strong focus on linking through to real world entities, we remain flexible in our interpretation and application of the BIBFRAME ontology, in some cases electing to use properties and/or classes in an LD4L namespace until consensus is reached through later releases, pilot projects scheduled for 2015, and/or community practice. Fundamental questions will persist about the distinction between information and real world entities, and about the conflict between a desire to retain all the information encoded in MARC records and a desire to let bibliographic metadata interoperate more freely with other Web data.

...