(*DRAFT*)

This is a guide for LD4P2 Partners (cohort and PCC affiliates included) on how to request a new dataset for QA so that corresponding lookups can be added to Sinopia (the LD4P2-supported BIBFRAME editor).

Email Steven Folsom (sf433 @ cornell dot edu) with related questions or comments on how to improve this guide.

Process for Prioritizing Requests (Coming Soon)

Please note that all requests for new data sources in QA will be prioritized by the LD4P2 project. Due to time restrictions, there is no guarantee that every requested dataset will be added to QA during the lifetime of the LD4P2 grant. Even so, it is useful to know which datasets the community would want in such a lookup service.

Creating a request in the form of a GitHub issue (see Step 5) will allow the request to be prioritized and tracked.

Steps

1.) Make sure the dataset is not already in QA

Please consult the LD4P2 QA Authority Support Plan to confirm that the dataset is not already supported through QA, and that it has not been set aside from consideration because it is already supported by the type-ahead searching available in the BIBFRAME editor.

2.) Identify the new dataset

Gather information about how to acquire data dumps and/or API access and the dataset's homepage URL.

N.B. You will be asked for this information when making a formal request as a GitHub issue, see Step 5 below.

3.) Decide how contextual information should be used

QA can provide contextual information about an entity during the lookup experience. To do so, decisions need to be made about how to index the RDF descriptions of entities in the dataset.

    a.) Using this spreadsheet, add a tab for the new dataset. For each new tab, please use the following column headers and value guidelines (see the existing LCGT tab in the spreadsheet for an example).

N.B. You will be asked to confirm this has been done when making a formal request as a GitHub issue, see Step 5 below. 


EntityType: URI for the class of entity in the lookup

PropertyPath: URI/s for the property or property path to get to the information to be indexed in QA

Search: Use an 'X' to mark if this data should be used to search against. N.B. some data is important to display to the cataloger, but perhaps would create messy results if searched against in a lookup environment, e.g. some notes are administrative in nature.

Display: Use an 'X' to mark if the value should be displayed. Include a label for the field. The label may simply be the property name in the property path column or you may decide another term is more appropriate.

Ranking / Notes: If applicable provide notes on whether a particular property path should weigh heavier on the search rankings than others.


4.) Add test parameters using a YAML file

To make sure the QA search behavior (recall and relevancy) meets expectations, QA uses YAML to define test parameters. For a particular search string, these parameters declare which resource (identified by a URI) the results should include and the maximum position in the results at which that resource should appear.

For example, when searching 'Casebooks' against LCGFT, http://id.loc.gov/authorities/genreForms/gf2011026115 should be in the top 10 results.
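The logic behind such a test can be sketched in a few lines of Python. This is only an illustration of what "expected URI within a maximum position" means. It is not the QA test file format, which is described in Writing Tests for an Authority. The class name, field names, and the sample result list are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AccuracyScenario:
    """Hypothetical stand-in for one accuracy test entry (not the QA YAML format)."""
    query: str         # text string searched
    subject_uri: str   # resource expected to appear in the results
    max_position: int  # lowest acceptable rank, counting from 1


def passes(scenario: AccuracyScenario, result_uris: Sequence[str]) -> bool:
    """Return True if the expected URI is ranked at or above max_position."""
    return scenario.subject_uri in result_uris[: scenario.max_position]


casebooks = AccuracyScenario(
    query="Casebooks",
    subject_uri="http://id.loc.gov/authorities/genreForms/gf2011026115",
    max_position=10,
)

# Made-up result list for illustration; a real test would query the authority.
sample_results = [
    "http://example.org/placeholder/1",
    "http://id.loc.gov/authorities/genreForms/gf2011026115",
    "http://example.org/placeholder/2",
]

print(passes(casebooks, sample_results))
```

When choosing a maximum position for your own dataset, a lower number makes the test stricter about relevancy, while a higher number mainly checks recall.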

N.B. You will be asked to confirm this has been done when making a formal request as a GitHub issue, see Step 5 below.


    a.) Using the Accuracy Test portion of Writing Tests for an Authority, create a YAML file for your dataset using a text editor. Save the file with the file extension .yml and upload it to https://github.com/LD4P/qa_server/tree/master/lib/generators/qa_server/templates/config/authorities/linked_data/scenarios.


    b.) Alternatively, from the same page https://github.com/LD4P/qa_server/tree/master/lib/generators/qa_server/templates/config/authorities/linked_data/scenarios, create the YAML file using the GitHub "Create new" file feature.

5.) Create an issue to formally request the new dataset

    a.) From https://github.com/LD4P/qa_server/issues/new/choose, create an issue by clicking on "Get started" for "Request a New Dataset for QA". You will be asked to provide or confirm the following:

    [ ] Identify the data source: (Include the Data Source Name and its homepage URL)
    [ ] Add a new tab and indexing information for the data source to the following spreadsheet: https://docs.google.com/spreadsheets/d/1rPvEoP9iYNkxJ0eWC8gXe3ci7e6mhW0da59xkGhadi0/edit?usp=sharing.
    [ ] Add a YAML test file to https://github.com/cul-it/qa_server/tree/master/config/authorities/linked_data/scenarios; please provide here a link to the YAML file related to this request.
