(*DRAFT*)

This guide explains how LD4P2 Partners (cohort and PCC affiliates included) can request a new dataset for QA so that corresponding lookups can be added to Sinopia (the LD4P2-supported BIBFRAME editor).

Email Steven Folsom (sf433 @ cornell dot edu) with related questions or comments on how to improve this guide.

Process for Prioritizing Requests (Coming Soon)

Please note that all requests for new data sources in QA will be prioritized by the LD4P2 project. Because time is limited, there is no guarantee that every requested dataset will be added to QA during the lifetime of the LD4P2 grant. Requests are still valuable even if they cannot be fulfilled, because they show which datasets the community would find useful in such a lookup service.

Steps

1.) Make sure the dataset is not already in QA

Go to the Authorities tab on http://elr37-dev.library.cornell.edu/ (a temporary site until http://lookup.ld4l.org/ is live) to see whether the dataset of interest is already available in QA. Please also consult the summary page (NEED LINK to page E. Lynette Rayle is creating) of datasets supported by LD4P2 through QA versus those supported by the type-ahead searching already available in the BIBFRAME editor.

2.) Identify the new dataset

Gather the dataset's homepage URL and information about how to acquire data dumps and/or API access.

You will be asked for this information when making a formal request as a GitHub issue (see Step 5 below).

3.) Decide how contextual information should be presented in the lookup service

QA can provide contextual information about an entity during the lookup experience. To do so, decisions need to be made about how to index the dataset's RDF. In the spreadsheet linked in Step 5 below, add a tab for the new dataset. In the new tab, please use the following column headers and value guidelines (N.B. see the existing LCGFT tab as an example).

You will be asked to confirm this has been done when making a formal request as a GitHub issue (see Step 5 below).


Column header: value guideline

Entity Type: URI for the class of entity in the lookup.

Property Path: URI/s for the property or property path to get to the information to be indexed in QA.

Search: Use an 'X' to mark if this data should be used to search against. N.B. some data is important to display to the cataloger, but perhaps would create messy results if searched against in a lookup environment, e.g. some notes are administrative in nature.

Display: Use an 'X' to mark if the value should be displayed. Include a label for the field. The label may simply be the property name in the Property Path column, or you may decide another term is more appropriate.

Ranking Notes: If applicable, provide notes on whether a particular property path should weigh heavier on the search rankings than others.
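To make the column guidelines concrete, here is a minimal Python sketch that models one spreadsheet row per property path. Everything in it is illustrative: the `IndexingRow` class, its field names, the "X (label)" convention for the Display cell, and the two sample rows are assumptions made for this example, not the contents of any existing tab.

```python
from dataclasses import dataclass


@dataclass
class IndexingRow:
    """Hypothetical model of one row in a dataset's indexing tab."""
    entity_type: str          # URI for the class of entity in the lookup
    property_path: str        # URI/s for the property or property path
    search: bool = False      # True corresponds to an 'X' in the Search column
    display_label: str = ""   # label shown to the cataloger; empty means not displayed
    ranking_notes: str = ""   # optional notes on search-ranking weight

    def as_cells(self):
        """Return the row as spreadsheet cell values, in column order."""
        # Assumed convention: an 'X' followed by the label in parentheses.
        display = f"X ({self.display_label})" if self.display_label else ""
        return [
            self.entity_type,
            self.property_path,
            "X" if self.search else "",
            display,
            self.ranking_notes,
        ]


# Illustrative rows only; the real choices are up to the requester.
rows = [
    IndexingRow(
        entity_type="http://www.loc.gov/mads/rdf/v1#GenreForm",
        property_path="http://www.w3.org/2004/02/skos/core#prefLabel",
        search=True,
        display_label="Preferred label",
        ranking_notes="Should weigh heavier than other property paths",
    ),
    IndexingRow(
        entity_type="http://www.loc.gov/mads/rdf/v1#GenreForm",
        property_path="http://www.w3.org/2004/02/skos/core#note",
        search=False,  # displayed to the cataloger but not searched against
        display_label="Note",
    ),
]

for row in rows:
    print(" | ".join(row.as_cells()))
```

The second row shows the case called out in the Search guideline: a value worth displaying to the cataloger that is deliberately left out of searching.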


4.) Add test parameters using a YAML file

To make sure the QA search behavior (recall and relevancy) is meeting expectations, QA uses YAML to define test parameters. For a particular text string searched using QA, these parameters can declare that the results should include a particular resource (identified by a URI) and the position at which the resource should be found. For example, when searching 'Casebooks' against LCGFT, http://id.loc.gov/authorities/genreForms/gf2011026115 should be in the top 10 results.
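The kind of check such a test parameter describes can be sketched in a few lines of Python. This is only an illustration of the idea, not how QA itself runs its tests: the function name `meets_expectation`, its parameters, and the example.org result URIs are all hypothetical.

```python
def meets_expectation(result_uris, subject_uri, max_position):
    """Return True if subject_uri appears within the first max_position results.

    result_uris is the ordered list of URIs returned by a search;
    positions are counted from 1.
    """
    try:
        position = result_uris.index(subject_uri) + 1
    except ValueError:
        # The expected resource is not in the results at all.
        return False
    return position <= max_position


# Made-up search results for the query 'Casebooks' (placeholder URIs
# except for the expected resource from the example above).
results = [
    "http://example.org/result/1",
    "http://id.loc.gov/authorities/genreForms/gf2011026115",
    "http://example.org/result/3",
]
expected = "http://id.loc.gov/authorities/genreForms/gf2011026115"

print(meets_expectation(results, expected, 10))
```

A test defined this way covers both recall (the resource must be present) and relevancy (it must rank within the stated position).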

You will be asked to confirm this has been done when making a formal request as a GitHub issue (see Step 5 below).

(Definition of YAML Keys coming soon.)

    a.) Using the YAML key definitions above, create a YAML file for your dataset using a text editor. Save the file with the file extension .yml and upload it to https://github.com/LD4P/qa_server/tree/master/lib/generators/qa_server/templates/config/authorities/linked_data/scenarios.

    b.) Alternatively, from the same page, https://github.com/LD4P/qa_server/tree/master/lib/generators/qa_server/templates/config/authorities/linked_data/scenarios, create the YAML file using GitHub's "Create new file" feature.

5.) Create an issue to formally request the new dataset (this will allow the request to be prioritized and tracked)

    a.) From https://github.com/LD4P/qa_server/issues/new/choose, create an issue by clicking "Get started" for "Request a New Dataset for QA". You will be asked to provide/confirm the following:

    [ ] Identify the data source: (Include the Data Source Name and its homepage URL)
    [ ] Add a new tab and indexing information for the data source to the following spreadsheet: https://docs.google.com/spreadsheets/d/1rPvEoP9iYNkxJ0eWC8gXe3ci7e6mhW0da59xkGhadi0/edit?usp=sharing
    [ ] Add a YAML test file to https://github.com/cul-it/qa_server/tree/master/config/authorities/linked_data/scenarios; please provide here a link to the YAML file related to this request.
