The Department of Sponsored Research (DSR) harvest retrieves information from a relational database view, translates it into meaningful RDF, and transfers that RDF into the VIVO model.

The Process

Extracting

The first step in creating the DSR harvest is to obtain a view of the desired data. The data came from several tables and was made available in the form of two tables.

One table provided key information about the grants; the other was centered on the principal investigators (PIs).

Transforming

The library proposed a mapping of the data, taking the available ontology into account. This mapping was then expressed in XSLT. Since the initial harvest and the desired RDF/XML are both XML, XSL transformations seemed the most appropriate tool.
The details of those transformations had to be clear and distinct. The XSL handles data from either of the two tables, and which table a record came from is saved in the data itself. Based on the ontology, a series of rdf:Description elements is created and properly linked.
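The idea of branching on the source table and emitting one rdf:Description per record can be sketched as follows. This is a minimal illustration in Python rather than XSLT, not the harvest's actual stylesheet; the `table`, `contract_number`, and `ufid` field names and the `http://example.org/dsr/` base URI are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/dsr/"  # hypothetical base URI, not the real one

ET.register_namespace("rdf", RDF)


def record_to_description(record):
    """Build one rdf:Description, branching on the table the row came from."""
    table = record["table"]  # assumed field that records the source table
    if table == "grant":
        about = EX + "grant/" + record["contract_number"]
    elif table == "pi":
        about = EX + "person/" + record["ufid"]
    else:
        raise ValueError("unknown source table: %r" % table)
    return ET.Element("{%s}Description" % RDF, {"{%s}about" % RDF: about})


def records_to_rdf(records):
    """Wrap the generated descriptions in a single rdf:RDF document."""
    root = ET.Element("{%s}RDF" % RDF)
    for record in records:
        root.append(record_to_description(record))
    return ET.tostring(root, encoding="unicode")
```

In an XSLT version the same branching would typically be done with one template per table, which is why keeping the table of origin in each record matters.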

Translation

Grants

Grants are mapped to the core type "Grant". Even though the reasoner should then infer that each grant is also a "Relationship" and an "Agreement", the mapping includes entries that set those types explicitly.

PeopleSoft identifies each grant by a Contract Number, and that identifier is carried over into the translation.
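As a rough sketch of what this mapping produces, the snippet below lists the triples for one grant, with all three types asserted explicitly. It assumes the types live in the VIVO core namespace; the `contractNumber` property and the `http://example.org/dsr/` base URI are hypothetical stand-ins for whatever the real mapping uses.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VIVO = "http://vivoweb.org/ontology/core#"  # assumed home of the grant types
EX = "http://example.org/dsr/"  # hypothetical base URI and property namespace


def grant_triples(contract_number):
    """Return (subject, predicate, object) triples for a single grant."""
    uri = EX + "grant/" + contract_number
    # Assert all three types instead of relying on the reasoner to infer them.
    triples = [
        (uri, RDF_TYPE, VIVO + type_name)
        for type_name in ("Grant", "Relationship", "Agreement")
    ]
    # Carry the PeopleSoft Contract Number into the RDF as a literal.
    triples.append((uri, EX + "contractNumber", contract_number))
    return triples
```

Asserting the inferred types is redundant when a reasoner runs, but it keeps the data usable in stores where inference is switched off or delayed.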

People

People are identified by their UFID in order to create the association between the PIs, Co-PIs, and their grants.

PIs and Co-PIs are created as stubs with inferences similar to those of the grants, though they are attributed with "InvestigatorRole" and "ResearcherRole".
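One possible reading of this structure is sketched below: a person stub keyed by UFID, a role node carrying the two role types, and links in both directions between person, role, and grant. The shape of the role node is an assumption, and the `hasRole`, `roleOf`, `roleIn`, and `relatedRole` property names are hypothetical placeholders for the ontology's actual properties.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VIVO = "http://vivoweb.org/ontology/core#"  # assumed home of the role types
EX = "http://example.org/dsr/"  # hypothetical base URI and property namespace


def investigator_triples(ufid, contract_number):
    """Link a person stub to a grant through a role node, in both directions."""
    person = EX + "person/" + ufid
    grant = EX + "grant/" + contract_number
    role = EX + "role/" + ufid + "-" + contract_number
    return [
        (role, RDF_TYPE, VIVO + "InvestigatorRole"),
        (role, RDF_TYPE, VIVO + "ResearcherRole"),
        # Person <-> role, stated from both ends.
        (person, EX + "hasRole", role),
        (role, EX + "roleOf", person),
        # Role <-> grant, stated from both ends.
        (role, EX + "roleIn", grant),
        (grant, EX + "relatedRole", role),
    ]
```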

Scoring

The scoring for DSR data is fairly simple since the data is internal. The only algorithm used is EqualityTest.
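The effect of equality-based scoring can be illustrated with a short Python sketch. It is not the harvester's EqualityTest implementation; it only assumes that an exact match scores 1 and anything else scores 0, and that harvested people are matched to existing ones on UFID. The `ufid` and `uri` dictionary keys are hypothetical.

```python
def equality_score(harvested_value, existing_value):
    """Score 1.0 for an exact match and 0.0 otherwise."""
    return 1.0 if harvested_value == existing_value else 0.0


def match_by_ufid(harvested, existing):
    """Map each harvested URI to the existing URI that has the same UFID."""
    matches = {}
    for new in harvested:
        for old in existing:
            if equality_score(new["ufid"], old["ufid"]) == 1.0:
                matches[new["uri"]] = old["uri"]
                break  # the first exact match wins
    return matches
```

Because the identifiers come from the same institution, an exact comparison is enough here; fuzzy name matching would only be needed for external data.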

Lessons

  • If two objects are meant to be linked, the translation file must generate the link on both objects; otherwise they will not be linked properly.
  • Templates are useful for dealing with multiple related tables.
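The first lesson suggests a simple sanity check on the translated output: for every link, confirm that its inverse was also generated. The sketch below works on plain (subject, predicate, object) tuples; the `INVERSES` table and its property names are hypothetical and would need to be filled in from the ontology actually used.

```python
# Hypothetical inverse-property pairs; replace with the ontology's real ones.
INVERSES = {
    "hasRole": "roleOf",
    "roleOf": "hasRole",
    "roleIn": "relatedRole",
    "relatedRole": "roleIn",
}


def missing_inverse_links(triples):
    """Return the inverse triples that the translation failed to generate."""
    present = set(triples)
    missing = []
    for subject, predicate, obj in triples:
        inverse = INVERSES.get(predicate)
        if inverse is not None and (obj, inverse, subject) not in present:
            missing.append((obj, inverse, subject))
    return missing
```

An empty result means every link listed in the table was emitted from both ends.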