...

This step is performed by the gather.xsl transform, which can be found in its entirety in the example/xslt folder. Figure 5 shows the first portion of this XSLT. We will now comment on the highlighted sections. In what follows, we use [FnHm] to denote Figure n, Highlight m.

gather.xsl Fragment 1 - Figure 5

...

Next we consider the highlights of the second half of the gather.xsl file. This part of the code handles filtering of the source, construction of the output XML, and resolution of as many URIs as possible. URIs are resolved by comparing name information against the elements in Per0.xml in the case of people and against Org0.xml in the case of organizations. Figure 6 shows more of gather.xsl.

gather.xsl Fragment 2 - Figure 6

  • [F6H0] Establish which source rows will pass the rejection criteria filter and start an  EduRecord .
  • [F6H1] Find any and all matching school URIs among the known organizations. Notice that the variable school contains a space-normalized copy of INSTITUTION, which is further shifted to uppercase before comparison with each adjusted organization name. The vfx:adjust function, shown in Figure 7, applies the standard XPath functions normalize-space and upper-case. The variable schoolUri refers to a sequence of zero or more matching organization URIs.
  • [F6H2] Collect and create the required XML elements, applying  normalize-space  to fix any white space issues in the source data. You may choose to add other normalizations at this point.
  • [F6H3] If schoolUri contains a URI, then use the first one; otherwise leave edSchoolUri empty. If schoolUri has more than one item, then there are duplicate entries in Org0.xml. This is not the case for this example. However, steps must be taken to prevent duplicates by properly maintaining organization triples in VIVO.
  • [F6H4] Since we may not find a URI for the school or person, we include that as an empty element along with the name parts and netid of the person who received the degree in the output XML for downstream remediation.
  • [F6H5] In this step we look for a matching person by calling the name matching function  vfx:findMatchingPeople  which will return a URI or the empty string. We will describe this function shortly. Several versions of this function are included in the source code so that the reader can experiment.
  • [F6H6] This shows the end of the gather.xsl transform, which includes a file of auxiliary functions containing the definitions of vfx:adjust, vfx:findMatchingPeople, and other functions. The function vfx:findMatchingPeopleI, not shown here, is much stricter in terms of what can match. It is included in the source code files so that the reader can compare it to the superior alternative vfx:findMatchingPeople.
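
Since the figures are not reproduced here, the following is a minimal sketch of the organization match in [F6H1] and the first-URI selection in [F6H3]. It is not the actual gather.xsl. The vfx namespace URI, the row, org, uri, and edSchoolName element names, and the orgFile parameter are all hypothetical, chosen only for illustration. The names INSTITUTION, name, EduRecord, edSchoolUri, schoolUri, and vfx:adjust come from the text above. For brevity the sketch folds both the space normalization and the uppercase shift of INSTITUTION into a single vfx:adjust call.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:vfx="http://example.org/vfx"
    exclude-result-prefixes="xs vfx">
  <!-- The vfx namespace URI above is a placeholder, not the real one. -->

  <!-- Assumed shape of the known-organizations file (hypothetical):
       <Orgs><org><name>...</name><uri>...</uri></org>...</Orgs> -->
  <xsl:param name="orgFile" as="xs:string" select="'Org0.xml'"/>
  <xsl:variable name="orgs" select="document($orgFile)//org"/>

  <!-- Normalize white space, then shift to upper case. -->
  <xsl:function name="vfx:adjust" as="xs:string">
    <xsl:param name="s" as="xs:string?"/>
    <xsl:sequence select="upper-case(normalize-space($s))"/>
  </xsl:function>

  <!-- "row" is a hypothetical name for one source record. -->
  <xsl:template match="row">
    <xsl:variable name="school" select="vfx:adjust(INSTITUTION)"/>

    <!-- Sequence of zero or more URIs of organizations whose
         adjusted name equals the adjusted institution name. -->
    <xsl:variable name="schoolUri" as="xs:string*"
        select="$orgs[vfx:adjust(name) = $school]/uri/string()"/>

    <EduRecord>
      <!-- edSchoolName is a hypothetical element name. -->
      <edSchoolName>
        <xsl:value-of select="normalize-space(INSTITUTION)"/>
      </edSchoolName>
      <!-- Use the first match; an empty sequence yields an empty element. -->
      <edSchoolUri>
        <xsl:value-of select="$schoolUri[1]"/>
      </edSchoolUri>
    </EduRecord>
  </xsl:template>

</xsl:stylesheet>
```

Declaring schoolUri as xs:string* keeps the zero, one, and many cases explicit. A count($schoolUri) greater than 1 would signal the duplicate entries in Org0.xml mentioned in [F6H3].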

auxfuncs.xsl Fragment 1 - Figure 7

...

The person name finding function illustrated in Figure 8 is in some ways just an elaborate version of the simpler organization match XPath expression described above in [F6H1]. Notice that the name part matching function does not contain the heuristic middle initial weakness described above.

 

auxfuncs.xsl Fragment 2 - Figure 8

...