Create New Persons and Organizations
The transform for organizations, makeUROs.xsl, is much simpler than the one for persons, so in this section we describe makeURPs.xsl. As mentioned previously, in the case of persons we have name parts and a uniquely assigned, but possibly missing, source NETID for matching. We also want URIs to be assigned uniquely even when a set of records matches exactly on name parts but the NETID is missing from one or more of those records. Indeed, it is quite possible for several URPs in our source to have exactly the same name parts, character for character, but different NETIDs. In our example there are at least three (and possibly four) distinct people named 'Arthur R Fuller', and multiple EduRecords have to be assigned to the correct person. Misattribution should be rare, although it is unavoidable in an automated system when there is no sure way to distinguish between different people. Thus, when no NETID is associated with an EduRecord, we add a 'weak attribution' triple to the RDF that we generate. Our main purpose here is to create a file of new person records, in the PER0.xml style, that specifies the name parts, the URI and, where available, the NETID. These records will be used in a later step to fill in missing URIs in ED0.xml.
...
- [F11H0] This is the same grouping method employed in the Count Step. The variable cgCounts holds a set of counts, each of which is the number of distinct people sharing a name.
- [F11H1] The call to the recursive named template cumulativeSum produces a sequence containing the 0-based cumulative sums of the counts in cgCounts. These are the offsets into the list of 11 URIs that we need in order to assign URIs to the new people. The variable cumulativeCgCounts contains the cumulative sum sequence. The recursive named template can be found in Appendix D and in the source file makeURPs.xsl.
...
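The offset computation behind cumulativeCgCounts can be illustrated outside XSLT. The following is a minimal sketch in Python, not the template from Appendix D: the function names cumulative_sum and assign_uris, the second sample name and the placeholder URIs are all assumptions made for illustration. It shows how 0-based cumulative sums of the per-name counts act as offsets into a flat list of fresh URIs, so that each name group receives as many distinct URIs as it has distinct people.

```python
def cumulative_sum(counts, running=0):
    """Recursively build the 0-based cumulative sums of counts.

    For counts [c0, c1, c2] the result is [0, c0, c0 + c1]: one offset
    per count, each being the total of all counts that precede it.
    """
    if not counts:
        return []
    # Emit the running total, then recurse on the remaining counts.
    return [running] + cumulative_sum(counts[1:], running + counts[0])


def assign_uris(names, counts, uris):
    """Give each name a slice of uris: counts[i] URIs starting at offset i.

    names, counts and uris are hypothetical stand-ins for the grouped
    name parts, cgCounts and the list of fresh URIs, respectively.
    """
    offsets = cumulative_sum(counts)
    return {
        name: uris[offset:offset + count]
        for name, offset, count in zip(names, offsets, counts)
    }


# Hypothetical data: three distinct people share one name, one has another.
names = ["Arthur R Fuller", "Jane Q Doe"]
counts = [3, 1]
uris = ["uri-0", "uri-1", "uri-2", "uri-3"]  # placeholder URIs

assignment = assign_uris(names, counts, uris)
```

With these sample values the offsets are 0 and 3, so the first name receives the first three placeholder URIs and the second name receives the fourth. The recursion passes the running total down as a parameter, which is one natural way to express a cumulative sum in a language without mutable variables; whether the Appendix D template is written the same way is not assumed here.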
The files NewPers.xml and NewOrgs.xml are created by applying the transforms makeURPs.xsl and makeUROs.xsl successively.