(See also Jon Corson-Rikert's examples.)

The ingest workflow language was added in August 2008 as a simple way of scripting actions that would otherwise require manual interaction with the Ingest Tools page. At the time it was imagined that sequences of ingest tool actions might be saved as a workflow and edited through the GUI, but this functionality was never implemented.

...

RDF workflow descriptions use the following namespaces:

No Format

 @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
 @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
 @prefix w:       <http://vitro.mannlib.cornell.edu/ns/vitro/rdfIngestWorkflow#> .
 @prefix s:       <http://vitro.mannlib.cornell.edu/ns/vitro/0.7/sparql#> .
 @prefix vitro:   <http://vitro.mannlib.cornell.edu/ns/vitro/0.7#> .
 @prefix ex:      <http://example.org/myWorkflow#> .

...

Each workflow begins with a resource representing that workflow itself:

No Format

 ex:MyFirstWorkflow
    a            w:Workflow ;
    rdfs:label   "My First Workflow" ;
    w:firstStep  ex:step1 .

...

A workflow is a linked list of workflow steps. Each workflow step has a label and a property pointing to the action to be performed at that step. Because steps refer to actions rather than containing them, an action can be defined once and performed at several different steps in a workflow.

No Format

 ex:MyFirstWorkflow
    a            w:Workflow ;
    rdfs:label   "My First Workflow" ;
    w:firstStep  ex:step1 .

 ex:step1
    a            w:WorkflowStep ;
    rdfs:label   "Do something first" ;
    w:action     ex:someAction ;
    w:nextStep   ex:step2 .

 ex:step2
    a            w:WorkflowStep ;
    rdfs:label   "Now do something else" ;
    w:action     ex:anotherAction ;
    w:nextStep   ex:step3 .

 ex:step3
    a            w:WorkflowStep ;
    rdfs:label   "Do a third thing" ;
    w:action     ex:aThirdAction .
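
As a sketch of the reuse described above, the fragment below defines one action and points two steps at it. The names ex:clearWorking, ex:stepA, ex:stepB, ex:stepC, and ex:workingModel are hypothetical; ex:someAction stands for any other action, as in the example above.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix w:    <http://vitro.mannlib.cornell.edu/ns/vitro/rdfIngestWorkflow#> .
@prefix ex:   <http://example.org/myWorkflow#> .

# Hypothetical action, defined once and given its own URI.
ex:clearWorking
    a              w:ClearModelAction ;
    rdfs:label     "clear working model" ;
    w:sourceModel  ex:workingModel .

# First use: clear the model before doing anything else.
ex:stepA
    a           w:WorkflowStep ;
    rdfs:label  "Clear the working model" ;
    w:action    ex:clearWorking ;
    w:nextStep  ex:stepB .

ex:stepB
    a           w:WorkflowStep ;
    rdfs:label  "Do something with the working model" ;
    w:action    ex:someAction ;
    w:nextStep  ex:stepC .

# Second use: the same action resource, referenced from a later step.
ex:stepC
    a           w:WorkflowStep ;
    rdfs:label  "Clear the working model again" ;
    w:action    ex:clearWorking .
```

Changing the definition of ex:clearWorking then affects both steps at once, which is the main reason to give a shared action its own URI.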

...

There are eight different types of actions, which use different properties to specify their parameters:

No Format

 w:ClearModelAction
 w:AddModelsAction
 w:SubtractModelsAction
 w:ExecuteSparqlConstructAction
 w:SmushResourcesAction
 w:NameBlankNodesAction
 w:SplitPropertyValuesAction
 w:ProcessPropertyValueStringsAction 

Actions work on one or more RDF models. Each model must be visible to the ingest tools, either because it is in the default database or because it was made available through the "Connect DB" page. Models are represented as resources of type w:Model:

No Format

    ex:myWorkingModel
        a           w:Model ;
        rdfs:label  "working model" ;
        w:modelName "working model" .

...

When actions are not reused across multiple workflow steps, it can be convenient to describe them using blank nodes. This avoids having to assign a separate URI to each action, and also allows them to be written inline:

No Format

 crw:CreateFullNetIDs
     a           w:WorkflowStep ;
     rdfs:label  "Create Cornell NetID property"@en-US ;
     w:action [
         a                   w:SPARQLCONSTRUCTAction ;
         w:sourceModel       crw:cheResponseModel ;
         w:destinationModel  crw:cheResponseModel ;
         w:sparqlQuery [
             s:queryStr """
                 PREFIX che:  <http://vitro.mannlib.cornell.edu/ns/ingest/CHE#>
                 PREFIX vivo: <http://vivo.library.cornell.edu/ns/0.1#>
                 PREFIX fn:   <http://www.w3.org/2005/xpath-functions#>
                 CONSTRUCT {
                     ?s vivo:CornellemailnetId ?o
                 } WHERE {
                     ?s che:cheresponse_Netid ?o
                 }"""
         ]
     ] .

...

parameters: w:sourceModel

example:

No Format

 crw:ClearWorkingModel
      a             w:ClearModelAction ;
      rdfs:label    "clear working model" ;
      w:sourceModel ex:workingModel .

...

Multiple source models may be specified for this action by adding further w:sourceModel statements.

example:

No Format

 crw:CreateFullNetIDs
     a           w:WorkflowStep ;
     rdfs:label  "Create Cornell NetID property"@en-US ;
     w:action [
         a                   w:SPARQLCONSTRUCTAction ;
         w:sourceModel       crw:cheResponseModel ;
         w:sourceModel       crw:anotherModel ;
         w:destinationModel  crw:cheResponseModel ;
         w:sparqlQuery [
             s:queryStr """
                 PREFIX che:  <http://vitro.mannlib.cornell.edu/ns/ingest/CHE#>
                 PREFIX vivo: <http://vivo.library.cornell.edu/ns/0.1#>
                 PREFIX fn:   <http://www.w3.org/2005/xpath-functions#>
                 CONSTRUCT {
                     ?s vivo:CornellemailnetId ?o
                 } WHERE {
                     ?s che:cheresponse_Netid ?o
                 }"""
         ]
     ] .

...

The value of the w:smushOnProperty statement is a literal containing the URI of the property on which to smush, e.g.

No Format

 ex:smushAction
    a                  w:SmushResourcesAction ;
    w:sourceModel      ex:model1 ;
    w:destinationModel ex:model2 ;
    w:smushOnProperty  "http://www.example.org/ontology/employeeID" .
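
To illustrate what smushing on a property is generally understood to mean, the data below is a hypothetical input for the action above. The ex:personA and ex:personB resources and the emp: prefix are invented for this sketch, and which of the merged URIs survives is an assumption, not something this page specifies.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix emp:  <http://www.example.org/ontology/> .
@prefix ex:   <http://example.org/myWorkflow#> .

# Hypothetical contents of ex:model1: two resources share an employeeID.
ex:personA
    emp:employeeID  "1234" ;
    rdfs:label      "Smith, Jane" .

ex:personB
    emp:employeeID  "1234" ;
    emp:department  "CHEM" .

# Assumed contents of ex:model2 after smushing on emp:employeeID:
# the two resources are merged, so their statements end up on one URI.
#
# ex:personA
#     emp:employeeID  "1234" ;
#     rdfs:label      "Smith, Jane" ;
#     emp:department  "CHEM" .
```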

...

The value of the w:uriPrefix statement is a literal containing the namespace for the URIs plus the initial non-numeric portion of the local name.

For example,

No Format

    w:uriPrefix "http://example.org/individual/n"

will cause the action to rename blank nodes with URIs that look like

No Format

    <http://example.org/individual/n23568>
    <http://example.org/individual/n41>
    <http://example.org/individual/n9156662>
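
Put together, a complete action description might look like the sketch below. It assumes that w:NameBlankNodesAction takes w:sourceModel and w:destinationModel in the same way as the other actions on this page; ex:nameNodes, ex:model1, and ex:model2 are hypothetical names.

```turtle
@prefix w:  <http://vitro.mannlib.cornell.edu/ns/vitro/rdfIngestWorkflow#> .
@prefix ex: <http://example.org/myWorkflow#> .

ex:nameNodes
    a                   w:NameBlankNodesAction ;
    # Assumed parameters, following the pattern of the other actions.
    w:sourceModel       ex:model1 ;
    w:destinationModel  ex:model2 ;
    # Namespace plus the non-numeric start of the local name.
    w:uriPrefix         "http://example.org/individual/n" .
```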

...

The value of w:originalProperty is a literal containing the URI of the property that holds the delimited values. The value of w:newProperty is a literal containing the URI of the property to use for each split value.

example:

No Format

    ex:SplitValues
        a                  w:SplitPropertyValuesAction ;
        w:sourceModel      ex:model1 ;
        w:destinationModel ex:model2 ;
        w:originalProperty "http://example.org/ontology/departmentIDs" ;
        w:newProperty      "http://example.org/ontology/departmentID" ;
        w:splitRegex       "," .

This will transform the following RDF in the source model

No Format

     ex:something
            <http://example.org/ontology/departmentIDs> "CHEM, BIO, BIO-PL, FOODS" .

into the following RDF in the destination model

No Format

     ex:something
            <http://example.org/ontology/departmentID> "CHEM" ;
            <http://example.org/ontology/departmentID> "BIO" ;
            <http://example.org/ontology/departmentID> "BIO-PL" ;
            <http://example.org/ontology/departmentID> "FOODS" .
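
The sample input separates values with a comma followed by a space, while the pattern above matches only the comma. This page does not say whether the split values are trimmed. If they are not, a pattern that also consumes the whitespace is one way to avoid leading spaces. The sketch below assumes Java-style regular expression syntax and uses the hypothetical name ex:SplitValuesTrimmed. Note that a backslash must be doubled inside a Turtle string.

```turtle
@prefix w:  <http://vitro.mannlib.cornell.edu/ns/vitro/rdfIngestWorkflow#> .
@prefix ex: <http://example.org/myWorkflow#> .

ex:SplitValuesTrimmed
    a                   w:SplitPropertyValuesAction ;
    w:sourceModel       ex:model1 ;
    w:destinationModel  ex:model2 ;
    w:originalProperty  "http://example.org/ontology/departmentIDs" ;
    w:newProperty       "http://example.org/ontology/departmentID" ;
    # "\\s*" in Turtle is the regex \s*: a comma, then any whitespace.
    w:splitRegex        ",\\s*" .
```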

...

parameters: w:sourceModel, w:destinationModel, w:originalProperty, w:newProperty, w:processorClass, w:processorMethod

example:

No Format

 crw:AppendCornellEdu
    a w:WorkflowStep ;
    rdfs:label "Append @Cornell.edu to net ids"@en-US ;
    w:action [
        a w:ProcessPropertyValueStringsAction ;
        w:sourceModel crw:cheResponseModel ;
        w:destinationModel crw:cheResponseModel ;
        w:originalProperty [
            a w:Literal ;
            w:literalValue "http://vivo.library.cornell.edu/ns/0.1#CornellemailnetId"
            ] ;
        w:newProperty [
            a w:Literal ;
            w:literalValue "http://vivo.library.cornell.edu/ns/0.1#CornellemailnetId"
            ] ;
        w:processorClass [
            a w:Literal ;
            w:literalValue "edu.cornell.mannlib.vitro.bjl23.ingest.hr.HRCornellEmailProcessor"
            ] ;
        w:processorMethod [
            a w:Literal ;
            w:literalValue "process"
            ]
    ] .