...

The goal of the curation system ('CS') is to provide a simple, extensible way to manage routine content operations on a repository. These operations are known to CS as 'tasks', and they can operate on any DSpaceObject (i.e. any subclass of DSpaceObject), although the first incarnation will only understand Communities, Collections, and Items, viz. the core data model objects. A task may be meaningful for only one type of DSpace object (typically an Item), and in this case it may simply ignore other data types: tasks have the ability to 'skip' objects for any reason. The DSpace core distribution ought to provide a number of useful tasks, but the system is designed to encourage local extension: tasks can be written for any purpose and placed in any Java package. What sorts of things are appropriate tasks?

...

Code Block
 ./dsrun org.dspace.curate.CurationCli -t vscan -i 123456789/4

As with other command-line tools, these invocations could be placed in a cron table and run on a fixed schedule, or
run on demand by an administrator.

...

In the XMLUI, there is a 'Curate' tab (appearing within 'Edit Community/Collection/Item') that exposes a drop-down list
of configured tasks, with a button to 'perform' the task or queue it for later operation (see section III below). A
configuration property lets you filter out defined tasks that are not appropriate for UI use.

In

...

Workflow

CS provides the ability to attach any number of tasks to standard DSpace workflows. Using a configuration file
(workflow-curation.xml), you can declaratively (without coding) wire tasks to any step in a workflow. An example:

...

use the command-line tool, but we could also read the queue programmatically. Any number of queues can be defined and used as needed.
The curation 'widget' in the administrative UI can either perform a task immediately or place it on a queue for later processing.

Task Output and Reporting

CS makes few assumptions about what the 'outcome' of a task may be (if any); a task could, e.g., produce a report to a temporary file. The CS runtime does, however, provide a few pieces of information that a task can assign:

Status Code

As mentioned above, a status code is returned to CS whenever a task is called. In addition to the task-assigned codes, CS itself defines two values:

Code Block

      NOTASK - CS could not find the requested task
      UNSET  - task did not return a status code because it has not yet run
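The distinction between these two CS-assigned values can be illustrated with a small, self-contained sketch. It is not the DSpace implementation: the class `StatusSketch`, its `Status` enum, and the `register`/`run` methods are hypothetical names invented for illustration, and the real CS constants (and their numeric values) may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of status bookkeeping; not the DSpace Curator class.
public class StatusSketch {
    // Assumed status values for this sketch only.
    enum Status { NOTASK, UNSET, SUCCESS, FAIL }

    // Known tasks, each modelled as something that yields its own status when run.
    private final Map<String, Supplier<Status>> registry = new HashMap<>();
    // Most recent status recorded for each known task.
    private final Map<String, Status> statuses = new HashMap<>();

    void register(String name, Supplier<Status> task) {
        registry.put(name, task);
        statuses.put(name, Status.UNSET); // task is known but has not yet run
    }

    void run(String name) {
        Supplier<Status> task = registry.get(name);
        if (task != null) {
            statuses.put(name, task.get()); // task-assigned code replaces UNSET
        }
    }

    Status getStatus(String name) {
        // A name that was never registered yields NOTASK.
        return statuses.getOrDefault(name, Status.NOTASK);
    }

    public static void main(String[] args) {
        StatusSketch cs = new StatusSketch();
        cs.register("vscan", () -> Status.SUCCESS);
        System.out.println(cs.getStatus("nosuchtask")); // NOTASK
        System.out.println(cs.getStatus("vscan"));      // UNSET
        cs.run("vscan");
        System.out.println(cs.getStatus("vscan"));      // SUCCESS
    }
}
```

In this model, callers can tell "no such task" apart from "task exists but has not run" without the task itself having to report anything.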

Result String

The task may define a string indicating details of the outcome. This result is displayed, e.g. in the 'curation widget' described above:

Code Block

       "Virus 12312 detected on Bitstream 4 of 123456789/3"

CS does not interpret or assign result strings; the task does.

Reporting Stream

This is not yet fully implemented; it currently just writes to standard out. If a task needs to record more detail, it can push that detail to this stream.
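To make the relationship between the three outputs concrete, here is a minimal, self-contained sketch of a task setting a status code, a result string, and writing to a reporting stream. Everything in it is an assumption made for illustration: `TaskOutputSketch`, its methods, the `scan` stand-in, and the numeric status values are hypothetical and are not the DSpace API.

```java
import java.io.PrintStream;

// Hypothetical holder for the three task outputs; not the DSpace API.
public class TaskOutputSketch {
    // Assumed numeric values, chosen for this sketch only.
    static final int UNSET = -2;
    static final int SUCCESS = 0;
    static final int FAIL = 1;

    private int status = UNSET;
    private String result = null;
    private final PrintStream reportStream;

    TaskOutputSketch(PrintStream reportStream) {
        this.reportStream = reportStream;
    }

    // Setters used by the task.
    void setStatus(int status) { this.status = status; }
    void setResult(String result) { this.result = result; }
    void report(String message) { reportStream.println(message); }

    // Getters used by the caller after the task has run.
    int getStatus() { return status; }
    String getResult() { return result; }

    // Stand-in for a virus-scan task; 'infected' replaces a real scan.
    static void scan(TaskOutputSketch out, String handle, boolean infected) {
        out.report("Scanning " + handle); // detail goes to the reporting stream
        if (infected) {
            out.setResult("Virus detected in " + handle);
            out.setStatus(FAIL);
        } else {
            out.setResult("No virus found in " + handle);
            out.setStatus(SUCCESS);
        }
    }

    public static void main(String[] args) {
        // Standard out mirrors the current behaviour of the reporting stream.
        TaskOutputSketch out = new TaskOutputSketch(System.out);
        scan(out, "123456789/4", true);
        System.out.println(out.getStatus() + ": " + out.getResult());
    }
}
```

Passing the stream in through the constructor is a design choice of the sketch: it keeps standard out as the default while leaving room for a file or other destination later.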

All three are accessed (or set) by methods on the Curator object:

Code Block

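     // 'coll' is a DSpace object (e.g. a Collection) obtained elsewhere
     Curator curator = new Curator();
     // Attach the 'vscan' task and run it over the object
     curator.addTask("vscan").curate(coll);
     // Retrieve the status code recorded for the task
     int status = curator.getStatus("vscan");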