Comment: Update for Commons Config

...

For CS to run a task, the code for the task must be deployed alongside other DSpace code (in [dspace]/lib, the WAR, etc.), and it must also be declared and given a name. This is done via a configuration property in [dspace]/config/modules/curate.cfg as follows:

Code Block
### Task Class implementations
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.NoOpCurationTask = noop
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ProfileFormats = profileformats
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.RequiredMetadata = requiredmetadata
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ClamScan = vscan
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MicrosoftTranslator = translate
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataValueLinkChecker = checklinks
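
To make the pairing of implementation class and logical task name concrete, here is a minimal sketch that parses declarations of the form `plugin.named.org.dspace.curate.CurationTask = <class> = <name>` into a name-to-class map. It is only an illustration and not how DSpace itself loads plugins; the class `TaskDeclarations` and its `parse` method are hypothetical names introduced for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of DSpace: shows how each declaration
// line associates a logical task name with an implementation class.
class TaskDeclarations {
    static final String KEY = "plugin.named.org.dspace.curate.CurationTask";

    // Parses lines of the form "<KEY> = <class> = <name>" into name -> class.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> tasks = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] parts = trimmed.split("\\s*=\\s*");
            if (parts.length == 3 && parts[0].equals(KEY)) {
                tasks.put(parts[2], parts[1]); // name -> class
            }
        }
        return tasks;
    }
}
```

With such a map, the name `vscan` would look up `org.dspace.ctask.general.ClamScan`; the name, not the class, is what the rest of the configuration refers to.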

...

Each of the above pages exposes a drop-down list of configured tasks, with a button to 'perform' the task, or queue it for later operation (see section below). Not all activated tasks need appear in the Curate tab - you filter them by means of a configuration property. This property also permits you to assign to the task a more user-friendly name than the PluginManager taskname. The property resides in [dspace]/config/modules/curate.cfg:

Code Block
curate.ui.tasknames = profileformats = Profile Bitstream Formats
curate.ui.tasknames = requiredmetadata = Check for Required Metadata

When a task is selected from the drop-down list and performed, the tab displays both a phrase interpreting the "status code" of the task execution, and the "result" message if any has been defined. When the task has been queued, an acknowledgement appears instead. You may configure the words used for status codes in curate.cfg (for clarity, language localization, etc):

Code Block
curate.ui.statusmessages = -3 = Unknown Task
curate.ui.statusmessages = -2 = No Status Set
curate.ui.statusmessages = -1 = Error
curate.ui.statusmessages = 0 = Success
curate.ui.statusmessages = 1 = Fail
curate.ui.statusmessages = 2 = Skip
curate.ui.statusmessages = other = Invalid Status
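
The mapping from status code to displayed phrase can be sketched as a simple lookup. The example below assumes that the `other` entry acts as the fallback for any code without its own phrase; the class `StatusMessages` and its method `phraseFor` are hypothetical names for illustration, not DSpace code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not DSpace code: resolves a task status code to
// its configured phrase, falling back to the "other" entry (assumption).
class StatusMessages {
    private final Map<String, String> messages = new HashMap<>();

    // Each entry has the form "<code> = <phrase>", e.g. "0 = Success".
    StatusMessages(String... entries) {
        for (String entry : entries) {
            String[] parts = entry.split("=", 2);
            if (parts.length == 2) {
                messages.put(parts[0].trim(), parts[1].trim());
            }
        }
    }

    // Returns the phrase for the code, or the "other" phrase if none is set.
    String phraseFor(int statusCode) {
        String phrase = messages.get(Integer.toString(statusCode));
        return phrase != null ? phrase : messages.get("other");
    }
}
```

Localizing the UI then only requires changing the phrases in curate.cfg; the numeric codes stay the same.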

As the number of tasks configured for a system grows, a simple drop-down list of all tasks may become too cluttered or large. DSpace 1.8+ provides a way to address this issue, known as task groups. A task group is a simple collection of tasks that the Admin UI will display in a separate drop-down list. You may define as many or as few groups as you please. If no groups are defined, then all tasks that are listed in the ui.tasknames property will appear in a single drop-down list. If at least one group is defined, then the admin UI will display two drop-down lists. The first is the list of task groups, and the second is the list of task names associated with the selected group. A few key points to keep in mind when setting up task groups:

...

Code Block
# ui.taskgroups contains the list of defined groups, together with a pretty name for UI display
curate.ui.taskgroups = replication = Backup and Restoration Tasks
curate.ui.taskgroups = integrity = Metadata Integrity Tasks
  .....
# each group membership list is a separate property, whose value is comma-separated list of logical task names
curate.ui.taskgroup.integrity = profileformats, requiredmetadata
....
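
The relationship between groups and their member tasks can be sketched as follows. This is a minimal model of the two drop-down lists described above, not the Admin UI's actual code; the class `TaskGroups` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical model, not DSpace code: group name -> logical task names.
class TaskGroups {
    private final Map<String, List<String>> members = new LinkedHashMap<>();

    // Mirrors a property such as
    // curate.ui.taskgroup.integrity = profileformats, requiredmetadata
    void define(String group, String commaSeparatedTasks) {
        List<String> tasks = Arrays.stream(commaSeparatedTasks.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        members.put(group, tasks);
    }

    // True once at least one group exists, i.e. when a group list
    // would be shown in addition to the task list.
    boolean hasGroups() {
        return !members.isEmpty();
    }

    // Tasks to offer in the second list for the selected group.
    List<String> tasksIn(String group) {
        return members.getOrDefault(group, Collections.emptyList());
    }
}
```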

...

Code Block
Curator curator = new Curator();
curator.addTask("vscan").queue(context, "monthly", "123456789/4");

...

Code Block
host = configurationService.getProperty("clamav.service.host");

and similar. But tasks are supposed to be written by anyone in the community and shared around (without prior coordination), so if another task uses the same configuration file name, there is a name collision here that can't be easily fixed, since the reference is hard-coded in each task. In this case, if we wanted to use both at a given site, we would have to alter the source of one of them - which introduces needless code localization and maintenance.

...

Code Block
host = taskProperty("clamav.service.host");

Note that the name of the configuration file is not mentioned at all, just the name of the property whose value we want. At runtime, the curation system resolves this call to a configuration file, using the name under which the task has been configured as the name of the config file. So, for example, if both were installed (in curate.cfg) as:

Code Block
org.dspace.ctask.general.ClamAv = vscan,
org.community.ctask.ConflictTask = virusscan,
....

then "taskProperty()" will resolve to [dspace]/config/modules/vscan.cfg when called from ClamAv task, but [dspace]/config/modules/virusscan.cfg when called from ConflictTask's code. Note that the "vscan" etc are locally assigned names, so we can always prevent the "collisions" mentioned, and we make the tasks much more portable, since we remove the "hard-coding" of config names.
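
The resolution rule can be sketched in a few lines. This is only an illustration of the naming convention described above, not the curation system's implementation; `TaskConfigResolver` and `configFileFor` are hypothetical names, and the DSpace directory is passed in explicitly as an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not DSpace code: derives the config file that a
// task's property lookups would resolve to from its locally assigned name.
class TaskConfigResolver {
    // Returns <dspaceDir>/config/modules/<taskName>.cfg
    static Path configFileFor(String dspaceDir, String taskName) {
        return Paths.get(dspaceDir, "config", "modules", taskName + ".cfg");
    }
}
```

Because the file name follows the local task name, renaming a task in curate.cfg is enough to move it to a different configuration file without touching its source.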

...

Code Block
org.dspace.ctask.general.ThumbnailTask = thumbnail,
org.dspace.ctask.general.ThumbnailTask = thumbnail.force

...

Support for scripted tasks does not include any DSpace pre-installation of the scripting language itself - this must be done according to the instructions provided by the language maintainers, and typically only requires a few additional jars on the DSpace classpath. Once one or more languages have been installed into the DSpace deployment, task support is fairly straightforward. One new property must be defined in [dspace]/config/modules/curate.cfg:

Code Block
curate.script.dir = ${dspace.dir}/scripts

...

The third part (here 'dc.publisher') is simply the name of the metadata field to be updated. These two mandatory properties (template and datamap) are sufficient to describe a large number of web services. All that is required to enable this task is to edit 'config/modules/curate.cfg' (or your local.cfg), and add 'issn2pubname' to the list of tasks:

Code Block
... other defined tasks
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataWebService = issn2pubname
... other metadata web service tasks
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataWebService = doi2crossref

If you wish the task to be available in the Admin UI, see the Invocation from the Admin UI documentation (above) about how to configure it. The remaining sections describe some more specialized needs using the MetadataWebService task.

...

In [dspace]/config/modules/curate.cfg, activate the task:

  • Add the plugin to the list of curation tasks.
Code Block
### Task Class implementations
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.NoOpCurationTask = noop
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ProfileFormats = profileformats
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.RequiredMetadata = requiredmetadata
# This is the ClamAV scanner plugin
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ClamScan = vscan
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MicrosoftTranslator = translate
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataValueLinkChecker = checklinks
  • Optionally, add the vscan friendly name to the configuration to enable it in the administrative user interface.
Code Block
curate.ui.tasknames = profileformats = Profile Bitstream Formats
curate.ui.tasknames = requiredmetadata = Check for Required Metadata
curate.ui.tasknames = checklinks = Check Links in Metadata
# Enable ClamAV from UI
curate.ui.tasknames = vscan = Scan for Viruses
  • In [dspace]/config/modules, edit configuration file clamav.cfg:
Code Block
clamav.service.host = 127.0.0.1
# Change if not running on the same host as your DSpace installation.
clamav.service.port = 3310
# Change if not using standard ClamAV port
clamav.socket.timeout = 120
# Change if longer timeout needed
clamav.scan.failfast = false
# Change only if items have large numbers of bitstreams
  • Finally, if desired, virus scanning can be enabled as part of the upload file step of the submission process. In [dspace]/config/modules, edit the configuration file submission-curation.cfg:
Code Block
submission-curation.virus-scan = true
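
As a rough illustration of how the four clamav.cfg settings might be read into typed values, the sketch below uses plain `java.util.Properties`. It is not the ClamScan task's code; the class `ClamAvSettings` is a hypothetical name, and the fallback values are an assumption that simply repeats the sample values shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder, not DSpace code: typed view of the clamav.cfg settings.
class ClamAvSettings {
    final String host;
    final int port;
    final int socketTimeout;
    final boolean failFast;

    // Fallback values are assumptions copied from the sample configuration.
    ClamAvSettings(Properties props) {
        host = props.getProperty("clamav.service.host", "127.0.0.1").trim();
        port = Integer.parseInt(props.getProperty("clamav.service.port", "3310").trim());
        socketTimeout = Integer.parseInt(props.getProperty("clamav.socket.timeout", "120").trim());
        failFast = Boolean.parseBoolean(props.getProperty("clamav.scan.failfast", "false").trim());
    }

    // Parses configuration text in "key = value" form; '#' lines are comments.
    static ClamAvSettings fromText(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ClamAvSettings(props);
    }
}
```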

Task Operation from the Administrative user interface

...

If desired, virus scanning can be enabled as part of the upload file step of the submission process. In [dspace]/config/modules, edit the configuration file submission-curation.cfg:

Code Block
submission-curation.virus-scan = true

Task Operation from the curation command line client

...

An example configuration file can be found in [dspace]/config/modules/translator.cfg.

Code Block
#---------------------------------------------------------------#
#----------TRANSLATOR CURATION TASK CONFIGURATIONS--------------#
#---------------------------------------------------------------#
# Configuration properties used solely by MicrosoftTranslator   #
# Curation Task (uses Microsoft Translation API v2)             #
#---------------------------------------------------------------#
## Translation field settings
##
## Authoritative language field
## This will be read to determine the original language an item was submitted in
## Default: dc.language

translator.field.language = dc.language

## Metadata fields you wish to have translated
#
translator.field.targets = dc.description.abstract, dc.title, dc.type

## Translation language settings
##
## If the language field configured in translator.field.language is not present
## in the record, set translator.language.default to a default source language
## or leave blank to use autodetection
#
translator.language.default = en

## Target languages for translation
#
translator.language.targets = de, fr

## Translation API settings
##
## Your Bing API v2 key and/or Google "Simple API Access" Key
## (note to Google users: your v1 API key will not work with Translate v2,
## you will need to visit https://code.google.com/apis/console and activate
## a Simple API Access key)
##
## You do not need to enter a key for both services.
#
translator.api.key.microsoft = YOUR_MICROSOFT_API_KEY_GOES_HERE
translator.api.key.google = YOUR_GOOGLE_API_KEY_GOES_HERE