Documentation moved

The documentation and patch installation instructions have been moved to https://github.com/atmire/Elsevier

Introduction

The contribution referenced at https://jira.duraspace.org/browse/DS-2877 adds search features to the submission system, allowing submitters to search ScienceDirect and retrieve metadata, abstracts, DOI links and embargo metadata at the article level. This integration is based on the framework from https://jira.duraspace.org/browse/DS-2876.
In addition to the search and retrieval per article, a similar feature is available for a batch import of multiple ScienceDirect records at once.

A new embargo-based file upload step has been created. If the item can be found in ScienceDirect, the default value of the permission (public, restricted or under embargo) is determined automatically, but can still be adjusted where needed.

Items with a ScienceDirect identifier will contain a link to the ScienceDirect page for the item. If the current user has access to the record in ScienceDirect, this will also be indicated in the URL and displayed to the user (to ensure the user won't need to open the page to verify if they have full-text access).

Items with a ScienceDirect identifier can include a link to an embedded article PDF, downloaded directly from ScienceDirect. This is an optional feature of this program.

All of the above features can of course be enabled or disabled easily, to ensure only the relevant features are displayed.

Registering a Developer API key

The functionality will send requests to ScienceDirect APIs to retrieve metadata. These APIs are protected with user accounts and keys to avoid abuse by robots and malicious users.

To register an API key, go to: https://dev.elsevier.com/apikey/create

Further support for the API key registration process is available from integrationsupport@elsevier.com

More information and policies: http://dev.elsevier.com/ir_cris_vivo.html

Adding the functionality to your DSpace codebase 

Info: Proposed for inclusion in DSpace 6

Elsevier and Atmire are working together with the DSpace community to add this functionality to the DSpace 6 codebase. As of October 28, 2016, this is still a work in progress. The latest information is available at:

DuraSpace JIRA: DS-2877 (https://jira.duraspace.org/browse/DS-2877)

Because the functionality is not yet part of the standard DSpace codebase, the code changes have to be added as a customization. To keep these cleanly separated from standard DSpace code, Atmire's patch files add the changes in the /dspace/modules/additions directory.

Patching DSpace 5 & DSpace 6

Info: UNDER REVISION

The patch is currently being revised. The new patches for DSpace 5 and 6 will be published on GitHub in the course of December.

...

  1. Download the patch file onto the machine where you execute your DSpace Maven builds (mvn clean package).
  2. Put the patch file in the root of your DSpace source directory, not the directory where DSpace is installed/deployed.
  3. OPTIONAL: check whether the patch can be applied to your codebase by running "git apply --check --verbose ./elsevier5.4-updated.patch".
  4. Apply the patch to your codebase with git apply: "git apply ./elsevier5.4-updated.patch".

After applying these changes to the codebase, follow the standard procedures for building the code with Maven and deploying it on your Tomcat. These are described in Upgrading DSpace#UpgradeSteps, from step 4 "building DSpace" onwards.

Configuration: Enabling or Disabling the features

The following steps assume that you have successfully patched your DSpace or that you are running a version of DSpace in which the ScienceDirect Live import code has already been merged.

Live import

To enable the live import, add the step below to dspace/config/item-submission.xml, before the Describe metadata steps:

Code Block
languagexml
<step>
    <heading>submit.progressbar.liveimport</heading>
    <processing-class>org.dspace.submit.step.LiveImportStep</processing-class>
    <jspui-binding>org.dspace.app.webui.submit.step.JSPStartSubmissionLookupStep</jspui-binding>
    <xmlui-binding>org.dspace.app.xmlui.aspect.submission.submit.LiveImportStep</xmlui-binding>
    <workflow-editable>true</workflow-editable>
</step>
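
The ordering requirement above can be checked with a small script. The following is a minimal sketch, not part of the patch: it assumes the Describe steps use the processing class org.dspace.submit.step.DescribeStep, and it runs against a shortened, hypothetical submission process rather than a real item-submission.xml.

```python
import xml.etree.ElementTree as ET

LIVE_IMPORT = "org.dspace.submit.step.LiveImportStep"
# Assumption: the Describe metadata steps use this processing class.
DESCRIBE = "org.dspace.submit.step.DescribeStep"

# Shortened, hypothetical submission process used only for illustration.
SAMPLE = """
<submission-process name="traditional">
    <step>
        <heading>submit.progressbar.liveimport</heading>
        <processing-class>org.dspace.submit.step.LiveImportStep</processing-class>
    </step>
    <step>
        <heading>submit.progressbar.describe</heading>
        <processing-class>org.dspace.submit.step.DescribeStep</processing-class>
    </step>
</submission-process>
"""


def live_import_precedes_describe(process):
    """Return True if the live import step comes before every Describe step."""
    classes = [
        (step.findtext("processing-class") or "").strip()
        for step in process.iter("step")
    ]
    if LIVE_IMPORT not in classes:
        return False
    live_index = classes.index(LIVE_IMPORT)
    return all(live_index < i for i, c in enumerate(classes) if c == DESCRIBE)


process = ET.fromstring(SAMPLE)
print(live_import_precedes_describe(process))
```

To check a real configuration, the same function could be applied to each submission-process element parsed from dspace/config/item-submission.xml.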

Enabling the metadata fields

The live import requires the following metadata field to be enabled in the input-forms.xml file. This field is disabled by default.

 

Code Block
languagexml
<field>
    <dc-schema>elsevier</dc-schema>
    <dc-element>identifier</dc-element>
    <dc-qualifier>pii</dc-qualifier>
    <repeatable>false</repeatable>
    <label>PII</label>
    <input-type>onebox</input-type>
    <hint>Enter the PII for this item.</hint>
    <required></required>
</field>

 

Enabling the required aspect

The aspect that allows for the metadata to be imported from ScienceDirect needs to be enabled in dspace/config/xmlui.xconf as it is commented out by default.

 

Code Block
languagexml
<aspect name="ScienceDirect" path="resource://aspects/ScienceDirect/" />

 

File permissions in the upload step

The automatic suggestion of the file permissions should be enabled in the dspace/config/item-submission.xml. The step below should replace the default upload step:

Code Block
languagexml
<step>
    <heading>submit.progressbar.upload</heading>
    <processing-class>org.dspace.submit.step.ElsevierUploadStep</processing-class>
    <xmlui-binding>org.dspace.app.xmlui.aspect.submission.submit.ElsevierUploadStep</xmlui-binding>
    <workflow-editable>true</workflow-editable>
</step>

Link to the ScienceDirect page

The display of the link to the ScienceDirect page can be enabled in the theme. The automatic verification whether the user is entitled to download the file can be enabled in dspace/config/modules/elsevier-sciencedirect.cfg using the parameter:

Code Block
elsevier-sciencedirect.entitlement.check.enabled

Embedding the article PDF

The display of the embedded article PDF can be enabled in dspace/config/modules/elsevier-sciencedirect.cfg using the parameter:

Code Block
elsevier-sciencedirect.embed.display

Plugin for the Search API

Configuration

The basic API configuration can be found in dspace/config/modules/elsevier-sciencedirect.cfg. This file contains the API key (which needs to be supplied through the Maven profile), the API URLs, and the extra configuration for the entitlements check and the embedding of the PDF, which are discussed later on.

 

Code Block
languagetext
# Api key to be able to make the calls to retrieve the articles, this will need to be requested by the appropriate instance
elsevier-sciencedirect.api.key = ${elsevier.api.key}

# This represents the base url to use for the retrieval of an article
elsevier-sciencedirect.api.article.url=http://api.elsevier.com/content/article
# The base of rest endpoints to represent identifiers and entitlement status associated with requested full text articles
elsevier-sciencedirect.api.entitlement.url=http://api.elsevier.com/content/article/entitlement
# This represents retrieval of a full text article by PII (Publication Item Identifier).
elsevier-sciencedirect.api.pii.url=//api.elsevier.com/content/article/pii/
# The search interfaces associated with ScienceDirect
elsevier-sciencedirect.api.scidir.url=http://api.elsevier.com/content/search/scidir
# Url to base later rest calls on, such as retrieval based on PII etc
elsevier-sciencedirect.ui.article.url=http://www.sciencedirect.com/science/article

Mapping

The file dspace/config/spring/api/scidir-service.xml contains the spring configuration for the beans used by the Elsevier service.

Part of this configuration is the mapping of ScienceDirect fields to DSpace metadata fields.

Configuring the mapping

Each DSpace metadata field that will be used for the mapping must first be configured as a spring bean of class org.dspace.importer.external.metadatamapping.MetadataFieldConfig.

Code Block
languagexml
<bean id="dc.title" class="org.dspace.importer.external.metadatamapping.MetadataFieldConfig">
    <constructor-arg value="dc.title"/>
</bean>

Once declared, this metadata field can be used to create a mapping. To add a mapping for the "dc.title" field declared above, add a new spring bean configuration of class org.dspace.importer.external.metadatamapping.contributor.SimpleXpathMetadatumContributor. This bean expects 2 property values:

 

  • field: A reference to the configured spring bean of the DSpace metadata field. e.g. the "dc.title" bean declared above.
  • query: The xpath expression used to select the Elsevier value from the XML returned by the Elsevier API. The root for the xpath query is the "entry" element.

 

Code Block
languagexml
<bean id="titleContrib" class="org.dspace.importer.external.metadatamapping.contributor.SimpleXpathMetadatumContributor">
    <property name="field" ref="dc.title"/>
    <property name="query" value="dc:title"/>
</bean>

This is a (shortened) example of the XML of an entry returned by the Elsevier API:

Code Block
languagexml
<entry>
    <dc:title>
        Integrating phenotypic small-molecule profiling and human genetics: the next phase in drug discovery
    </dc:title>
    <authors>
        <author>
            <given-name>Cory M.</given-name>
            <surname>Johannessen</surname>
        </author>
    </authors>
</entry>

Because the given-name and surname of an author are contained in one metadata field value in DSpace, multiple Elsevier fields can also be combined into one value. To implement a combined mapping, first create a "SimpleXpathMetadatumContributor" for each part of the field, as explained above.

Code Block
languagexml
<bean id="lastNameContrib" class="org.dspace.importer.external.metadatamapping.contributor.SimpleXpathMetadatumContributor">
    <property name="field" ref="dc.contributor.author"/>
    <property name="query" value="x:authors/x:author/x:surname"/>
</bean>
<bean id="firstNameContrib" class="org.dspace.importer.external.metadatamapping.contributor.SimpleXpathMetadatumContributor">
    <property name="field" ref="dc.contributor.author"/>
    <property name="query" value="x:authors/x:author/x:given-name"/>
</bean>

Note that for elements without a namespace, the prefix "x" is prepended. This is the default namespace. The namespace configuration can be found in the map "FullprefixMapping" in the same spring configuration file.
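
To illustrate how such queries select values relative to the "entry" element, here is a minimal Python sketch using the standard library. It is not the DSpace implementation: the namespace URIs below are assumptions made for the example, and the real prefixes and URIs are the ones configured in "FullprefixMapping".

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs, for illustration only. The real mapping lives in
# the "FullprefixMapping" map of scidir-service.xml.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "x": "http://www.w3.org/2005/Atom",  # prefix for the default namespace
}

ENTRY_XML = """
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>
        Integrating phenotypic small-molecule profiling and human genetics: the next phase in drug discovery
    </dc:title>
    <authors>
        <author>
            <given-name>Cory M.</given-name>
            <surname>Johannessen</surname>
        </author>
    </authors>
</entry>
"""


def select_values(entry, query):
    """Evaluate a path query relative to the entry and return trimmed text values."""
    return [(el.text or "").strip() for el in entry.findall(query, NAMESPACES)]


entry = ET.fromstring(ENTRY_XML)
titles = select_values(entry, "dc:title")
surnames = select_values(entry, "x:authors/x:author/x:surname")
given_names = select_values(entry, "x:authors/x:author/x:given-name")
print(titles, surnames, given_names)
```

Elements in the default namespace (authors, author, surname) are only matched when addressed through the "x" prefix, which is why the queries in the beans above carry it.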

Then create a new list in the spring configuration containing references to all "SimpleXpathMetadatumContributor" beans that need to be combined.

Code Block
languagexml
<util:list id="combinedauthorList" value-type="org.dspace.importer.external.metadatamapping.contributor.MetadataContributor" list-class="java.util.LinkedList">
    <ref bean="lastNameContrib"/>
    <ref bean="firstNameContrib"/>
</util:list>

Finally, create a spring bean configuration of class org.dspace.importer.external.metadatamapping.contributor.CombinedMetadatumContributor. This bean expects 3 values:

  • field: A reference to the configured spring bean of the DSpace metadata field, e.g. a "dc.contributor.author" bean declared in the same way as the "dc.title" bean above.
  • metadatumContributors: A reference to the list containing all the single Elsevier field mappings that need to be combined.
  • separator: These characters will be added between each Elsevier field value when they are combined into one field.

 

Code Block
languagexml
<bean id="authorContrib" class="org.dspace.importer.external.metadatamapping.contributor.CombinedMetadatumContributor">
    <property name="separator" value=", "/>
    <property name="metadatumContributors" ref="combinedauthorList"/>
    <property name="field" ref="dc.contributor.author"/>
</bean>
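
The effect of the combined mapping can be sketched as follows. This is an illustration under an assumption, not the DSpace code: it assumes the values of the individual contributors are combined position by position (first surname with first given name, and so on), in the order of the list.

```python
def combine(values_per_contributor, separator):
    """Join the i-th value of every contributor into one combined value."""
    return [separator.join(parts) for parts in zip(*values_per_contributor)]


# Values as they would be selected by lastNameContrib and firstNameContrib
# for the example entry shown earlier.
surnames = ["Johannessen"]
given_names = ["Cory M."]

authors = combine([surnames, given_names], ", ")
print(authors)  # ['Johannessen, Cory M.']
```

The order of the references in the list determines the order of the parts in the combined value, so listing lastNameContrib first yields "surname, given-name".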

Each contributor must also be added to the "scidirMetadataFieldMap" map in the same spring configuration file. Each entry of this map maps a metadata field bean to a contributor. For the contributors created above this results in the following configuration:

Code Block
languagexml
<util:map id="scidirMetadataFieldMap" key-type="org.dspace.importer.external.metadatamapping.MetadataFieldConfig"
          value-type="org.dspace.importer.external.metadatamapping.contributor.MetadataContributor">
    <entry key-ref="dc.title" value-ref="titleContrib"/>
    <entry key-ref="dc.contributor.author" value-ref="authorContrib"/>
</util:map>

Note that the single field mappings used for the combined author mapping are not added to this map.

Live import

The first submission step is the Elsevier import step. This step allows the user to import a publication from Elsevier.

This step can be skipped by clicking on "Next" at the bottom of the page without importing a publication.

To search for a publication to import, fill in at least one of the 4 search fields and click on "Search". A new window will appear containing the search results. To import a publication, click on the "Import" button next to it.

Publications that are already imported are shown with a gray background.

When the publication is imported, its title and authors are shown at the bottom of the Elsevier import step.

Note that importing a publication removes any fields that were already added to the item.

Configuration

To enable the Elsevier import step, add the step to the submission process in dspace/config/item-submission.xml.

Code Block
languagexml
<step>
    <heading>submit.progressbar.liveimport</heading>
    <processing-class>org.dspace.submit.step.LiveImportStep</processing-class>
    <jspui-binding>org.dspace.app.webui.submit.step.JSPStartSubmissionLookupStep</jspui-binding>
    <xmlui-binding>org.dspace.app.xmlui.aspect.submission.submit.LiveImportStep</xmlui-binding>
    <workflow-editable>true</workflow-editable>
</step>

Batch import

Import multiple publications from Elsevier using the batch import.

The batch import page can be found by clicking on "Elsevier Import" in the administrative menu, or by browsing to {dspace-url}/liveimport.

Start by filling in at least one of the 4 search fields to query the Elsevier API for publications, then click on "Search".

A list of the publications returned by the Elsevier API will be shown. Next to each publication is a checkbox which can be clicked to select the publication for import. Under the publications list a counter shows how many publications are already selected for import. This counter is updated each time the user browses through the publications.

When "Next" is clicked the user is taken to the import page. At the top of this page all publications that are selected for import are listed.

One of the "Select action" options must be chosen to specify what will happen to the imported items:

  • Send imported items to workspace: The items are added to the user's "Unfinished submissions" on the submission page.
  • Send imported items to workflow: The items are added to the workflow to be reviewed by the reviewers of the collection the item is added to. 
  • Archive imported items: The items are archived immediately.

A collection to which the items are added must be selected from the "Select collection" dropdown. Click on "Import" to start the import.

File Upload Step

The file upload step has been altered to allow submitters to select the accessibility of each file: it can be restricted, placed under embargo so that it is not available until a specified date, or simply be made publicly available.

If you encounter the warning message below in the DSpace logs, please verify whether the permissions of your API key are sufficient to retrieve the hosting permissions:

Error retrieving required nodes from the response, please verify whether your ScienceDirect API key has sufficient permissions: <service-error>    <status>                <statusCode>AUTHORIZATION_ERROR</statusCode>        <statusText>APIKey XXXX with IP address X.X.X.X is unrecognized or has insufficient privileges for access to this resource</statusText>       </status></service-error>
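
When inspecting such responses by hand, the error can be recognised from the statusCode element. The sketch below is only an illustration: it assumes the response body looks like the service-error fragment in the log message above (no XML namespace), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Response body as it appears in the log message, with placeholders kept.
SAMPLE_RESPONSE = (
    "<service-error><status>"
    "<statusCode>AUTHORIZATION_ERROR</statusCode>"
    "<statusText>APIKey XXXX with IP address X.X.X.X is unrecognized or has "
    "insufficient privileges for access to this resource</statusText>"
    "</status></service-error>"
)


def authorization_error(response_xml):
    """Return the status text of an AUTHORIZATION_ERROR response, else None."""
    try:
        root = ET.fromstring(response_xml)
    except ET.ParseError:
        return None
    if root.tag != "service-error":
        return None
    code = root.findtext("status/statusCode", default="").strip()
    if code != "AUTHORIZATION_ERROR":
        return None
    return root.findtext("status/statusText", default="").strip()


message = authorization_error(SAMPLE_RESPONSE)
if message is not None:
    print("API key problem:", message)
```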

Plugin for the Entitlements check

The check for entitlement can be configured in the ${dspace.dir}/config/modules/elsevier-sciencedirect.cfg.

 

Code Block
# Api key to be able to make the calls to retrieve the articles, this will need to be requested by the appropriate instance
elsevier-sciencedirect.api.key = ${elsevier.api.key}

# This represents the base url to use for the retrieval of an article
elsevier-sciencedirect.api.article.url=http://api.elsevier.com/content/article
# The base of rest endpoints to represent identifiers and entitlement status associated with requested full text articles
elsevier-sciencedirect.api.entitlement.url=http://api.elsevier.com/content/article/entitlement
# This represents retrieval of a full text article by PII (Publication Item Identifier).
elsevier-sciencedirect.api.pii.url=//api.elsevier.com/content/article/pii/
# The search interfaces associated with ScienceDirect
elsevier-sciencedirect.api.scidir.url=http://api.elsevier.com/content/search/scidir
# Url to base later rest calls on, such as retrieval based on PII etc
elsevier-sciencedirect.ui.article.url=http://www.sciencedirect.com/science/article
# Check statuses associated with the requested articles
elsevier-sciencedirect.entitlement.check.enabled=false

# Mapping between retrieved and saved metadata
elsevier-sciencedirect.metadata.field.pii = elsevier.identifier.pii
elsevier-sciencedirect.metadata.field.doi = dc.identifier

This check is used to determine if the user should be allowed to view the published version of the document.

If the document is accessible then the user will be able to view it embedded in the browser (see "Embedding of the PDFs").

If the document is not accessible, the user will be redirected to the document page on Elsevier so they can view the access options.

* If the DOI starts with 'DOI:', that prefix will be stripped off.
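
The prefix handling described in this note can be sketched as below. This is a minimal illustration with a hypothetical function name, assuming the match on 'DOI:' is case-sensitive and that surrounding whitespace is not significant.

```python
DOI_PREFIX = "DOI:"


def normalize_doi(value):
    """Strip a leading 'DOI:' prefix from a stored identifier value."""
    value = value.strip()
    if value.startswith(DOI_PREFIX):
        return value[len(DOI_PREFIX):].strip()
    return value


print(normalize_doi("DOI:10.1016/j.jnutbio.2015.02.010"))
```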

Embedding of the PDFs

The PDF is retrieved via the PII and shown in an embedded reader.

This can be configured in ${dspace.dir}/config/modules/elsevier-sciencedirect.cfg:

Code Block
# Whether or not to embed the display + its respective width and height
elsevier-sciencedirect.embed.display=true
elsevier-sciencedirect.embed.display.width=700
elsevier-sciencedirect.embed.display.height=500

If width or height are set to '0', they default to the values shown here (700 and 500 respectively).
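
The fallback rule for the dimensions can be sketched as follows. This is an illustration only: the function name is hypothetical, the configuration is modelled as a plain dictionary of strings, and treating a missing property like '0' is an assumption.

```python
DEFAULT_WIDTH = 700
DEFAULT_HEIGHT = 500


def embed_dimensions(config):
    """Return (width, height), falling back to the defaults when a value is 0."""
    # Assumption: a missing property is treated the same as '0'.
    width = int(config.get("elsevier-sciencedirect.embed.display.width", 0))
    height = int(config.get("elsevier-sciencedirect.embed.display.height", 0))
    return (width or DEFAULT_WIDTH, height or DEFAULT_HEIGHT)


print(embed_dimensions({
    "elsevier-sciencedirect.embed.display.width": "0",
    "elsevier-sciencedirect.embed.display.height": "300",
}))
```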

Apart from the settings mentioned above, the embed also requires the API key from the entitlement configuration (mentioned above).

The item page shows a link to a page on which the PDF is displayed in the browser's embedded PDF viewer.

For Mirage based themes the position of the link to the publisher version of the document can be configured in ${dspace.dir}/config/modules/elsevier-sciencedirect.cfg.

Set "embed.link.position" to "top" to render the link above the file section, or set it to "bottom" to render the link under the file section.

Code Block
# Define if the link to the embed display should be rendered above (top) or under (bottom) the file section on the item page.
# Only supported by theme Mirage
elsevier-sciencedirect.embed.link.position = top

Batch Elsevier items update script

Introduction

To accommodate changes to previously imported items, an update script has been created.
This script uses a PII or DOI to re-check an item for possible updates in file permissions, identifiers and metadata. This check is done against the originally used Search API and uses the same configuration.

Usage

The script can be run using the following command in the DSpace installation directory:

Code Block
bin/dspace dsrun org.dspace.importer.external.scidir.UpdateElsevierItems (options)

...

  • -t: test. If this option is given, the script only reports the changes it would make, without applying them.
  • -f: force. Forcefully update the requested type of data.
  • -a: assign PII
    • If the PII exists (and not testing), update the item metadata to include the PII
    • If “-f” is enabled, also verify the items with a DOI and a PII
      • If a PII exists in the API, and differs from the current PII, update the PII (if not testing)
      • If a PII exists in the API, and is identical to the current PII, leave it unchanged
      • If the PII doesn’t exist in the API, but the current metadata contains a PII, remove it (if not testing)
  • -p: update permissions
    • If not testing, and this type of data should be updated, adjust the file permissions applied to the item if:
      • “-f” is enabled
      • or the permissions were not manually overruled during the submission or workflow
  • -m: import metadata
    • If not forced, this task won’t do anything
    • If “-f” is enabled
      • Update the metadata fields retrieved from the API, leaving the other metadata fields unchanged (based on the current configuration of metadata fields)
      • Don’t update any metadata fields if they are all correct
      • Keep in mind that any manual additions to the configured metadata fields will be overruled
  • -i: item handle
    • If this option is given, only run the script for this specific item
    • If this option is omitted, run the script for all archived items
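
The decision rules listed above can be summarised in a short sketch. This is one reading of the option descriptions, not the script's actual code: the function names are hypothetical, and it assumes that without "-f" only items that have no PII yet are touched by "-a". The functions only plan a change; with "-t" the plan would be reported rather than applied.

```python
def plan_pii_change(current_pii, api_pii, force):
    """Return 'add', 'update', 'remove' or 'none' for the -a option."""
    if current_pii is None:
        # Item has no PII yet: add the one found in the API, if any.
        return "add" if api_pii else "none"
    if not force:
        # Assumption: existing PIIs are only re-verified when -f is given.
        return "none"
    if api_pii is None:
        return "remove"
    return "none" if api_pii == current_pii else "update"


def should_update_permissions(force, manually_overruled):
    """-p: update unless the permissions were set by hand and -f is absent."""
    return force or not manually_overruled


def should_update_metadata(force):
    """-m: metadata is only re-imported when -f is given."""
    return force


# Example: a forced run on an item whose PII is no longer known to the API.
print(plan_pii_change("S0955286315000716", None, force=True))
```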

...

For example, to test forced updates of metadata, PII/DOI and permissions for the item with handle 123456789/99:

Code Block
./dspace dsrun org.dspace.importer.external.scidir.UpdateElsevierItems -t -p -a -m -f -i 123456789/99

...

Code Block
permission of bitstream with id 59d3ccc7-3ce4-472f-98b5-f74b68fef9d8 would be updated to audience Public start date 2016-06-10
permission of bitstream with id d94f0a08-b1b6-4354-897a-afe1f14375e2 would be updated to audience Public start date 2016-06-10
pii for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f would be removed
metadata dc.identifier would be updated with value DOI:10.1016/j.jnutbio.2015.02.010 for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f
metadata elsevier.identifier.eid would be updated with value 1-s2.0-S0955286315000716 for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f
metadata dc.identifier would be updated with value 8 for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f
metadata dc.format.extent would be updated with value 817 for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f
metadata dc.date.available would be updated with value August 2015 for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f
metadata dc.rights would be updated with value for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f
metadata dc.type would be updated with value Research Article for item with id 2fea9fff-90b0-46f7-9429-064bf35d173f

To actually apply these changes to the item, leave out the -t option.