Overview

This procedure describes how a system administrator can install the DSR Reproducible Harvest application on a Debian Lenny operating system. The utility must be installed on the same server as the Harvester in order to function. The procedure also covers configuring the Harvester to allow DSR harvesting.

Installation

Prerequisites

  • Debian Lenny Operating System (Linux OS)
  • VIVO 1.2.x installed and configured to use MySQL as a back end
  • MySQL user configured properly for localhost access
  • Harvester 1.2.x installed and configured to communicate with the VIVO database, with a DSR harvest configuration file in place within the Harvester
  • Mail transfer agent configured for outbound mail
  • People already loaded into VIVO, and no grants present
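
Some of these prerequisites can be spot-checked from the shell before starting. The following is a minimal sketch and not part of the DSR package: missing_commands is a hypothetical helper, and the command names in the example call (mysql, wget, tar, sendmail) are assumptions about what a suitably prepared host would provide.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Example call (the command list is an assumption; adjust for your host):
#   missing_commands mysql wget tar sendmail
```

An empty result means every listed command was found. It does not confirm that VIVO, the Harvester, or the mail transfer agent are configured correctly.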

Harvester Configuration

  • Ensure that the settings in /usr/share/vivo/harvester/example-scripts/example-dsr/vivo.model.xml match the settings in VIVO's deploy.properties (see VIVO Installation Documentation).
  • nano ../harvester/example-scripts/example-dsr/jdbcfetch.config.xml
  • Set the connection URL: jdbc:jtds:sqlserver://IPADDRESS:PORT/DATABASENAME
  • Set the database user name: USERNAME
  • Set the database password: PASSWORD
  • Note: The example DSR Harvester configuration uses the names of views that currently exist in the UF DSR database. If these views change, the XSL file /usr/share/vivo/harvester/example-scripts/example-dsr/dsr-to-vivo.xsl must be modified to match.
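
To make the shape of the connection URL concrete, here is a small sketch that assembles it from its parts. The helper name jdbc_url is hypothetical, and the host 192.0.2.10 and port 1433 are placeholder values, not the real DSR server details.

```shell
# Hypothetical helper: assemble a jTDS SQL Server URL from host, port, database.
jdbc_url() {
  host=$1 port=$2 database=$3
  case "$port" in
    ''|*[!0-9]*) echo "port must be numeric: $port" >&2; return 1 ;;
  esac
  printf 'jdbc:jtds:sqlserver://%s:%s/%s\n' "$host" "$port" "$database"
}

# Placeholder values only; substitute the real DSR server details.
jdbc_url 192.0.2.10 1433 DATABASENAME
# prints jdbc:jtds:sqlserver://192.0.2.10:1433/DATABASENAME
```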

Download DSR Reproducible Harvest

  • sudo mkdir /usr/share/vivo/vivo-auto-harvest
  • cd /usr/share/vivo/vivo-auto-harvest
  • sudo wget http://downloads.sourceforge.net/project/vivo/VIVO%20Harvester/Example%20Harvest%20Script/dsr.tar.gz
  • sudo tar -zxvf dsr.tar.gz
  • sudo rm -f dsr.tar.gz
  • sudo chmod 775 /usr/share/vivo/vivo-auto-harvest/dsr/bin/dsr.sh (ensures root can execute the shell script)
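
As an optional safeguard, the archive can be inspected before it is extracted. This is a sketch only: archive_has_dsr_script is a hypothetical helper. The path it looks for (dsr/bin/dsr.sh) is inferred from the chmod step above and has not been checked against the real archive.

```shell
# Hypothetical helper: succeed only if the archive listing contains dsr/bin/dsr.sh.
archive_has_dsr_script() {
  tar -tzf "$1" 2>/dev/null | grep -E '^(\./)?dsr/bin/dsr\.sh$' >/dev/null
}

# Example use before extracting:
#   archive_has_dsr_script dsr.tar.gz && sudo tar -zxvf dsr.tar.gz
```

A failed check usually points to a truncated or incorrect download, in which case the wget step can be repeated.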

Configure dsr.sh

  • sudo nano dsr/bin/dsr.sh
  • EMAIL_RECIPIENT=your_email_address@example.com
  • HARVESTER_INSTALL_DIR="/usr/share/vivo/harvester"
  • Configure Indexing
    • In the indexing section, enter the local admin login credentials for the web interface under ADMINNAME and ADMINPASS.
    • Enter the base address of the local VIVO installation into VIVOBASE. Be sure to include the trailing /.
  • Ctrl-O (writes out the edited script)
  • Ctrl-X (exits the nano editor)
  • Note: Ensure that grants have been removed from VIVO; this should be the first time you run this harvest. An important feature of the Harvester is that a file from a previous run is saved and used in subsequent harvests. Without this file present, it is likely that duplicates will be created. Make a backup before running this harvest in case there is a problem with the data. People must also already have been harvested into your VIVO implementation. Without people, there is no one to link the harvested grants and contracts to, and the result would be stubs.
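
The settings named in the steps above might look roughly like the following once edited. This is a sketch under assumptions: the actual layout of dsr.sh is not shown in this guide, the ADMINNAME, ADMINPASS, and VIVOBASE values are placeholders, and has_trailing_slash is a hypothetical check that is not part of dsr.sh.

```shell
# Variable names come from the steps above; credentials and URL are placeholders.
EMAIL_RECIPIENT="your_email_address@example.com"
HARVESTER_INSTALL_DIR="/usr/share/vivo/harvester"
ADMINNAME="admin_login"                   # placeholder
ADMINPASS="admin_password"                # placeholder
VIVOBASE="http://localhost:8080/vivo/"    # placeholder; note the trailing /

# Hypothetical check: succeed only if the value ends with a slash.
has_trailing_slash() {
  case "$1" in
    */) return 0 ;;
    *)  return 1 ;;
  esac
}

has_trailing_slash "$VIVOBASE" || echo "VIVOBASE needs a trailing /" >&2
```

Shell assignments take no spaces around the = sign, and quoted values need plain straight quotes.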

Harvest the data by executing the DSR shell script

  • cd /usr/share/vivo/vivo-auto-harvest/dsr/bin
  • ./dsr.sh
  • Wait for console output to state “End DSR Run” (This may take several hours depending on the size of the harvest)
  • Wait for the email log shortly thereafter
  • Review data in VIVO web application
  • Review Harvester log file in /usr/share/vivo/harvester/logs/dsr.DATETIME.harvester.1.1.x.log
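
Because a run may take hours, it can help to capture the console output to a file and check it for the end marker afterwards. This is a minimal sketch: harvest_finished is a hypothetical helper, run.out is an arbitrary file name, and it assumes the marker appears in the output exactly as "End DSR Run".

```shell
# Hypothetical helper: succeed if the captured output contains the end marker.
harvest_finished() {
  grep -F "End DSR Run" "$1" >/dev/null 2>&1
}

# Example use (run.out is an arbitrary file name):
#   ./dsr.sh 2>&1 | tee run.out
#   harvest_finished run.out && echo "harvest completed"
```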

Watch logs

  • tail -f /usr/share/vivo/vivo-auto-harvest/dsr/log/dsr... (append the date of your log file)
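
If you do not want to type the date by hand, the most recently modified log can be picked out first. This is a sketch under assumptions: latest_log is a hypothetical helper, and the dsr* pattern assumes only that the log file names begin with "dsr", since the full naming scheme is not given here.

```shell
# Hypothetical helper: print the most recently modified file matching DIR/dsr*.
latest_log() {
  newest=""
  for f in "$1"/dsr*; do
    [ -f "$f" ] || continue
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f
    fi
  done
  if [ -n "$newest" ]; then
    printf '%s\n' "$newest"
  fi
}

# Example use:
#   tail -f "$(latest_log /usr/share/vivo/vivo-auto-harvest/dsr/log)"
```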

Schedule cron job in root’s crontab

  • Copy example from /usr/share/vivo/vivo-auto-harvest/dsr/bin/example.cron
  • Enter root’s crontab using sudo su -m
  • Paste the example cron entry into root’s crontab
  • After unpacking, a directory called “dsr” should be created inside the “vivo-auto-harvest” directory. Inside this directory are the following folders and files:
    • Directory Structure
      • ---- bin (folder, storage for scripts to be executed)
      • -------- dsr.sh (file, shell script executed)
      • -------- example.cron (example entry to be placed in root’s crontab)
      • ---- log (folder, storage for log output)
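
The directory structure listed above can be checked after unpacking with a short script. This is a sketch only: check_dsr_layout is a hypothetical helper, and it tests nothing beyond the files and folders named in the listing, plus the execute permission set earlier with chmod.

```shell
# Hypothetical helper: report anything missing from the unpacked dsr directory.
check_dsr_layout() {
  base=$1
  status=0
  if [ ! -f "$base/bin/dsr.sh" ]; then
    echo "missing: bin/dsr.sh"; status=1
  elif [ ! -x "$base/bin/dsr.sh" ]; then
    echo "not executable: bin/dsr.sh"; status=1
  fi
  [ -f "$base/bin/example.cron" ] || { echo "missing: bin/example.cron"; status=1; }
  [ -d "$base/log" ]              || { echo "missing: log/"; status=1; }
  return "$status"
}

# Example use:
#   check_dsr_layout /usr/share/vivo/vivo-auto-harvest/dsr
```

The function prints nothing and returns 0 when the layout matches the listing.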