Introduction

Archidora is the Archivematica-Islandora Integration Module. Archivematica provides a preservation system that the Archidora module integrates into Islandora.

It was developed in a partnership between Artefactual Systems and Discovery Garden, sponsored by the University of Saskatchewan Library.

About Archivematica

Archivematica is a free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects. It uses a micro-services design pattern to provide an integrated suite of software tools that allows users to process digital objects from ingest to access in compliance with the ISO OAIS functional model. Users monitor and control the micro-services via a web-based dashboard. Archivematica uses METS, PREMIS (events, agents, rights and restrictions), Dublin Core, the Library of Congress BagIt specification, and other best-practice standards to provide trustworthy, authentic, reliable, and interoperable Archival Information Packages (AIPs) for storage in your preferred repository.

Archivematica provides several decision points that give the user control over choices about format identification tools, printing the original order of the directories ingested, examining contents for private and personal information, extracting contents of packages and forensic images, transcribing content, and more. Users may also preconfigure most of these options for seamless ingest to archival storage and access. Archivematica offers many ingest workflows: metadata and submission documentation import, zipped and unzipped Bag ingest, digital forensic image processing, SIP arrangement, manual normalization, and dataset management.

You may read more about Archivematica at http://www.archivematica.org.

About Archidora

Download

Islandora module: https://github.com/Islandora-Labs/archidora

Archivematica: Archivematica 1.6.1 and Storage Service 0.10.0 or later are recommended; download from http://www.archivematica.org.

This integration is currently (as of the 1.6/0.10 release) considered a beta feature. Support for Archivematica and/or the Storage Service running on secure servers (HTTPS) will likely require Storage Service 0.11 or later.

Installation

Installation and testing are the same as for any Drupal module. Please see Installing the Islandora Enhancement Modules for details.

Configuration

In the Archivematica Storage Space:

Archivematica may also be configured to call back to Islandora to delete the high-res "OBJ" datastreams. 

Note: the OBJ datastreams are not deleted automatically; instead, they are listed at the collection level (or compound object level) on the Manage | Archivematica tab, where they can be deleted individually or in bulk. Note also that the callback does not currently work on objects whose access is restricted by an XACML policy.

On the Archivematica dashboard:

Storage Service - gunicorn settings

1. Add the line `env SS_GUNICORN_WORKER_CLASS=sync` to the Archivematica Storage Service (SS) service config file at `/etc/init/archivematica-storage-service.conf`.

2. Reload the config and restart the SS service:

    $ sudo initctl reload-configuration
    $ sudo service archivematica-storage-service restart

3. Check the SS logs. The last `Using worker` line should read `Using worker: sync` and NOT `Using worker: gevent`:

    $ sudo vi /var/log/upstart/archivematica-storage-service.log
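As an alternative to opening the log in an editor, the check in step 3 can be sketched as a small shell helper. This is only an illustration: the function names `last_worker_line` and `worker_is_sync` are made up for this example, and the default path is the log file named above.

```shell
# Print the most recent "Using worker" line from the Storage Service log.
# The default path is the log file from step 3; pass another path as $1.
last_worker_line() {
    log_file="${1:-/var/log/upstart/archivematica-storage-service.log}"
    grep 'Using worker' "$log_file" | tail -n 1
}

# Succeed only if the most recent worker line reports the sync worker.
worker_is_sync() {
    last_worker_line "$1" | grep -q 'Using worker: sync'
}
```

If the log is readable only by root, run the script that defines and calls these functions under `sudo`. A call such as `worker_is_sync && echo "sync worker in use"` prints the message only when the last matching line reports the sync worker.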

Archivematica automation tools:

In Islandora:

As a side effect of using Cron Queues, the submission of objects to Archivematica may not complete during any one invocation of Cron. It is also recommended that Cron run at reasonably frequent intervals (e.g. every five minutes); otherwise the expected callbacks may not be triggered often enough.
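One way to get a five-minute interval is a system crontab entry that runs Drupal cron through drush. The sketch below only prints such an entry. The helper name `drupal_cron_entry` and the path `/var/www/drupal7` are assumptions for illustration, not values taken from Archidora.

```shell
# Build a crontab line that runs Drupal cron via drush every five minutes.
# The Drupal root is a hypothetical default; pass your own path as $1.
drupal_cron_entry() {
    drupal_root="${1:-/var/www/drupal7}"
    printf '*/5 * * * * drush --root=%s cron\n' "$drupal_root"
}

# Print the entry so it can be reviewed before being added with `crontab -e`.
drupal_cron_entry /var/www/drupal7
```

The entry assumes `drush` is on the PATH of the user whose crontab it is added to. If it is not, use the full path to the drush executable.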

Batch processing

A sample drush script is available to ingest Islandora collections in batch (e.g. for objects created before Archidora was deployed on an Islandora instance).

Usage:

    $ sudo drush -u 1 archidora-send-collection-to-archivematica --target=islandora:collection1

or

    $ sudo drush -u 1 asca --target=islandora:collection1

Currently, it is not recursive (but an unmerged pull request adds this functionality). It also ignores the "Don't Archive Children" setting.
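Because the command is not recursive, one workaround is to call it once per collection. The following is a minimal sketch under stated assumptions: the function name `send_collections` and the `DRUSH` override variable are made up for this example, and the collection PIDs must be supplied by the caller.

```shell
# Send several collections to Archivematica, one drush call per collection.
# DRUSH may be set to override the command, e.g. DRUSH="sudo drush".
send_collections() {
    for pid in "$@"; do
        # Report a failed collection on stderr and continue with the next one.
        ${DRUSH:-drush} -u 1 asca --target="$pid" || echo "failed: $pid" >&2
    done
}
```

For example, `DRUSH="sudo drush" send_collections islandora:collection1 islandora:collection2` would issue the documented command once for each PID. The second PID is only a placeholder. Each call still ignores the "Don't Archive Children" setting, as noted above.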