...
| Traditional Backup & Restore (Database and Files) | AIP Backup & Restore | ||
---|---|---|---|---|
Supported Backup/Restore Types |
|
| ||
Can Backup & Restore all DSpace Content easily | Yes (Requires two backups/restores – one for Database and one for Files) | Yes (Though, will not backup/restore items which are not officially "in archive") | ||
Can Backup & Restore a Single Community/Collection/Item easily | No (It is possible, but requires a strong understanding of DSpace database structure & folder organization in order to only backup & restore metadata/files belonging to that single object) | Yes | ||
Backups can be used to move one or more Community/Collection/Items to another DSpace system easily. | No (Again, it is possible, but requires a strong understanding of DSpace database structure & folder organization in order to only move metadata/files belonging to that object) | Yes | ||
Supported Object Types During Backup & Restore |
|
| ||
Supports backup/restore of all Communities/Collections/Items (including metadata, files, logos, etc.) | Yes | Yes | ||
Supports backup/restore of all People/Groups/Permissions | Yes | Yes | ||
Supports backup/restore of all Collection-specific Item Templates | Yes | Yes | ||
Supports backup/restore of all Collection Harvesting settings (only for Collections which pull in all Items via OAI-PMH or OAI-ORE) | Yes | No (This is a known issue. All previously harvested Items will be restored, but the OAI-PMH/OAI-ORE harvesting settings will be lost during the restore process.) | ||
Supports backup/restore of all Withdrawn (but not deleted) Items | Yes | Yes | ||
Supports backup/restore of Item Mappings between Collections | Yes | Yes (During restore, the AIP Ingester may throw a false "Could not find a parent DSpaceObject" error (see Common Issues or Error Messages), if it tries to restore an Item Mapping to a Collection that it hasn't yet restored. But this error can be safely bypassed using the 'skipIfParentMissing' flag (see Additional Packager Options for more details).) | ||
Supports backup/restore of all in-process, uncompleted Submissions (or those currently in an approval workflow) | Yes | No (AIPs are only generated for objects which are completed and considered "in archive") | ||
Supports backup/restore of Items using custom Metadata Schemas & Fields | Yes | Yes (Custom Metadata Fields will be automatically recreated. Custom Metadata Schemas must be manually created first, in order for DSpace to be able to recreate custom fields belonging to that schema. See Common Issues or Error Messages for more details.) | ||
Supports backup/restore of all local DSpace Configurations and Customizations | Yes (if you back up your entire DSpace directory as part of backing up your files) | Not by default (unless you also back up parts of your DSpace directory; note that you wouldn't need to back up the '[dspace]/assetstore' folder again, as those files are already included in AIPs) | ||
Based on your local institution's needs, you will want to choose the backup & restore process which is most appropriate to you. You may also find it beneficial to use both types of backups on different time schedules, in order to minimize the likelihood of losing your DSpace installation settings or its contents. For example, you may choose to perform a Traditional Backup once per week (to back up your local system configurations and customizations) and an AIP Backup on a daily basis. Alternatively, you may choose to perform daily Traditional Backups and only use the AIP Backup as a "permanent archives" option (perhaps performed on a weekly or monthly basis).
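The first schedule described above (daily AIP Backup, weekly Traditional Backup) could be driven from a single daily job. The following is only a sketch: the function name `backups_for_day` is hypothetical, and picking Sunday for the weekly Traditional Backup is an assumption, not a DSpace recommendation.

```shell
#!/bin/sh
# Sketch of the first schedule described above: AIP backup daily,
# Traditional Backup once per week. The choice of Sunday is an assumption.
# usage: backups_for_day <iso-weekday>   (1 = Monday ... 7 = Sunday)
backups_for_day() {
    echo "aip"
    if [ "$1" -eq 7 ]; then
        echo "traditional"
    fi
}

# A daily cron job could call this and dispatch to site-specific scripts.
for kind in $(backups_for_day "$(date +%u)"); do
    echo "would run: $kind backup"
done
```

The loop only prints what it would do; the actual backup commands depend on how your local DSpace is managed.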
Note | ||
---|---|---|
| ||
If you choose to use the AIP Backup and Restore option, do not forget to also back up your local DSpace configurations and customizations. Depending on how you manage your own local DSpace, these configurations and customizations are likely in one or more of the following locations:
|
...
Note | ||
---|---|---|
| ||
This option allows you to essentially use an AIP as a SIP (Submission Information Package). The default settings will create a new DSpace object (with a new handle and a new parent object, if specified) from your AIP. |
To ingest a single AIP and create a new DSpace object under a parent of your choice, specify the -p (or --parent) package parameter to the command. Also, note that you are running the packager in -s (submit) mode.
NOTE: This only ingests the single AIP specified. It does not ingest all children objects.
Code Block |
---|
[dspace]/bin/dspace packager -s -t AIP -e <eperson> -p <parent-handle> <file-path>
|
If you leave out the -p parameter, the AIP package ingester will attempt to install the AIP under the same parent it had before. As you are also specifying the -s (submit) parameter, the packager will assume you want a new Handle to be assigned (as you are effectively specifying that you are submitting a new object). If you want the object to retain the Handle specified in the AIP, you can specify the -o ignoreHandle=false option to force the packager to not ignore the Handle specified in the AIP.
The Submission mode (-s) creates a new object and, by default, assigns it a new handle. In addition, by default it respects all existing Collection approval workflows (so items may require approval unless the workflow is skipped by using the -w option). For information about how the "Submission Mode" differs from the "Replace / Restore mode", see The difference between "Submit" and "Restore/Replace" modes above.
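As a rough illustration of how the -p and -o ignoreHandle=false choices combine in Submit mode, the sketch below prints the resulting packager command rather than running it. The helper name `submit_aip_cmd`, the `DSPACE_DIR` variable, the `/dspace` default path, and the file name `item-aip.zip` are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): prints the packager command for
# submitting a single AIP, so it can be reviewed before being run.
# DSPACE_DIR is an assumed install location; adjust to your [dspace] path.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

# usage: submit_aip_cmd <eperson> <aip-file> [parent-handle] [keep-handle: yes|no]
submit_aip_cmd() {
    eperson=$1
    aip_file=$2
    parent=${3:-}
    keep_handle=${4:-no}

    cmd="$DSPACE_DIR/bin/dspace packager -s -t AIP -e $eperson"
    # -p is optional: without it the ingester tries the AIP's original parent.
    if [ -n "$parent" ]; then
        cmd="$cmd -p $parent"
    fi
    # Submit mode assigns a new Handle unless ignoreHandle=false is passed.
    if [ "$keep_handle" = "yes" ]; then
        cmd="$cmd -o ignoreHandle=false"
    fi
    printf '%s\n' "$cmd $aip_file"
}

submit_aip_cmd admin@myu.edu item-aip.zip 4321/12
submit_aip_cmd admin@myu.edu item-aip.zip "" yes
```

Printing the command first makes it easy to confirm which parent and Handle behaviour you are about to request.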
Note | ||
---|---|---|
| ||
This option allows you to essentially use a set of AIPs as SIPs (Submission Information Packages). The default settings will create a new DSpace object (with a new handle and a new parent object, if specified) from each AIP. |
To ingest an AIP hierarchy from a directory of AIPs, use the -a (or --all) package parameter.
For example, use this 'packager' command template:
Code Block |
---|
[dspace]/bin/dspace packager -s -a -t AIP -e <eperson> -p <parent-handle> <file-path>
|
for example:
Code Block |
---|
[dspace]/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/12 aip4567.zip
|
The above command will ingest the package named "aip4567.zip" as a child of the specified Parent Object (handle="4321/12"). The resulting object is assigned a new Handle (since -s is specified). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
If you leave out the -p parameter, the AIP package ingester will attempt to install the AIP under the same parent it had before. As you are also specifying the -s (submit) parameter, the packager will assume you want a new Handle to be assigned (as you are effectively specifying that you are submitting a new object). If you want the object to retain the Handle specified in the AIP, you can specify the -o ignoreHandle=false option to force the packager to not ignore the Handle specified in the AIP.
Another example – Ingesting a Top-Level Community (by using the Site Handle, <site-handle-prefix>/0
):
Code Block |
---|
[dspace]/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/0 community-aip.zip
|
The above command will ingest the package named "community-aip.zip" as a top-level community (i.e. the specified parent is "4321/0" which is a Site Handle). Again, the resulting object is assigned a new Handle. In addition, any child AIPs referenced by "community-aip.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
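When several top-level Communities need to be submitted, the Site Handle pattern above can be applied to each one in turn. This is a minimal sketch that only prints the commands; the helper name `top_level_cmds`, the `DSPACE_DIR` variable, and the `/dspace` default path are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints one hierarchy-submit command per Community AIP,
# using the Site Handle (<prefix>/0) as parent so each becomes top-level.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"   # assumed install path

# usage: top_level_cmds <handle-prefix> <eperson> <community-aip>...
top_level_cmds() {
    site_handle="$1/0"
    eperson=$2
    shift 2
    for aip in "$@"; do
        printf '%s\n' "$DSPACE_DIR/bin/dspace packager -s -a -t AIP -e $eperson -p $site_handle $aip"
    done
}

top_level_cmds 4321 admin@myu.edu community-aip.zip
```

Pass only the top-level Community AIP files; child AIPs are picked up recursively through the -a flag, so listing them separately would ingest them a second time as top-level objects.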
Warning | ||
---|---|---|
| ||
Please note: If you are submitting a larger amount of content (e.g. multiple Communities/Collections) to your DSpace, you may want to tell the 'packager' command to skip over any existing Collection approval workflows by using the
|
Warning | ||
---|---|---|
| ||
Please note, if you are using AIPs to move an entire Community or Collection from one DSpace to another, there is a known issue (see DS-1105) that the new DSpace instance will be unable to (re-)create any DSpace Groups or EPeople which are referenced by a Community or Collection AIP. The reason is that the Community or Collection AIP itself doesn't contain enough information to create those Groups or EPeople (rather, that information is stored in the SITE AIP, for usage during Full Site Restores).
|
By default, the Submission mode (-s) always respects existing Collection approval workflows. So, if a Collection has a workflow, then a newly submitted Item will be placed into that workflow process (rather than immediately appearing in DSpace).
However, if you'd like to skip all workflow approval processes you can use the -w flag to do so. For example, the following command will skip any Collection approval workflows and immediately add the Item to a Collection.
Code Block |
---|
[dspace]/bin/dspace packager -s -w -t AIP -e <eperson> -p <parent-handle> <file-path> |
This -w flag may also be used when Submitting an AIP Hierarchy. For example, if you are migrating one or more Collections/Communities from one DSpace to another, you may choose to submit those AIPs with the -w option enabled. This will ensure that, if a Collection has a workflow approval process enabled, all its Items are available immediately rather than being all placed into the workflow approval process.
Restoring is slightly different than just submitting. When restoring, we make every attempt to restore the object as it used to be (including its handle, parent object, etc.). For more information about how the "Replace/Restore Mode" differs from the "Submit mode", see The difference between "Submit" and "Restore/Replace" modes above.
There are currently three restore modes:
...
Note | ||
---|---|---|
| ||
|
Info | ||
---|---|---|
| ||
|
...
When the "Force Replace" flag (-f option) is specified, the restore will overwrite any objects found to already exist in DSpace. In other words, existing content is deleted and then replaced by the contents of the AIP(s).
Info | ||
---|---|---|
| ||
This mode may also be used to restore missing objects which refer to existing objects. For example, if you are restoring a missing Collection which had existing Items linked to it, you can use this mode to auto-restore the Collection and update those existing Items so that they again link back to the newly restored Collection. |
Warning | ||
---|---|---|
| ||
Because this mode actually destroys existing content in DSpace, it is potentially dangerous and may result in data loss! You may wish to perform a secondary full backup (assetstore files & database) before attempting to replace any existing object(s) in DSpace. |
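One way to follow the advice above is to script the backup and the replace as a single plan. The sketch below only prints the steps. It assumes a PostgreSQL database named "dspace", an install directory of `/dspace`, a `/backup` target directory, and that `-r` selects restore mode (that flag is not shown in this excerpt); the helper name `force_replace_plan` is hypothetical.

```shell
#!/bin/sh
# Sketch only: prints a backup-then-replace plan instead of running it.
# Assumptions: PostgreSQL database named "dspace", install dir /dspace,
# backups under /backup, and "-r" as the restore-mode flag.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"
BACKUP_DIR="${BACKUP_DIR:-/backup}"
DB_NAME="${DB_NAME:-dspace}"

# usage: force_replace_plan <eperson> <aip-file> <date-stamp>
force_replace_plan() {
    eperson=$1
    aip_file=$2
    stamp=$3
    # 1. Secondary full backup: database dump plus assetstore files.
    printf '%s\n' "pg_dump -f $BACKUP_DIR/$DB_NAME-$stamp.sql $DB_NAME"
    printf '%s\n' "tar -czf $BACKUP_DIR/assetstore-$stamp.tar.gz -C $DSPACE_DIR assetstore"
    # 2. Only then overwrite existing objects from the AIP (-f = Force Replace).
    printf '%s\n' "$DSPACE_DIR/bin/dspace packager -r -f -t AIP -e $eperson $aip_file"
}

force_replace_plan admin@myu.edu collection-aip.zip "$(date +%Y%m%d)"
```

Reviewing the printed plan before executing it gives one last chance to confirm that the destructive step targets the intended AIP.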
...
Note | ||
---|---|---|
| ||
|
...
Option | Ingest or Export | Default Value | Description |
---|---|---|---|
| ingest-only | true | Tells the AIP ingester to automatically create any metadata fields which are found to be missing from the DSpace Metadata Registry. When 'true', this means as each AIP is ingested, new fields may be added to the DSpace Metadata Registry if they don't already exist. When 'false', an AIP ingest will fail if it encounters a metadata field that doesn't exist in the DSpace Metadata Registry. (NOTE: This will not create missing DSpace Metadata Schemas. If a schema is found to be missing, the ingest will always fail.) |
| export-only | defaults to exporting all Bundles | This option can be used to limit the Bundles which are exported to AIPs for each DSpace Item. By default, all file Bundles will be exported into Item AIPs. You could use this option to limit the size of AIPs by only exporting certain Bundles. WARNING: any bundles not included in AIPs will obviously be unable to be restored. This option can be run in two ways:
|
| ingest-only | Restore/Replace Mode defaults to 'false', | If 'true', the AIP ingester will ignore any Handle specified in the AIP itself, and instead create a new Handle during the ingest process (this is the default when running in Submit mode, using the |
| ingest-only | Restore/Replace Mode defaults to 'false', | If 'true', the AIP ingester will ignore any Parent object specified in the AIP itself, and instead ingest under a new Parent object (this is the default when running in Submit mode, using the |
| export-only | defaults to "all" | This option can be used to limit the Bundles which are exported to AIPs for each DSpace Item. By default, all file Bundles will be exported into Item AIPs. You could use this option to limit the size of AIPs by only exporting certain Bundles. WARNING: any bundles not included in AIPs will obviously be unable to be restored. This option expects a comma separated list of bundle names (e.g. "ORIGINAL,LICENSE,CC_LICENSE,METADATA"), or "all" if all bundles should be included. |
| both | false | If 'true', the AIP Disseminator will export an AIP which only consists of the METS Manifest file (i.e. result will be a single 'mets.xml' file). This METS Manifest contains URI references to all content files, but does not contain any content files. This option is experimental, and should never be set to 'true' if you want to be able to restore content files. |
| export-only | false | If 'true' (and the 'DSPACE-ROLES' crosswalk is enabled, see #AIP Metadata Dissemination Configurations), then the AIP Disseminator will export user password hashes (i.e. encrypted passwords) into the Site AIP's METS Manifest. This would allow you to restore users' passwords from the Site AIP. If 'false', then user password hashes are not stored in the Site AIP, and passwords cannot be restored at a later time. |
| import-only | false | If 'true', ingestion will skip over any "Could not find a parent DSpaceObject" errors that are encountered during the ingestion process (Note: those errors will still be logged as "warning" messages in your DSpace log file). If you are performing a full site restore (or a restore of a larger Community/Collection hierarchy), you may encounter these errors if you have a larger number of Item mappings between Collections (i.e. Items which are mapped into several collections at once). When you are performing a recursive ingest, skipping these errors should not cause any problems. Once the missing parent object is ingested it will automatically restore the Item mapping that caused the error. For more information on this "Could not find a parent DSpaceObject" error see Common Issues or Error Messages. |
| export-only | unspecified | If 'skip', the AIP Disseminator will skip over any unauthorized Bundle or Bitstream encountered (i.e. it will not be added to the AIP). If 'zero', the AIP Disseminator will add a Zero-length "placeholder" file to the AIP when it encounters an unauthorized Bitstream. If unspecified (the default value), the AIP Disseminator will throw an error if an unauthorized Bundle or Bitstream is encountered. |
| export-only | unspecified | This option works as a basic form of "incremental backup". This option requires that an ISO-8601 date is specified. When specified, the AIP Disseminator will only export Item AIPs which have a last-modified date after the specified ISO-8601 date. This option has no effect on the export of Site, Community or Collection AIPs as DSpace does not record a last-modified date for Sites, Communities or Collections. For example, when this option is specified during a full-site export, the AIP Disseminator will export the Site AIP, all Community AIPs, all Collection AIPs, and only Item AIPs modified after that date and time. |
| both | Export defaults to 'true', | If 'true', every METS file in AIP will be validated before ingesting or exporting. By default, DSpace will validate everything on export, but will skip validation during import. Validation on export will ensure that all exported AIPs properly conform to the METS profile (and will throw errors if any do not). Validation on import will ensure every METS file in every AIP is first validated before importing into DSpace (this will cause the ingestion processing to take longer, but tips on speeding it up can be found in the "AIP Configurations To Improve Ingestion Speed while Validating" section below). DSpace recommends minimally validating AIPs on export. Ideally, you should validate both on export and import, but import validation is disabled by default in order to increase the speed of AIP restores. |
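Options from the table above are passed to the packager as `-o <option>=<value>` arguments. As a small sketch, the hypothetical helper below (`packager_opts` is not a DSpace command) turns a list of option pairs into the repeated `-o` arguments; the two option names used are the ones mentioned elsewhere on this page.

```shell
#!/bin/sh
# Hypothetical helper: turns key=value pairs into repeated "-o" arguments.
# usage: packager_opts key=value [key=value ...]
packager_opts() {
    opts=""
    for pair in "$@"; do
        case $pair in
            *=*) opts="$opts -o $pair" ;;
            *)   echo "skipping malformed option: $pair" >&2 ;;
        esac
    done
    # Strip the leading space before printing.
    printf '%s\n' "${opts# }"
}

packager_opts skipIfParentMissing=true ignoreHandle=false
```

The printed string can be placed before the file path in a packager command line.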
...
The following configurations allow you to specify what metadata is stored within each METS-based AIP. In 'dspace.cfg', the general format for each of these settings is:
...
aip.disseminate.<setting> = <mdType>:<DSpace-crosswalk-name> [, ...]
<label-for-METS>:<DSpace-crosswalk-name> may be specified for each setting
...
mets.xsd.<abbreviation> = <namespace> <local-file-name>
- <abbreviation> is a unique abbreviation (of your choice) for this schema
- <namespace> is the Schema namespace
- <local-file-name> is the full name of the cached schema file (which should reside in your [dspace]/config/schemas/ directory; by default this directory does not exist, so you will need to create it)
...
The default settings are all commented out, but they provide a full listing of all schemas currently used during validation of AIPs. In order to utilize them, uncomment the settings, download the appropriate schema file, and save it to your [dspace]/config/schemas/ directory (by default this directory does not exist, so you will need to create it) using the specified file name:
Code Block |
---|
#mets.xsd.mets = http://www.loc.gov/METS/ mets.xsd
#mets.xsd.xlink = http://www.w3.org/1999/xlink xlink.xsd
#mets.xsd.mods = http://www.loc.gov/mods/v3 mods.xsd
#mets.xsd.xml = http://www.w3.org/XML/1998/namespace xml.xsd
#mets.xsd.dc = http://purl.org/dc/elements/1.1/ dc.xsd
#mets.xsd.dcterms = http://purl.org/dc/terms/ dcterms.xsd
#mets.xsd.premis = http://www.loc.gov/standards/premis PREMIS.xsd
#mets.xsd.premisObject = http://www.loc.gov/standards/premis PREMIS-Object.xsd
#mets.xsd.premisEvent = http://www.loc.gov/standards/premis PREMIS-Event.xsd
#mets.xsd.premisAgent = http://www.loc.gov/standards/premis PREMIS-Agent.xsd
#mets.xsd.premisRights = http://www.loc.gov/standards/premis PREMIS-Rights.xsd
|
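After uncommenting some of these settings, it can be useful to check which of the named schema files are still missing from the local cache directory. The following is a minimal sketch: the helper name `missing_schemas` is hypothetical, and the demo runs against a throwaway config file rather than a real DSpace configuration.

```shell
#!/bin/sh
# Sketch: list the schema files named in uncommented mets.xsd.* settings
# that are not yet present in the local schema cache directory.
# usage: missing_schemas <config-file> <schemas-dir>
missing_schemas() {
    cfg=$1
    dir=$2
    # Setting format: mets.xsd.<abbreviation> = <namespace> <local-file-name>
    grep '^mets\.xsd\.' "$cfg" | while read -r key eq namespace file; do
        [ -f "$dir/$file" ] || printf '%s\n' "$file"
    done
}

# Demo against a throwaway config and an empty cache directory.
demo_dir=$(mktemp -d)
printf '%s\n' '#mets.xsd.xlink = http://www.w3.org/1999/xlink xlink.xsd' \
              'mets.xsd.mets = http://www.loc.gov/METS/ mets.xsd' > "$demo_dir/demo.cfg"
missing_schemas "$demo_dir/demo.cfg" "$demo_dir"
rm -rf "$demo_dir"
```

Commented-out settings are ignored, so only schemas you have actually enabled are reported.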
...
Issue / Error Message | How to Fix this Problem |
---|---|
Ingest/Restore Error: "Group Administrator already exists" | If you receive this problem, you are likely attempting to Restore an Entire Site, but are not running the command in Force Replace Mode ( |
Ingest/Restore Error: "Unknown Metadata Schema encountered (mycustomschema)" | If you receive this problem, one or more of your Items is using a custom metadata schema which DSpace is currently not aware of (in the example, the schema is named "mycustomschema"). Because DSpace AIPs do not contain enough details to recreate the missing Metadata Schema, you must create it manually via the DSpace Admin UI. Please note that you only need to create the Schema. You do not need to manually create all the fields belonging to that schema, as DSpace will do that for you as it restores each AIP. Once the schema is created in DSpace, re-run your restore command. DSpace will automatically re-create all fields belonging to that custom metadata schema as it restores each Item that uses that schema. |
Ingest Error: "Could not find a parent DSpaceObject referenced as 'xxx/xxx'" | When you encounter this error message it means that an object could not be ingested/restored as it belongs to a parent object which doesn't currently exist in your DSpace instance. During a full restore process, this error can be skipped over and treated as a warning by specifying the 'skipIfParentMissing=true' option (see Additional Packager Options). If you have a larger number of Items which are mapped to multiple Collections, the AIP Ingester will sometimes attempt to restore an item mapping before the Collection itself has been restored (thus throwing this error). Luckily, this is not anything to be concerned about. As soon as the Collection is restored, the Item Mapping which caused the error will also be automatically restored. So, if you encounter this error during a full restore, it is safe to bypass this error message using the 'skipIfParentMissing=true' option. All your Item Mappings should still be restored correctly. |
Submit Error: PSQLException: ERROR: duplicate key value violates unique constraint "handle_handle_key" | This error means that while submitting one or more AIPs, DSpace encountered a Handle conflict. This is a general error that may occur in DSpace if your Handle sequence has somehow become out-of-date. However, it's easy to fix. Just run the |