...

  • DSpace's "Batch Metadata Editing" tool allows you to export sets of DSpace Item Metadata (all Items or just those in specific Communities/Collections) into a CSV file.  The metadata values/fields in the CSV file can then be edited using Microsoft Excel (or OpenOffice Calc or LibreOffice Calc).  Once editing is complete, you can re-import the modified CSV to apply the metadata changes into DSpace.
  • Step-by-step Tutorials:
  • How-To: Batch Metadata Editing basics
    • Batch Metadata Editing can be performed at a Community or Collection level
    • Browse to a specific Community or Collection. The "Context" menu will display an option for "Export Metadata".
    • Click "Export Metadata".  This will generate a CSV file that contains all the metadata for every Item within that Community or Collection hierarchy. 
      • WARNING: For extremely large communities or collections, the export (and import) processes may take a long time or significantly slow down your site. Therefore, DSpaceDirect currently allows you to modify only 500 Items (i.e. lines in the CSV) at a time.  This 500-Item limit can be increased as needed, but doing so is not recommended, as it can cause performance issues with your site when using these tools.
    • Edit the CSV using Microsoft Excel, OpenOffice Calc, or LibreOffice Calc
      • More information on the CSV / Spreadsheet format is available in the DSpace Documentation section on Batch Metadata Editing
      • EXCEL WARNING: By default, Excel does not open a CSV file as Unicode/UTF-8. As a result, special characters may display incorrectly and may be corrupted when the CSV is re-imported.
        • You need to tell Excel this CSV is Unicode, by importing it as follows:
          • Open Excel (and create an empty sheet, if one doesn't open by default)
          • Select "Data" tab
          • Click "From Text" button (in the "External Data" section)
          • Select your CSV file
          • Wizard Step 1
            • Choose "Delimited" option
            • In the "File origin" selectbox, select "65001 : Unicode (UTF-8)"
              • NOTE: these encoding options are sorted alphabetically, so "Unicode (UTF-8)" appears near the bottom of the list.
            • Click Next
          • Wizard Step 2
            • Select "Comma" as the only delimiter
            • Click Next
          • Wizard Step 3
            • Select "Text" as the "Column data format" (Unfortunately, this must be done for each column individually in Excel)
              • At a minimum, you MUST ensure all date columns (e.g. dc.date.issued) are treated as "Text" so that Excel doesn't autoconvert DSpace's YYYY-MM-DD format into MM/DD/YYYY
              • To avoid such autoconversion, it is safest to ensure each column is treated as "Text".  Unfortunately, this means selecting each column one-by-one and choosing "Text" as the "Column data format".
            • Click Finish
    • Perform your edits. Once finished, re-upload the changes to DSpace.
      • You can remove entire columns from the spreadsheet to make it easier to concentrate on editing just a few metadata fields. However, the 'id' (first) column MUST be kept.
        • Removing an entire column will not delete that metadata (DSpace will simply ignore it). However, be careful to remove the ENTIRE column, including the column header. Metadata values are deleted only if you leave the column header in place but clear out one or more values (rows in that column).
      • Some metadata fields may appear duplicated with ISO language tags within the spreadsheet (e.g. "dc.subject" and "dc.subject[en_US]" columns). This is nothing to be concerned about; it simply means that some of your metadata values specify a language and others do not.
        • For example, a "dc.subject" column would include subjects with no language specified; whereas, a "dc.subject[en_US]" column would include subjects with USA English specified as the language, and a "dc.subject[es]" column would include subjects with Spanish specified as the language.
        • You are welcome to move values between these columns.  Moving a value from "dc.subject" to "dc.subject[en_US]" and saving would update that value to include a language specifier of USA English. Similarly, moving a value from "dc.subject[en_US]" to "dc.subject" would update that value to include no language specifier.
      • Many more editing tips are available via the tutorials linked above.
    • Click "Import Metadata" (under "Administrative" menu)
    • Select the CSV
    • Review the changes, then save them.
      • WARNING: Please make certain that the changes displayed on the Review screen look correct.  Once you save, you will be unable to "undo" the changes without either re-editing the metadata (or if you deleted something entirely it may need to
    • NOTE: A more detailed walkthrough with screenshots of this entire process (and additional hints) is available in the Batch Metadata Editing tutorials linked above
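
If you would rather prepare the CSV with a script than with a spreadsheet, the editing rules above can be sketched in Python. This is only an illustrative sketch, not part of DSpace: the function names are hypothetical, the "||" multi-value separator is an assumption based on DSpace's default CSV format, and the 500-row chunk size simply mirrors the DSpaceDirect limit mentioned above. Python's csv module keeps every cell as text, so dates in YYYY-MM-DD form are left untouched.

```python
import csv
import io

MAX_ROWS = 500      # DSpaceDirect's per-import limit described above
SEPARATOR = "||"    # assumption: DSpace's default multi-value separator


def read_export(text):
    """Parse exported CSV text. Every cell stays a string, so a date such
    as 2015-03-01 is never reformatted."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader.fieldnames), list(reader)


def keep_columns(header, rows, wanted):
    """Drop whole columns (header included), always keeping 'id'."""
    if "id" not in header:
        raise ValueError("the 'id' column must be present")
    kept = [c for c in header if c == "id" or c in wanted]
    return kept, [{c: row[c] for c in kept} for row in rows]


def move_values(header, rows, src, dst):
    """Move every value from column src to column dst, for example from
    'dc.subject' to 'dc.subject[en_US]'. The src header stays in place with
    empty cells, so the values leave the src field on import."""
    if src not in header:
        raise KeyError(src)
    new_header = header if dst in header else header + [dst]
    for row in rows:
        pair = (row.get(dst), row.get(src))
        row[dst] = SEPARATOR.join(v for v in pair if v)
        row[src] = ""
    return new_header, rows


def write_chunks(header, rows, size=MAX_ROWS):
    """Return a list of CSV texts, each holding at most `size` Items."""
    out = []
    for start in range(0, len(rows), size):
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=header,
                                lineterminator="\n", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows[start:start + size])
        out.append(buf.getvalue())
    return out
```

To use this with real files, read the export with `open(path, encoding="utf-8", newline="")` and write each chunk back with the same encoding, so that special characters survive the round trip. Always check the result on the Review screen before saving.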

Bulk/Batch Content Uploads

  • DSpace allows you to batch upload content and metadata in a specific Zip package format that DSpace calls the Simple Archive Format (SAF).
  • How-to: Create an upload package
  • How-to: Upload the SAF package to DSpace
    • Keep in mind that large packages (over several GB in size) may prove difficult to upload via the web, as they may time out during upload or processing. Therefore, with DSpaceDirect, if you have a larger set of content to upload, consider creating several separate upload packages and uploading them individually.
    • Step-by-step upload instructions are available in the DSpace documentation at UI Batch Import (XMLUI). (Note: All DSpaceDirect sites use the DSpace XMLUI user interface.)
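
As a rough illustration of building packages and splitting a large upload into several smaller ones, the Python sketch below assembles Zip files in memory. It is not an official DSpace tool. The layout it produces (one folder per item containing `dublin_core.xml`, a `contents` file listing the item's files, and the files themselves) reflects the Simple Archive Format as we understand it and should be checked against the DSpace documentation. The function names, the `item_000` folder naming, and the size threshold are all hypothetical.

```python
import io
import zipfile
from xml.sax.saxutils import escape, quoteattr


def dublin_core_xml(fields):
    """Render (element, qualifier, value) tuples as a dublin_core.xml body.
    A missing qualifier is written as "none" (assumed SAF convention)."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<dublin_core>"]
    for element, qualifier, value in fields:
        lines.append(
            "  <dcvalue element=%s qualifier=%s>%s</dcvalue>"
            % (quoteattr(element), quoteattr(qualifier or "none"), escape(value))
        )
    lines.append("</dublin_core>")
    return "\n".join(lines) + "\n"


def build_package(items):
    """Build one Zip package. Each item is a dict with 'fields' (metadata
    tuples) and 'files' (a mapping of file name to bytes). Returns Zip bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for n, item in enumerate(items):
            folder = "item_%03d/" % n
            zf.writestr(folder + "dublin_core.xml", dublin_core_xml(item["fields"]))
            # The contents file lists one file name per line.
            zf.writestr(folder + "contents",
                        "".join(name + "\n" for name in item["files"]))
            for name, data in item["files"].items():
                zf.writestr(folder + name, data)
    return buf.getvalue()


def split_items(items, max_bytes):
    """Group items so each group's uncompressed file size stays at or under
    max_bytes (a single oversized item still gets its own group)."""
    groups, current, size = [], [], 0
    for item in items:
        item_size = sum(len(data) for data in item["files"].values())
        if current and size + item_size > max_bytes:
            groups.append(current)
            current, size = [], 0
        current.append(item)
        size += item_size
    if current:
        groups.append(current)
    return groups
```

Each group returned by `split_items` can be passed to `build_package` and uploaded as its own package, which keeps individual uploads small enough to avoid timeouts.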

Individual Item Permission Changes

...