Date & Time

This call is a Community Forum call: Sharing best practices and challenges in the use of existing DSpace features

Dial-in

We will use the international conference call dial-in. Please follow directions below.

Agenda

Community Forum Call: DSpace Importing and Bulk Metadata Editing

Sharing best practices, challenges, and questions

 

Preparing for the call

Bring the questions and comments you would like to discuss to the call, or add them to the comments of this meeting page.

If you can join the call, or are willing to comment on the topics submitted via the meeting page, please add your name, institution, and repository URL to the Call Attendees section below.

Meeting notes

Batch Metadata Editing

DSpace offers a default batch metadata editing feature which allows administrators to export metadata to a CSV file. This CSV file can be opened in a spreadsheet application, where the metadata can be altered. After editing, administrators save the result as a CSV file again and import it back into DSpace.
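The export, edit, re-import cycle described above can also be carried out with a script instead of a spreadsheet application. The following is only a minimal sketch: the column names (`id`, `collection`, `dc.title`), the sample row, and the helper `edit_metadata_csv` are assumptions made for illustration, not something discussed on the call.

```python
import csv
import io


def edit_metadata_csv(csv_text, column, old, new):
    """Return CSV text with `old` replaced by `new` in a single column.

    Hypothetical helper: all other columns and the header row are copied
    through unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = row[column].replace(old, new)
        writer.writerow(row)
    return out.getvalue()


# Assumed sample export with a typo in the title.
sample = 'id,collection,dc.title\n1,123456789/2,Anual report\n'
edited = edit_metadata_csv(sample, "dc.title", "Anual", "Annual")
print(edited)
```

The edited text would then be saved as a CSV file and imported back into DSpace in the usual way.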

Georgetown University created several tools as an extension of the standard DSpace batch editing functionality. These tools will become part of the DSpace 6 codebase.

Georgetown University created tools for:

UTF-8 encoding issue

When using the batch metadata functionality, metadata sometimes gets corrupted when the CSV file is opened in a spreadsheet application. This happens when some characters are not imported correctly as UTF-8, which results in an erroneous metadata value when the metadata is exported again as a CSV file, even if the metadata value was not altered.

According to DCAT this is due to a lack of correct encoding support by (certain) spreadsheet applications.
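One way to rule out this kind of corruption is to read and write the CSV file with an explicit encoding and to check that untouched values survive the round trip. The sketch below is an illustration under assumptions: the helper names, the file name `export.csv`, and the sample author value are hypothetical. It reads with Python's `utf-8-sig` codec, which decodes UTF-8 and drops a leading byte order mark if one is present, and writes plain UTF-8.

```python
import csv
import os
import tempfile


def read_rows(path):
    # "utf-8-sig" decodes UTF-8 and strips a leading byte order mark, if any.
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.reader(f))


def write_rows(path, rows):
    # Always write plain UTF-8, regardless of the platform default encoding.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# Assumed sample data containing a non-ASCII character.
rows = [["id", "dc.contributor.author"], ["1", "Brontë, Charlotte"]]
path = os.path.join(tempfile.mkdtemp(), "export.csv")

write_rows(path, rows)
round_tripped = read_rows(path)
print(round_tripped == rows)
```

If the comparison fails after the file has passed through a spreadsheet application, the differing cells show which values were altered by the encoding handling rather than by an editor.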

OpenRefine

Throughout the discussion, participants often mentioned OpenRefine (http://openrefine.org/) as a great application for editing CSV exports. The tool may be interesting enough to justify organizing a workshop on it. This workshop could be an extension of the OpenRefine workshop organized by Code4Lib. DCAT members who have more information on the Code4Lib OpenRefine workshop are invited to share their knowledge, or (links to) any related documents, in the comments.

DSpace Bulk ingest & export

Simple Archive Format

DSpace offers bulk ingest through the Simple Archive Format. This is an archive containing a directory for each item. Each item directory consists of a file containing the item's metadata, together with all of the item's bitstreams.
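The directory layout can be sketched with a short script. The file names `dublin_core.xml` and `contents`, as well as the `dcvalue` element layout, follow the DSpace documentation as far as the editor is aware and were not spelled out on the call; the helper `write_saf_item`, the directory name `item_000`, and the sample values are hypothetical. Treat this as an illustration rather than a complete packager.

```python
import os
import tempfile
from xml.sax.saxutils import escape, quoteattr


def write_saf_item(archive_dir, item_name, metadata, bitstreams):
    """Write one item directory (hypothetical helper).

    metadata:   list of (element, qualifier, value) tuples
    bitstreams: dict mapping file name -> file content as bytes
    """
    item_dir = os.path.join(archive_dir, item_name)
    os.makedirs(item_dir, exist_ok=True)

    # Metadata file: one dcvalue element per metadata value.
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<dublin_core>"]
    for element, qualifier, value in metadata:
        lines.append("  <dcvalue element=%s qualifier=%s>%s</dcvalue>" % (
            quoteattr(element), quoteattr(qualifier or "none"), escape(value)))
    lines.append("</dublin_core>")
    with open(os.path.join(item_dir, "dublin_core.xml"), "w",
              encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

    # Bitstreams, plus a contents file listing one file name per line.
    for name, data in bitstreams.items():
        with open(os.path.join(item_dir, name), "wb") as f:
            f.write(data)
    with open(os.path.join(item_dir, "contents"), "w", encoding="utf-8") as f:
        f.write("".join(name + "\n" for name in bitstreams))
    return item_dir


archive = tempfile.mkdtemp()
item_dir = write_saf_item(
    archive, "item_000",
    [("title", None, "Annual report"), ("date", "issued", "2015")],
    {"report.pdf": b"%PDF-1.4 placeholder"},
)
print(sorted(os.listdir(item_dir)))
```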

Exporting search results

DSpace 6 will come with new bulk export functionality: a tool that allows search results to be exported.

Blank spaces

Participants raised the concern that blank values may be introduced when a CSV file is exported from a spreadsheet application. Where the original CSV file had no value for a certain metadata field, the CSV file exported from the spreadsheet editor may contain a blank value. This should not be a problem, however, as it is unlikely that the DSpace batch metadata editing tool will insert a value for this blank when the CSV file is imported into DSpace.
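For anyone who wants to verify this on their own data before importing, a comparison of the original export with the edited file can list the cells that went from empty to whitespace-only. This is a minimal sketch under assumptions: the helper `find_introduced_blanks`, the column names, and the sample rows are hypothetical, and both files are assumed to contain the same rows in the same order.

```python
import csv
import io


def find_introduced_blanks(original_text, edited_text):
    """Return (row_number, column) pairs for cells that were empty in the
    original CSV but contain only whitespace in the edited CSV."""
    original = list(csv.DictReader(io.StringIO(original_text)))
    edited = list(csv.DictReader(io.StringIO(edited_text)))
    found = []
    for row_number, (old, new) in enumerate(zip(original, edited), start=1):
        for column, old_value in old.items():
            new_value = new.get(column)
            if old_value == "" and new_value and new_value.strip() == "":
                found.append((row_number, column))
    return found


# Assumed sample: the description cell is empty before editing and holds
# a single space afterwards.
original_csv = "id,dc.title,dc.description\n1,Report,\n"
edited_csv = "id,dc.title,dc.description\n1,Report, \n"
introduced = find_introduced_blanks(original_csv, edited_csv)
print(introduced)
```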

Call Attendees