DSpace provides a batch metadata editing tool. The batch editing tool is able to produce a comma-delimited file in the CSV format, and facilitates the user in performing the following:
For information about configuration options for the Batch Metadata Editing tool, see Batch Metadata Editing Configuration.
Batch metadata exports (to CSV) can be performed from the Administrative menu:
Please see the documentation below for more information on the CSV format and the actions that can be performed by editing the CSV.
The following table summarizes the basics.
Command used: | [dspace]/bin/dspace metadata-export |
Java class: | org.dspace.app.bulkedit.MetadataExport |
Arguments (short and long forms): | Description |
-f (--file) | Required. The filename of the resulting CSV. |
-i (--id) | The Item, Collection, or Community handle or Database ID to export. If not specified, all items will be exported. |
-a (--all) | Include all the metadata fields that are not normally changed (e.g. provenance), or those fields you configured to be ignored in modules/bulkedit.cfg. |
-h (--help) | Display the help page. |
To run the batch editing exporter, at the command line:
[dspace]/bin/dspace metadata-export -f name_of_file.csv -i 1023/24
Example:
[dspace]/bin/dspace metadata-export -f /batch_export/col_14.csv -i /1989.1/24
In the above example we have requested that the collection assigned handle '1989.1/24' be exported in its entirety to the file 'col_14.csv' in the '/batch_export' directory.
Please see the documentation below for more information on the CSV format and the actions that can be performed by editing the CSV.
Importing large CSV files
It is not recommended to import CSV files of more than 1,000 lines (i.e. 1,000 items). When importing files larger than this, it may be difficult for an Administrator to accurately verify the changes that the import tool states it will make. In addition, depending on the memory available to the DSpace site, large files may cause 'Out Of Memory' errors part way through the import process.
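One way to stay under that limit is to split a large export into smaller files before importing, repeating the heading row in each part. A minimal Python sketch (the chunk size and output naming here are illustrative, not part of DSpace):

```python
import csv
import itertools

def split_csv(path, chunk_size=1000):
    """Split a metadata CSV into parts of at most chunk_size item rows,
    repeating the heading row in each part so every file imports on its own."""
    parts = []
    with open(path, newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        header = next(reader)
        for n in itertools.count(1):
            rows = list(itertools.islice(reader, chunk_size))
            if not rows:
                break
            part = f"{path}.part{n}.csv"
            with open(part, 'w', newline='', encoding='utf-8') as out:
                writer = csv.writer(out)
                writer.writerow(header)   # each part keeps the metadata headings
                writer.writerows(rows)
            parts.append(part)
    return parts
```

Each part can then be imported (and its change summary verified) separately.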
Batch metadata imports (from CSV) can be performed from the Administrative menu:
The following table summarizes the basics.
Command used: | [dspace]/bin/dspace metadata-import |
Java class: | org.dspace.app.bulkedit.MetadataImport |
Arguments (short and long forms): | Description |
-f (--file) | Required. The filename of the CSV file to load. |
-s (--silent) | Silent mode. The import function does not prompt you to make sure you wish to make the changes. |
-e (--email) | The email address of the user. This is only required when adding new items. |
-w (--workflow) | When adding new items, the program will queue the items up to use the Collection Workflow processes. |
-n (--notify) | When adding new items using a workflow, send notification emails. |
-t (--template) | When adding new items, use the Collection template, if it exists. |
-h (--help) | Display the brief help page. |
Silent Mode should be used carefully. It is possible (and probable) that you can overlay the wrong data and cause irreparable damage to the database.
To run the batch importer, at the command line:
[dspace]/bin/dspace metadata-import -f name_of_file.csv
Example
[dspace]/bin/dspace metadata-import -f /dImport/col_14.csv
If you wish to upload new metadata without bitstreams, at the command line:
[dspace]/bin/dspace metadata-import -f /dImport/new_file.csv -e joe@user.com -w -n -t
The above example uses all of the optional arguments: the metadata is added, and the workflow, notification, and template handling are all applied to the items being added.
The CSV (comma separated values) files that this tool can import and export abide by the RFC 4180 CSV format. This means that new lines and embedded commas can be included by wrapping elements in double quotes, and double quotes can be included by using two double quotes. The code does all of this for you, and any good CSV editor such as Excel or OpenOffice will comply with this convention.
All CSV files are also in UTF-8 encoding in order to support all languages.
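As a quick illustration of these quoting rules, Python's standard csv module produces exactly this RFC 4180 form:

```python
import csv
import io

# A value containing a comma must be wrapped in double quotes, and an
# embedded double quote is escaped by doubling it (RFC 4180).
buf = io.StringIO()
csv.writer(buf).writerow(['350', '2292', 'A "quoted" title, with a comma', 'Smith, John'])
print(buf.getvalue())
# 350,2292,"A ""quoted"" title, with a comma","Smith, John"
```

Any CSV library that follows RFC 4180 will read this line back into the original four values.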
The first row of the CSV must define the metadata values that the rest of the CSV represents. The first column must always be "id", which refers to the item's internal database ID. All other columns are optional. The other columns contain the Dublin Core metadata fields in which the data is to reside.
A typical heading row looks like:
id,collection,dc.title,dc.contributor,dc.date.issued,etc,etc,etc.
Subsequent rows in the CSV file relate to items. A typical row might look like:
350,2292,Item title,"Smith, John",2008
If you want to store multiple values for a given metadata element, they can be separated with the double pipe '||' (or another character that you defined in your modules/bulkedit.cfg file). For example:
Horses||Dogs||Cats
Elements are stored in the database in the order that they appear in the CSV file. You can use this to order elements where order may matter, such as authors, or controlled vocabulary such as Library of Congress Subject Headings.
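A small sketch of how such a multi-valued cell decomposes, assuming the default '||' separator (configurable in modules/bulkedit.cfg, as noted above); splitting and rejoining preserve the value order that the database relies on:

```python
SEPARATOR = '||'  # default multi-value separator; configurable in modules/bulkedit.cfg

def split_values(cell):
    """Split a multi-valued metadata cell into its ordered values."""
    return cell.split(SEPARATOR)

def join_values(values):
    """Join values back together; the CSV order is the stored order."""
    return SEPARATOR.join(values)

print(split_values('Horses||Dogs||Cats'))
# ['Horses', 'Dogs', 'Cats']
```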
If you are editing with Microsoft Excel, be sure to open the CSV in Unicode/UTF-8 encoding
By default, Microsoft Excel may not correctly open the CSV in Unicode/UTF-8 encoding. This means that special characters may be improperly displayed and also can be "corrupted" during re-import of the CSV.
You need to tell Excel this CSV is Unicode by importing it as follows. (Please note these instructions are valid for MS Office 2007 and 2013; other Office versions may vary.)
Tips to Simplify the Editing Process
When editing a CSV, here are a couple of basic tips to keep in mind:
Items can be moved between collections by editing the collection handles in the 'collection' column. Multiple collections can be included. The first collection is the 'owning collection'. The owning collection is the primary collection that the item appears in. Subsequent collections (separated by the field separator) are treated as mapped collections. These are the same as using the map item functionality in the DSpace user interface. To move items between collections, or to edit which other collections they are mapped to, change the data in the collection column.
New metadata-only items can be added to DSpace using the batch metadata importer. To do this, enter a plus sign '+' in the first 'id' column. The importer will then treat this as a new item. If you are using the command line importer, you will need to use the -e flag to specify the user email address or id of the user that is registered as submitting the items.
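For example, a heading row plus one new metadata-only item might look like this (the title and author are illustrative; the collection handle reuses the example handle from earlier in this page):

```csv
id,collection,dc.title,dc.contributor,dc.date.issued
+,1989.1/24,A brand new item,"Smith, John",2008
```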
It is possible to perform metadata deletes across the board for certain metadata fields from an exported file. For example, let's say you have used keywords (dc.subject) that need to be removed en masse. You would leave the column (dc.subject) intact, but remove the data in the corresponding rows.
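A sketch of that edit in Python (assuming, as described above, that emptied cells are interpreted as deletions on re-import; the sample rows are illustrative):

```python
def blank_column(rows, field):
    """Clear every value under `field` while keeping the column heading,
    so that re-importing the CSV removes the existing values en masse."""
    header, *data = rows
    idx = header.index(field)
    for row in data:
        row[idx] = ''          # empty cell => delete the stored values
    return [header] + data

rows = [['id', 'dc.title', 'dc.subject'],
        ['350', 'Item title', 'Horses||Dogs||Cats']]
print(blank_column(rows, 'dc.subject'))
# [['id', 'dc.title', 'dc.subject'], ['350', 'Item title', '']]
```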
It is possible to perform certain 'actions' on items. This is achieved by adding an 'action' column to the CSV file (after the id and collection columns). There are three possible actions:
modules/bulkedit.cfg
If an action makes no change (for example, asking to withdraw an item that is already withdrawn) then, just like metadata that has not changed, this will be ignored.
It is possible that you have data in one Dublin Core (DC) element and you wish to really have it in another. An example would be that your staff have input Library of Congress Subject Headings in the Subject field (dc.subject) instead of the LCSH field (dc.subject.lcsh). Follow these steps and your data is migrated upon import:
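One way such a migration can be expressed on the exported CSV is sketched below. This is a hedged illustration, not the official procedure: it assumes that renaming the heading moves the data to the new field on import, and that an emptied column deletes the old values, as described earlier.

```python
def migrate_field(rows, old='dc.subject', new='dc.subject.lcsh'):
    """Rename the exported column heading to the target field and append an
    empty column under the old name, so that on import the values are added
    under the new field and removed from the old one."""
    header, *data = rows
    idx = header.index(old)
    header[idx] = new
    header.append(old)         # empty cells here delete the old values
    for row in data:
        row.append('')
    return [header] + data

rows = [['id', 'dc.subject'], ['350', 'Horses']]
print(migrate_field(rows))
# [['id', 'dc.subject.lcsh', 'dc.subject'], ['350', 'Horses', '']]
```

As always, review the change summary the importer prints before confirming.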
Unfortunately, this response may be caused in many ways