Solution: Move your customized configuration directory aside, start again with the fresh configuration that corresponds to the DSpace version you are installing or updating to, and then reapply your changes to that configuration.

Metadata values in CSV export seem to have duplicate columns

Symptom: In a CSV export from Batch Metadata Editing, the same column may appear to be listed twice, with each of the two columns holding values in different rows. The two columns may look like this: dc.title, dc.title[]

Explanation: This is a harmless inconsistency in how DSpace stores metadata. The brackets designate the language of the metadata value (each metadata field may have multiple values in multiple languages, which appear in the CSV like this: dc.title, dc.title[de], dc.title[en_US]). In the database, an empty language may have two representations - either an empty string "" or a special NULL value. As DSpace developed organically over time, the programmers of the various data ingest methods each chose one of these two representations. If you use multiple ingest methods, you may end up with metadata that uses both.

Fix 1: In the CSV file, move the values from one of the columns to the other. Then import the CSV file. On the next export, the emptied column will no longer appear in the CSV file.
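For larger exports, Fix 1 can be scripted rather than done by hand. The following is a minimal sketch in Python, not an official DSpace tool. The function name merge_duplicate_column and the sample rows are hypothetical, and the "||" separator for multiple values in one cell is assumed to match your DSpace configuration. The emptied column is deliberately kept in the output so that the values are moved rather than merely dropped from the file.

```python
import csv
import io


def merge_duplicate_column(csv_text, keep="dc.title", drop="dc.title[]", separator="||"):
    """Move every value from column `drop` into column `keep`.

    The `drop` column stays in the file but is left empty.
    The "||" separator is an assumption about the multi-value delimiter.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        current = row.get(keep) or ""
        moved = row.get(drop) or ""
        if moved:
            # If both cells hold a value, keep both, joined by the separator.
            row[keep] = separator.join(v for v in (current, moved) if v)
            row[drop] = ""
        writer.writerow(row)
    return out.getvalue()


# Hypothetical export in which each title sits in a different column.
sample = (
    "id,dc.title,dc.title[]\n"
    "1,First item,\n"
    "2,,Second item\n"
)
merged = merge_duplicate_column(sample)
print(merged)
```

To process a real export, read the file contents into a string, pass them to the function, write the result to a new file, and review it before importing.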

Fix 2: Use the following SQL statement to fix the data: UPDATE metadatavalue SET text_lang=NULL WHERE text_lang='';
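To see what this statement does before running it against a real repository, the effect can be reproduced on a toy table. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely for illustration. The two-column metadatavalue table and its three rows are simplified assumptions, not the real DSpace schema or database engine.

```python
import sqlite3

# Toy stand-in for the real table: only the columns needed for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadatavalue (text_value TEXT, text_lang TEXT)")
conn.executemany(
    "INSERT INTO metadatavalue (text_value, text_lang) VALUES (?, ?)",
    [
        ("First item", None),      # language stored as NULL
        ("Second item", ""),       # language stored as an empty string
        ("Dritter Eintrag", "de"), # a real language code, must stay untouched
    ],
)


def count_empty_string_langs(connection):
    """Count rows whose language is the empty string rather than NULL."""
    return connection.execute(
        "SELECT COUNT(*) FROM metadatavalue WHERE text_lang = ''"
    ).fetchone()[0]


before = count_empty_string_langs(conn)
conn.execute("UPDATE metadatavalue SET text_lang = NULL WHERE text_lang = ''")
after = count_empty_string_langs(conn)
null_rows = conn.execute(
    "SELECT COUNT(*) FROM metadatavalue WHERE text_lang IS NULL"
).fetchone()[0]

print(before, after, null_rows)
```

Running the same SELECT COUNT(*) query against your own database first shows how many rows the UPDATE would change.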

Either fix only corrects your existing data. If you continue to use the two different ingest methods, new duplicates of this kind will appear.