Info |
---|
The project home for this project is: https://github.com/peterdietz/SAFBuilder |
The input for a command-line batch ingest of materials into DSpace is well documented and is called the "Simple Archive Format". What has been missing is a tool that makes it easy to create a Simple Archive Format package. The Simple Archive Format Packager covers the use case where someone has a spreadsheet filled with metadata, plus content files, that are eventually destined for repository ingest.
Thus the input to the Simple Archive Format Packager is a spreadsheet (.csv) that has the following columns:
- filename of the content file(s)
- namespace.element.qualifier metadata for the item. Examples would be: dc.description or dc.contributor.author
Dates need to be in ISO-8601 format to be recognized properly. For metadata fields that have multiple values, separate the entries with a double pipe ("||").
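The column and value rules above can be sketched in a few lines of shell. This is only an illustration: the scratch directory, the file names, the sample metadata, and the helper functions `count_values` and `is_iso_date` are all hypothetical, and the literal header `filename` for the content-file column is an assumption. The date check is a rough shape test for date-only values, not a full ISO-8601 validator.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
batch_dir="${TMPDIR:-/tmp}/saf_csv_example"
mkdir -p "$batch_dir"

# Header row: a filename column (header name assumed), followed by
# namespace.element.qualifier columns. The two authors share one cell,
# separated by "||"; the date is written in ISO-8601 form.
cat > "$batch_dir/metadata.csv" <<'EOF'
filename,dc.title,dc.contributor.author,dc.date.issued
report.pdf,Annual Report,"Smith, Jane||Doe, John",2011-06-30
EOF

# Count how many values a cell holds by splitting it on the double pipe.
count_values() {
    printf '%s\n' "$1" | awk -F'[|][|]' '{ print NF }'
}

# Rough shape check for YYYY, YYYY-MM, or YYYY-MM-DD.
# It does not check calendar validity or handle timestamps.
is_iso_date() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?$'
}

echo "author values: $(count_values 'Smith, Jane||Doe, John')"
if is_iso_date "2011-06-30"; then echo "2011-06-30: looks like ISO-8601"; fi
if ! is_iso_date "06/30/2011"; then echo "06/30/2011: not ISO-8601"; fi
```

Running a check like this over the date column before packaging can catch spreadsheet-style dates such as 06/30/2011 early.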
When preparing the batch load, keep the metadata spreadsheet and the content files together in one directory.
Obtaining, Compiling, and Running SAFBuilder
The SAFBuilder project resides on GitHub. Check out the source code, compile it, and run it. You will most likely need to have Java's JDK downloaded and installed first. It is possible to run SAFBuilder on Windows; however, the commands to do so are not detailed here.
From the terminal:
Code Block |
---|
git clone https://github.com/peterdietz/SAFBuilder.git |
cd SAFBuilder |
./recompile.sh |
./safbuilder.sh |
The final command, run without arguments, prints the arguments used to invoke the program.
Panel |
---|
USAGE: BatchProcess /path/to/directory metadatafilename.csv |
Sample data is included with the tool to give an idea of how to use it.
To run the tool over the sample data:
Code Block |
---|
./safbuilder.sh /path/to/SAFBuilder/src/edu/osu/kb/sample_data AAA_batch-metadata.csv |
This creates a SimpleArchiveFormat directory inside the specified directory, together with the item subdirectories, content files, and metadata files.
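To make the output concrete, here is a hand-built sketch of what one item in a Simple Archive Format directory looks like, following the layout in the DSpace documentation: each item directory holds a `dublin_core.xml` file, a `contents` file that lists the content files, and the content files themselves. The scratch path, the item directory name `item_000`, and the sample metadata are assumptions for illustration; the names SAFBuilder actually generates may differ.

```shell
#!/bin/sh
# Hypothetical scratch location; SAFBuilder writes into the batch directory instead.
saf_dir="${TMPDIR:-/tmp}/saf_layout_example/SimpleArchiveFormat"
item_dir="$saf_dir/item_000"   # item directory name is an assumption
mkdir -p "$item_dir"

# A stand-in for a real content file.
printf 'placeholder content\n' > "$item_dir/report.pdf"

# The contents file lists the item's content files, one per line.
printf 'report.pdf\n' > "$item_dir/contents"

# dublin_core.xml holds the dc.* metadata, one dcvalue element per value.
# A multi-valued field (two authors here) becomes repeated elements.
cat > "$item_dir/dublin_core.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
  <dcvalue element="title" qualifier="none">Annual Report</dcvalue>
  <dcvalue element="contributor" qualifier="author">Smith, Jane</dcvalue>
  <dcvalue element="contributor" qualifier="author">Doe, John</dcvalue>
  <dcvalue element="date" qualifier="issued">2011-06-30</dcvalue>
</dublin_core>
EOF

# List the resulting files.
find "$saf_dir" -type f | sort
```

Comparing a generated item directory against this shape is a quick sanity check before importing.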
The SimpleArchiveFormat directory is then immediately ready to be batch imported into DSpace. An example DSpace import command is:
Code Block |
---|
sudo /dspace/bin/dspace import -a -e dietz.72@osu.edu -c 1811/49710 -s /home/peterdietz/Desktop/MelanieSeedsBatch/SimpleArchiveFormat/ -m /home/peterdietz/Desktop/MelanieSeedsBatch/seedsbatch1.map |
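The flags in the command above can be spelled out with a small wrapper. This is a sketch only: every value below (e-mail address, collection handle, paths) is a placeholder, the `run` helper and the `DRY_RUN` variable are hypothetical, and the script prints the command by default instead of executing it. It also omits the `sudo` used above; whether elevated privileges are needed depends on how DSpace is installed.

```shell
#!/bin/sh
# Placeholder values; substitute the ones for your own repository.
dspace_bin="/dspace/bin/dspace"
eperson="submitter@example.edu"           # -e: e-mail of the submitting eperson
collection="123456789/1"                  # -c: handle of the target collection
source_dir="/path/to/batch/SimpleArchiveFormat"   # -s: the packaged SAF directory
mapfile="/path/to/batch/batch1.map"       # -m: map file recording item-to-handle mappings

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# otherwise execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# -a adds the items in the source directory as new items.
run "$dspace_bin" import -a -e "$eperson" -c "$collection" -s "$source_dir" -m "$mapfile"
```

Keeping the map file is worthwhile, since DSpace can use it later to identify the items created by a given batch.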
Further Work
This packager works as a stand-alone tool and requires some knowledge of Java to run. It satisfies the initial need: packaging many items so that they can be batch loaded into DSpace using the item-import command of DSpace's launcher. The remaining goal of this project is to streamline the process of batch loading materials into DSpace.
Possibilities include:
- refactoring the tool so that it can become a Packager Plugin. Packager plugins let DSpace accept an input package (containing content files, a manifest, and metadata) and create DSpace items from it.
- creating a client GUI for the desktop.