The Retrieval Tool is a utility that transfers (or "retrieves") digital content from DuraCloud to your local file system.
Download the Retrieval Tool from the Downloads page.
Instead of retrieving the content files themselves, the Retrieval Tool can create a text file listing the content files in a space when run with the "list-only" option (-l). The file contains one content file name per line. Such a listing is useful, for example, for reviewing what a space contains before downloading it, or as the starting point for a "list-file" (-f) retrieval.
Instead of retrieving every content file in a space, the "list-file" option (-f) retrieves only the content files named in a given file, which can save considerable time and bandwidth. One way to do this is to first run the Retrieval Tool with the "list-only" option to create a file containing all content file names in the space, then edit that file so it contains only the desired names, and finally pass the edited file to the "list-file" option.
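The editing step in this workflow can be scripted rather than done by hand. The following is a minimal sketch, assuming the listing file holds exactly one content ID per line as described above; the function name `filter_listing` and the file names in the usage note are hypothetical and not part of the Retrieval Tool.

```shell
# Hypothetical helper (not part of the Retrieval Tool): keep only the content
# IDs in a listing file that match an extended regular expression, and write
# them to a new file suitable for the -f option.
# Usage: filter_listing <listing_file> <pattern> <output_file>
filter_listing() {
    local listing="$1" pattern="$2" output="$3"
    # grep exits with status 1 when nothing matches; treat that as an
    # empty result rather than an error. Other failures still return non-zero.
    grep -E -- "$pattern" "$listing" > "$output" || [ $? -eq 1 ]
}
```

For instance, `filter_listing all-ids.txt '\.tif$' wanted.txt` would leave only the IDs ending in `.tif` in `wanted.txt`, which could then be passed to `-f` together with a single space ID in `-s`.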
As of DuraCloud version 4.0.0, the Retrieval Tool requires Java 8 to run. The latest version of Java can be downloaded from here.
You must have Java version 8 or above installed on your local system. If Java is not installed, or if an earlier version is installed, you will need to download and install Java 8. To determine whether the correct version of Java is installed, open a terminal or command prompt and enter:
java -version
The version displayed should be 1.8.0 or above. If running this command generates an error, Java is likely not installed.
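The version check can also be automated. The sketch below is one possible approach, not something shipped with the Retrieval Tool: the helper names `java_major_version` and `installed_java_version` are hypothetical. It assumes the usual `java -version` output, which prints a quoted version string such as "1.8.0_292" or "11.0.2" to standard error.

```shell
# Hypothetical helper: extract the major Java version from a version string.
# Legacy strings look like "1.8.0_292" (major 8); newer ones look like
# "11.0.2" or "17" (major 11 and 17).
java_major_version() {
    local version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;   # drop the legacy "1." prefix
    esac
    # keep only the leading run of digits
    echo "${version%%[!0-9]*}"
}

# Hypothetical helper: read the quoted version from `java -version`,
# which writes to standard error rather than standard output.
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

A script could then compare `java_major_version "$(installed_java_version)"` against 8 before launching the tool. If Java is not installed, `installed_java_version` prints nothing, so the caller should handle an empty result.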
To display the help for the Retrieval Tool, run
java -jar retrievaltool-{version}-driver.jar
When running the Retrieval Tool, you will need to use these options:
Short Option | Long Option | Argument Expected | Required | Description | Default Value (if optional) |
---|---|---|---|---|---|
-h | --host | Yes | Yes | The host address of the DuraCloud DuraStore application | |
-r | --port | Yes | No | The port of the DuraCloud DuraStore application | 443 |
-u | --username | Yes | Yes | The username necessary to perform writes to DuraStore | |
-p | --password | Yes | No | The password necessary to perform writes to DuraStore. If not specified, the Retrieval Tool first checks for an environment variable named "DURACLOUD_PASSWORD" and, if it exists, uses its value as the password; otherwise you will be prompted to enter the password. | Not set |
-i | --store-id | Yes | No | The Store ID for the DuraCloud storage provider | The default store is used |
-s | --spaces | Yes | No | The space or spaces from which content will be retrieved. Either this option or -a must be included | |
-a | --all-spaces | No | No | Indicates that all spaces should be retrieved; if this option is included the -s option is ignored | Not set |
-c | --content-dir | Yes | Yes | Retrieved content is stored in this local directory | |
-w | --work-dir | Yes | No | Logs and output files will be stored in the work directory. If not specified, this value will default to a directory named duracloud-retrieval-work in the user's home directory. | duracloud-retrieval-work |
-o | --overwrite | No | No | Indicates that existing local files which differ from files in DuraCloud under the same path and name should be overwritten rather than copied | Not set |
-t | --threads | Yes | No | The number of threads in the pool used to manage file transfers | 3 |
-d | --disable-timestamps | No | No | Indicates that timestamp information found as content item properties in DuraCloud should not be applied to local files as they are retrieved. | Not set |
-l | --list-only | No | No | Indicates that the retrieval tool should create a file listing the contents of the specified space rather than downloading the actual content files. The list file will be placed in the specified content directory. One list file will be created for each specified space. | Not set |
-f | --list-file | Yes | No | Retrieve specific contents using content IDs in the specified file. The specified file should contain one content ID per line. This option can only operate on one space at a time. | Not set |
Retrieve all the files stored within the two specified spaces and place them in the specified local content directory, under sub-directories matching the two space names. Because -o is included, existing local files that differ from the files in DuraCloud are overwritten.
java -jar retrievaltool-{version}-driver.jar -c content -h test.duracloud.org -u myname -p mypassword -s space1 space2 -o
Retrieve all the files stored within all spaces and place them in the specified local content directory under sub-directories matching the space names.
java -jar retrievaltool-{version}-driver.jar -c content -h test.duracloud.org -u myname -p mypassword -a
Create a file listing the content IDs for each of the specified spaces, using the hidden password option (the -p command line option is omitted, so you will be prompted for the password unless DURACLOUD_PASSWORD is set). This example does not retrieve the content files; it only creates a list of the content files in each specified space. Each space gets its own list file in the specified local content directory, named "<space_id>-content-listing-<storage_provider>.txt".
java -jar retrievaltool-{version}-driver.jar -h <host> -u <user> -c <content_dir> -s <list of space IDs separated by a space> -l
Retrieve only the specified content by using the list-file option (-f). The -f option can only operate on one space at a time. This command places all the content files listed in the specified file of content IDs in the specified local content directory.
java -jar retrievaltool-{version}-driver.jar -h <host> -u <user> -c <content_dir> -f <path_to_file_of_specified_content_IDs> -s <single space ID>
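Because -f handles only one space per run, retrieving selected content from several spaces means running the tool once per space. The following is a rough sketch of such a loop, under stated assumptions: `RETRIEVAL_JAR` is a hypothetical variable holding the path to the downloaded driver jar, the `run_tool` and `retrieve_listed` function names are made up, and the `<space_id>.txt` naming of the list files is an illustrative convention, not one the tool defines. Only options documented in the table above are passed.

```shell
# Assumption: RETRIEVAL_JAR is a hypothetical variable holding the path to
# the downloaded retrievaltool-{version}-driver.jar.
run_tool() {
    java -jar "$RETRIEVAL_JAR" "$@"
}

# Hypothetical wrapper: run the tool once per space, pairing each space with
# its own file of content IDs named <space_id>.txt inside <list_dir>.
# Usage: retrieve_listed <host> <user> <content_dir> <list_dir> <space>...
retrieve_listed() {
    local host="$1" user="$2" content_dir="$3" list_dir="$4" space
    shift 4
    for space in "$@"; do
        # -p is omitted, so the tool reads DURACLOUD_PASSWORD or prompts
        run_tool -h "$host" -u "$user" -c "$content_dir" \
            -f "$list_dir/$space.txt" -s "$space" || return 1
    done
}
```

Setting `DURACLOUD_PASSWORD` before calling `retrieve_listed` avoids a password prompt on every iteration. The loop stops at the first run that exits with a non-zero status.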