...

  • When the Retrieval Tool starts up, it connects to DuraCloud using the connection parameters you provide and gets a list of content items in the spaces you indicate. It will then proceed to download the files from those spaces, each into a local directory named for the space, which is placed within the content directory.
  • For each content item, the Retrieval Tool checks to see if there is already a local file with the same name. If so, the checksums of the two files are compared to determine if the local file is the same as the file in DuraCloud. If they match, nothing is done, and the Retrieval Tool moves on to the next file. If they do not match, the file from DuraCloud is retrieved.
  • By default, when a local file exists and differs from the DuraCloud copy, the local file is renamed prior to the DuraCloud file being retrieved. If you would prefer that the local file simply be overwritten, you will need to include the overwrite command-line flag when starting the Retrieval Tool.
  • As each content file is downloaded, a checksum comparison is made to ensure that the downloaded file matches the file in DuraCloud. If the checksums do not match, the file is downloaded again. This re-download will occur up to 5 times. If the checksums still do not match after the fifth attempt, a failure is indicated in the output file.
  • As each file download completes, a new line is added to the retrieval tool output file in the work directory, indicating whether the download was successful or not. Files which did not change are not included in the output file.
  • As the Retrieval Tool runs, it will print its status approximately every 10 minutes to indicate how many files have been checked and downloaded.
  • Once all files are retrieved, the Retrieval Tool will print its final status to the command line and exit.
  • As files are updated in DuraCloud, you can re-run the Retrieval Tool using the same content directory, and only the files which have been added or updated since the last run of the tool will be downloaded.
  • New in 2.4: A file containing the list of content files within a space can be created using the "list-only" option (-l) instead of retrieving the actual content files themselves. The format of this text file is one content file name per line. This can be useful, for example, as the starting point for the "list-file" option described in the next item.

  • New in 2.4: Specific content files can be retrieved from a space using the "list-file" option (-f) instead of retrieving all content files from the space, which can save considerable time and bandwidth. One way to do this is to first run the Retrieval Tool with the "list-only" option to create a file containing all content file names in the space, then edit that file so that it contains only the desired content names, and finally pass the edited file to the "list-file" option.
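Editing a long list file by hand can be tedious, so the trimming step described above could also be scripted. The following is a minimal sketch, not part of the Retrieval Tool: the function name `filter_list_file` and the file names in the usage note are hypothetical, and the only thing it relies on is the one-content-name-per-line format described above.

```python
from pathlib import Path


def filter_list_file(src: Path, dest: Path, keep) -> int:
    """Write to dest only the content names from src for which keep(name) is true.

    Both files hold one content file name per line. Blank lines are skipped.
    Returns the number of names written.
    """
    names = src.read_text(encoding="utf-8").splitlines()
    selected = [name for name in names if name.strip() and keep(name)]
    dest.write_text("".join(name + "\n" for name in selected), encoding="utf-8")
    return len(selected)
```

For example, `filter_list_file(Path("all-items.txt"), Path("wanted.txt"), lambda n: n.endswith(".tif"))` would keep only the names ending in `.tif`; the resulting file could then be supplied to the "list-file" option.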

Operational notes

  • Content Directory - the directory to which files will be downloaded. A new directory within the content directory will be created for each space.
  • Work Directory - the directory which contains both logs, which give granular information about the process, and output files. A new output file is created for each run of the Retrieval Tool, and it stores a listing of the files which were downloaded.
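To make the per-file behavior concrete, here is a rough sketch of the decision the Retrieval Tool is described as making for each content item: where the file lands inside the content directory, when it is skipped, and when a failure is reported. This is an illustration only, not the tool's actual code. The use of MD5 for the checksum, the `.old` suffix for renamed local files, the `download` callback, and the returned status strings are all assumptions made for this example.

```python
import hashlib
from pathlib import Path

MAX_ATTEMPTS = 5  # the documentation states a download is tried up to 5 times


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file (checksum algorithm is an assumption)."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def retrieve(content_dir, space, name, remote_checksum, download, overwrite=False) -> str:
    """Decide what to do with one content item and return a status string.

    download(path) is a hypothetical callback that writes the remote file to path.
    """
    # Each space gets its own directory inside the content directory.
    local = Path(content_dir) / space / name
    if local.exists():
        if md5_of(local) == remote_checksum:
            return "unchanged"  # nothing to do; not listed in the output file
        if not overwrite:
            # Default behavior: keep the differing local copy under a new name.
            # The ".old" suffix is hypothetical, not the tool's real naming scheme.
            local.replace(local.with_name(local.name + ".old"))
    local.parent.mkdir(parents=True, exist_ok=True)
    for _ in range(MAX_ATTEMPTS):
        download(local)
        if md5_of(local) == remote_checksum:
            return "success"
    return "failure"  # checksums still differ after the fifth attempt
```

In this sketch, the "success" and "failure" results correspond to the lines written to the output file in the work directory, while "unchanged" items are the ones that a re-run over the same content directory skips.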

...