...

Issues discovered in testing:


Issue 1

Re the orphaned chunk issue (DURACLOUD-1155), it appears that orphaned chunks are not removed when the file being updated is identical to the one in DuraCloud. There is one way around it: jumpstart mode. When the synctool is in jumpstart mode, the file is retransferred and the chunks are removed.

So the scenario we are talking about is as follows: 

  1. User uploads a 20 GB file with 1 GB chunks, producing 20 chunks (0000-0019).
  2. User uploads the same file using a pre-6.1.0 version of the synctool but changes the chunk size to 5 GB.
  3. Now there are still 20 chunks: 0000-0003 are 5 GB each and 0004-0019 are 1 GB each. The 1 GB chunks are the orphans.
  4. Using the new (6.1.0) version of the synctool, the orphans won't be removed in non-jumpstart mode, because the synctool detects that the checksum of the unchunked file matches the manifest and moves on. When jumpstart is enabled, cleanup is invoked, but at the cost of retransferring the content.
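
The chunk arithmetic in this scenario can be sketched as follows. This is only an illustration of the numbers above, not synctool code: the function names `chunk_ids` and `find_orphans` are hypothetical, and chunks are represented by their four-digit indices alone.

```python
import math


def chunk_ids(file_size_gb, chunk_size_gb):
    """Return the zero-padded chunk indices a file of this size would produce."""
    count = math.ceil(file_size_gb / chunk_size_gb)
    return [f"{i:04d}" for i in range(count)]


def find_orphans(stored, manifest):
    """Return chunks present in the space but not listed in the current manifest."""
    listed = set(manifest)
    return sorted(c for c in stored if c not in listed)


# Step 1: 20 GB file, 1 GB chunks -> indices 0000-0019
first_upload = chunk_ids(20, 1)
# Step 2: same file, 5 GB chunks -> indices 0000-0003 (these overwrite the old ones)
second_upload = chunk_ids(20, 5)

# Step 3: the space still holds every index written by either upload
stored = sorted(set(first_upload) | set(second_upload))
orphans = find_orphans(stored, second_upload)

print(len(stored), "chunks stored,", len(orphans), "orphaned:", orphans[0], "to", orphans[-1])
```

Comparing what is stored against what the latest manifest lists is what identifies 0004-0019 as orphans; the checksum of the reassembled file plays no part in that comparison, which is why a checksum match alone does not reveal them.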


We have a tool for identifying and removing orphaned chunks, which we have used successfully in the recent past. For the synctool itself there are three options: it could always perform a cleanup when it detects matching files; it could perform that cleanup only when a flag is enabled; or it could perform the cleanup by default and skip it when a flag is present.
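
To make the three options easier to compare, here is a rough sketch of the decision each one implies. Everything in it is hypothetical: the `MatchCleanup` names, the `cleanup_runs` function, and the single boolean flag are placeholders for discussion, not existing synctool options. It also assumes that cleanup already runs whenever the content is retransferred.

```python
from enum import Enum


class MatchCleanup(Enum):
    ALWAYS = "always"    # always clean up on a checksum match
    OPT_IN = "opt-in"    # clean up on a match only when the flag is enabled
    OPT_OUT = "opt-out"  # clean up on a match unless the flag is present


def cleanup_runs(policy, checksum_matches, jumpstart, flag_present):
    """Return True if orphan cleanup would run for this file."""
    # Assumption: any retransfer (jumpstart, or a changed file) triggers cleanup.
    if jumpstart or not checksum_matches:
        return True
    if policy is MatchCleanup.ALWAYS:
        return True
    if policy is MatchCleanup.OPT_IN:
        return flag_present
    return not flag_present  # OPT_OUT


# Identical file, non-jumpstart mode, no flag given:
for policy in MatchCleanup:
    print(policy.value, cleanup_runs(policy, True, False, False))
```

With no flag given, only the opt-in variant leaves the orphans in place for an identical file, which matches the current non-jumpstart behavior described above.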

As far as DuraCloud operations are concerned, making cleanup on matching files the default behavior and providing a flag to suppress it for power users is optimal, since we won't have to worry about scrubbing spaces after users upgrade to 6.1.0.

For the users, the best option is for us not to implement "cleanup on checksum match" behavior at all and scrub their repositories once they upgrade to the new synctool.




Testing of Completed Issues

...