  1. In DuraCloud, content is added by the user into a Snapshot Storage Provider space. This is a staging area that is backed by S3.
  2. The user selects a button in the space to create a snapshot and enters the snapshot metadata
    1. Error: no communication with the bridge server. Does the S3 instance need to be connected when the user logs in? Yes. The UI needs to display the list of snapshots that have been taken, which is retrieved by making a call down to the bridge.
  3. The DuraCloud UI calls the storage provider snapshot task, indicating the space to snapshot
  4. The snapshot task creates a snapshot properties file and stores it in the snapshot space
    1. When the snapshot properties file is added, the space is transitioned to read-only
  5. The snapshot task calls the bridge application to indicate that a snapshot needs to be taken, providing the DuraCloud host, port, and space.
  6. The bridge application adds an entry to the snapshot db table with the details of the snapshot action
    1. Error: communication failure between the database and the bridge. The snapshot would fail and an error would be thrown back up to DuraCloud.
  7. The bridge application connects to DuraCloud and copies all content from the DuraCloud space to bridge storage
    1. During transfer, content properties are captured in a file
    2. During transfer, each content item is added to the content db table (with snapshot id)
    3. Error: if the bridge server is restarted, the transfer process is killed. Work is under way on a piece that would allow the bridge to retain its initialization parameters and restart where it left off.
  8. The bridge application creates two manifest files (md5 and sha256) for the content and verifies all content was transferred correctly
  9. The bridge application sends a notification (email) to Chronopolis that a snapshot is ready (this step may be replaced with Chronopolis intake polling).
    1. Error: the notification is sent but not picked up, or polling doesn't happen. A cron job or the like could make sure the polling is always happening.
  10. The Chronopolis Intake service uses the content in bridge storage to construct a DPN bag
    1. Error: corrupted files are reported to the bridge server. Bill is working on an additional call on the bridge server to indicate an error in steps 10 and 11. The call sets the snapshot to an error state and sends an email to the DuraCloud team. Another potential error is a depositor depositing more than their allotment.
    2. Error: communication issues between the intake and ingest servers
    3. The Intake service validates content against the manifest written by the bridge application
    4. The Intake service creates the necessary bag files (bagit, bag-info, dpn-info) that are included in the bag
    5. If the content contained in the snapshot is larger than 250 GB, multiple bags are created
  11. Chronopolis Ingest service performs replication to other Chronopolis nodes and DPN
    1. ACE tokens are created for other Chronopolis nodes
      1. Error: token writing doesn't happen
    2. An entry in the DPN registry is added
      1. Error: communication issues
    3. REST calls are used between DPN nodes to discover content which needs to be replicated
    4. Error: an individual node in Chronopolis, or a node in DPN, fails to replicate. The exact reasons are hard to pinpoint. Recovery for DPN is to make a new request. Would we know if there was a failure? No. That would need to be added to the DPN failure code.
  12. Chronopolis makes a call to the bridge application to indicate that content has been successfully copied to preservation storage
    1. Intake service checks for existing snapshots to see if they could be completed. 
    2. Information about DPN IDs should be passed into the Bridge.
    3. Error: the call doesn't get made, or there are connection problems.
  13. The bridge application deletes the directory in bridge storage used for the snapshot
  14. The bridge application makes a call to a task in the DuraCloud Snapshot Storage Provider to indicate that it is now time to clean up the snapshot content
    1. Error: the DuraCloud application could be down. No retry or notification is in place.
  15. The cleanup task sets a policy on the underlying S3 bucket which causes the content to be removed within 24 hours
    1. Error: S3 is unavailable and the policy can't be set. No retry or notification is in place. One option is to check the date on which the snapshot transitioned to its most recent status; if it has been more than xx days, a problem notification is sent.
  16. The bridge application watches the snapshot space, and when it becomes empty, calls the snapshot complete task, which clears the S3 bucket policy
  17. The bridge application notifies the user who requested the snapshot that it has been completed
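
The manifest handling in steps 8 and 10 (the bridge writes md5 and sha256 manifests, and the Intake service later validates content against them) can be sketched roughly as below. This is an illustrative sketch only, not the bridge or Intake implementation: the function names, the "checksum, two spaces, relative path" line format, and the directory layout are all assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, algorithm: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(content_dir: Path, algorithm: str) -> Dict[str, str]:
    """Map each file's path (relative to content_dir) to its checksum."""
    manifest = {}
    for path in sorted(content_dir.rglob("*")):
        if path.is_file():
            name = path.relative_to(content_dir).as_posix()
            manifest[name] = file_digest(path, algorithm)
    return manifest


def write_manifest(manifest: Dict[str, str], manifest_path: Path) -> None:
    """Write one '<checksum>  <relative path>' line per file (assumed format)."""
    lines = [f"{digest}  {name}" for name, digest in sorted(manifest.items())]
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_manifest(manifest_path: Path) -> Dict[str, str]:
    """Parse a manifest written by write_manifest back into a dict."""
    manifest = {}
    for line in manifest_path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            # Split on the first whitespace run only, so paths may contain spaces.
            digest, name = line.split(None, 1)
            manifest[name] = digest
    return manifest


def verify_manifest(content_dir: Path, manifest: Dict[str, str], algorithm: str) -> List[str]:
    """Return a list of problems; an empty list means every listed file matched."""
    problems = []
    for name, expected in manifest.items():
        path = content_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif file_digest(path, algorithm) != expected:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

In this sketch the bridge side would call build_manifest and write_manifest once per algorithm ("md5" and "sha256"), and the Intake side would call read_manifest and verify_manifest. A non-empty result corresponds to the corrupted-files error noted under step 10, which would then be reported back to the bridge server.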
