Call Details

Attendees

Agenda

(If you have an agenda suggestion/addition, please leave a comment!)

    1. Communicating as a group: Slack invitations sent out. Anyone else that should be invited to the channel?
    2. Status of current/planned development
      1. DuraSpace
        1. Multipart upload
        2. Streaming (July-Aug) and transcoding (Aug-Sept)
        3. New Kanban board to track activity
      2. 4Science
        1. S3 One as a distinct storage provider
        2. Other storage providers (ExoScale, CynnySpace, others?)
        3. Cross region duplication
      3. TDL
    3. Secondary storage providers
      1. Secondary provider use cases - use of secondary providers now and user expectations
      2. Secondary provider vs primary provider (which providers fit into each category): S3, Glacier, Chronopolis, DPN, S3 One, ExoScale, CynnySpace
      3. Considerations/Requirements when adding a new provider
    4. TDL and 4Science experience with use and implementation of DuraCloud
      1. Pain points
      2. Customer feedback/response
      3. How could this be made easier in the future?
    5. Customer requested features (DuraSpace, TDL)
      1. Features that have been requested
      2. For each feature, assign someone to ensure the feature request is captured in JIRA
    6. Operational needs
      1. From last time: better release notes, more informative error handling, better sync tool tooltips, options to customize emails for registration in management console.
    7. Approach for reviewing and prioritizing JIRA list
    8. Activity planning for community sprint in October

Reminders

  1. Sprint planning day: October 5th
  2. Development sprint: October 15-26

Minutes

  1. Communicating as a group: Slack invitations sent out. Anyone else that should be invited to the channel?
    1. Everyone usually on the calls is there. 
  2. Status of current/planned development
    1. DuraSpace
      1. Multipart upload: started primarily on the retrieval side, added ability to pull content in pieces. Still quite a bit of work to be done on the API for multipart. Bill has had limited time for it.
      2. Streaming (July-Aug) and transcoding (Aug-Sept): sprints coming up for a customer paying for streaming updates. Will ensure they can integrate their systems as well.
      3. New Kanban board to track activity and show work in progress. Different from other Kanban boards, even in JIRA: it separates the backlog onto another page. Backlog is in the menu on the left-hand side. Can pull from backlog to in progress. Can add tickets or pull in existing tickets for work that is being done elsewhere (4Science, TDL, etc.)
    2. 4Science
      1. 4Science is working on S3-1; it is working in DuraCloud Europe.
      2. Some work is overlapping with what DuraSpace is doing to remove old providers
      3. Need to determine what needs to be repaired that might have been missed with removing old providers. DuraSpace should wait for 4Science to finish their work first
      4. New class, S3 secondary provider, to allow another provider in a different region (for DC EU, S3 Ireland is replicated to S3 Paris).
      5. Working on an S3-compatible provider. Still working through issues for this; need to figure out a way to replace CloudWatch, since the other provider does not have an equivalent
        1. Bill: Glacier has the same issue, it is pulled from the mill manifest
        2. Andrea working to come up with another solution
        3. Andrea will create tickets (https://jira.duraspace.org/browse/DURACLOUD-1198, https://jira.duraspace.org/browse/DURACLOUD-1199, https://jira.duraspace.org/browse/DURACLOUD-1200)
    3. TDL
      1. TDL is working on a new interface for their limited service. Another model for replication task suite. Can be shared.
        1. Have been working to set up sprints now that they have a full staff. Nick will work on sprint in the fall. 
        2. TDL wrote a letter of support for a Canadian consortium requesting a grant to implement DuraCloud in Canada. Would likely provide code for alternative implementations. 
  3. Secondary storage providers
    1. Secondary provider use cases - use of secondary providers now and user expectations
      1. Primary at this point is always S3. DPN or Chronopolis can also be primary. One DC customer uses both Chronopolis and S3. Duplication is pretty much always set up to go into secondary storage.
      2. Glacier is not primary because it does not allow immediate retrieval. 
      3. Not easy to restore content from Glacier. Not notified when the content is available. A different Glacier provider, Glacier Vault, uses a different API. In Europe it is a challenge to find providers other than Amazon, and this is why they are working on support for S3 in different regions.
    2. Secondary provider vs primary provider (which providers fit into each category): S3, Glacier, Chronopolis, DPN, S3 One, ExoScale, CynnySpace
      1. S3-1 can work as both primary and secondary storage. Thinking of having S3-1 for users who have other replication elsewhere. 
      2. Makes sense to have more options. DuraCloud has not had primary outside of Amazon. Initial drop of content is always in S3. 
      3. Libraries for OpenStack have not been good in our experience
      4. Costs of streaming content would come through outbound from EC2 before it lands in CynnySpace
      5. TDL: users who have Glacier set up as primary storage provider and add DPN, etc. run into issues creating spaces
      6. Not everyone wants the replication, some want the concept of a dual primary storage area. Something that could be made optional. 
      7. Likely that TDL will have more members who want Chronopolis for one set of content and DPN for another.  Would be useful to manage this administratively. 
      8. TDL folks will create a ticket to keep track of this need
      9. May be a good focus for the sprint in October
  4. TDL and 4Science experience with use and implementation of DuraCloud
    1. Pain points
      1. TDL has compiled a list: secondary storage provider issue; open sourcing the mill deployment scripts; sync tool error handling, clearing stored configs, and tooltips; customization of emails; upgrading dependencies (JavaScript library updating).
      2. Client JavaScript code is pretty old. Could take a lot of time for little value.
  5. Approach for reviewing and prioritizing JIRA list
    1. Doing work for our own environments, but there are existing items that would be a problem for all of us. How to resource getting that work done?
    2. How to prioritize? In this meeting, or outside this meeting?
    3. Voting on issues? 
    4. Can ID higher priority issues that are on your list
    5. Need to know what items are most important
    6. Courtney: this is similar to Datapoint, and how they work
    7. Help developers know what is affecting customers
    8. Can break out tickets or drag to the top of the backlog? Tag/label tickets?
    9. Can sort by the tag/label
    10. Other calls in the future will likely be shorter (30 min or so)

Actions

  • Andrea will create tickets for existing 4Science work
  • TDL folks will create a ticket to keep track of the need for dual primary storage and dual Chronopolis/DPN storage
  • Bill will determine if creating labels would be an effective approach for prioritization 