Date: Thursday February 4, 9am PST

Attendees

  • Mark Bussey
  • Adam Wead
  • Longshou Situ
  • Vivian Chu
  • Tom Johnson
  • Rob Sanderson
  • Esme Cowles

Agenda

  1. Goals of API (and SPI) work
    1. Defining an API with a clear spec, versioned independent of the implementation, etc.
      1. Including HTTP API, messaging
    2. Having a spec opens up the possibility of multiple implementations with different priorities
    3. We could use the existing (4.5.0) API as the baseline of the spec, and be thoughtful about changes going forward?
      1. Don't expect dramatic API change, but do expect some changes
      2. Maybe we should codify existing API, and also plan a new version that improves parts of the API, and have a predictable process for moving from one to the other
    4. Would like more predictability of API changes
      1. There are release candidates available for 2-4 weeks, and testing against them would help identify breaking changes earlier
      2. DCE supports multiple projects on multiple Fedora releases, and needs to manage changes
    5. How much of the API is stable?  How do people know about upcoming changes?
      1. Some changes (e.g. removing JCR types) are known long in advance, so there is room to improve communication and predictability
      2. The weekly Fedora committers call is a good way to know about changes, but too high an overhead for many people to participate in
      3. Roughly quarterly meetings (HydraConnect, LDCX, etc.) would be more convenient
      4. In favor of frequent releases, but not breaking changes
        1. Would like breaking changes to be less frequent and better communicated, to make it easier to test and adapt to them
  2. Discussion of proposed services, in the context of Hydra
    1. CRUD
      1. Aligned with LDP, so already specified
      2. Fedora's HTTP API docs also cover the particular implementation choices (e.g., Prefer headers supported)
      3. Fedora complies with the LDP spec and wants to keep compliant
    2. Fixity checking
      1. On upload, you can provide a checksum and it will be verified
        1. Hydra doesn't support this now, but it could
        2. May want to have a slightly different approach: upload and checksum at the same time, and then compare checksums
      2. On demand, you can check that the resource on disk matches the recorded checksum
    3. Versioning
      1. The existing versioning API is Fedora-specific
      2. The implementation is efficient and full-featured
      3. Implementing it might complicate other implementations
      4. The API spec should specify how an implementation that didn't support versioning would behave
        1. Or the API spec could require versioning, since many storage backends support versioning
      5. Would like to use the Memento API for version retrieval
        1. But there is no Memento spec for how to create versions
        2. Marmotta's Memento implementation isn't LDP-aligned, it just auto-versions triples
        3. Fedora could auto-version metadata to avoid needing to create them explicitly
          1. Non-versioning backends could just report the current version following the Memento spec
        4. But Fedora would need to have explicit versioning for binaries because of storage concerns
        5. Fedora also has an API to restore versions
          1. But that could be a COPY from the old version URI to the current URI
      6. ActiveFedora has limited support for versioning (files only), so support for metadata versioning and subtree versioning is still needed
        1. Now would be a good time to change the API, since Hydra isn't really using it now
      7. Would be good to include the broader LDP community in the versioning API discussion to encourage an LDP-wide versioning approach
      8. Wouldn't mind having auto-versioning, but would still like to be able to tag/label specific versions
      9. Don't want lots of extra versions of files just because I version the metadata that links to them
        1. ActiveFedora can control this and decide when to create versions and/or label versions
        2. ACTION: Esme: Check whether creating a version of a tree also creates distinct versions of unchanged files
    4. Transactions
      1. Would like to consider all the changes in a transaction as a version
        1. Can do this now by opening a transaction, making changes, creating a version, and then committing the transaction
      2. Somewhat awkward for a RESTful API, so there is probably not an existing standard
      3. The current API is a good strawperson
        1. Haven't heard any complaints about the API, and non-Hydra clients are using it
      4. There is a current discussion about which aspects of ACID Fedora supports
        1. Definitely Atomicity and Durability
          1. Atomicity might require all items to happen at the same time – would be hard to support in a distributed environment
          2. Want to make it as easy as possible to support diverse backends and scalability requirements
        2. Consistency and Isolation might be limited
          1. Different implementations might have different levels (e.g. snapshot isolation vs. read-uncommitted), and implementations should advertise what they support
      5. ACID is a set of guarantees for all updates, not just transactions, so it's important to consider them more broadly
    5. Authorization
      1. Fedora provides authorization, but Hydra (for historical reasons) doesn't take advantage of it
      2. Hydra does use WebACLs, but the implementation is different from what Fedora expects, so they are not compatible
        1. We should align them so Fedora could enforce Hydra's WebACLs for other clients
        2. Hydra also currently cannot provide the user who is making a request, which would be needed to enforce the WebACLs
          1. ActiveFedora would need to be refactored to allow per-request identification of the end user making the request
          2. ACTION: Adam and Esme will compare Fedora and Hydra WebACLs to see where they differ
      3. Fedora authorization assumes either the client or the servlet container is handling authentication and group membership information
      4. If there is an IndirectContainer, I shouldn't be able to use it to add triples to resources I don't have permission to write to
        1. ACTION: Rob will create a ticket to investigate this
  3. Other API concerns
    1. Would like to have some kind of packaged version of all of the resources that make up a Work
      1. There is a Camel component that can sync updates to a triplestore, disk, etc. which might meet this need.
      2. RDF import/export functionality (like the current JCR/XML import/export functionality) would also meet this need. It could also serve as a bulk edit API, addressing other concerns about the performance of editing multiple related resources.
    2. Multi-resource CRUD
      1. PCDM and Hydra Works mean that many users who used to have a single Fedora 3 object now have many Fedora 4 resources.
      2. It would be great to have LDP community agreement on how this should work
      3. We can all join the LDP Next mailing list and discuss our approach, and then implement it in Fedora
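
Illustration (not discussed on the call): the alternative fixity approach noted under item 2.2 ("upload and checksum at the same time, and then compare checksums") could look roughly like the sketch below. It is a minimal client-side sketch using only the Python standard library. The function names, the SHA-1 default, and the chunk size are assumptions made for illustration and do not reflect any Fedora or Hydra API.

```python
import hashlib
import io


def checksum_while_reading(stream, algorithm="sha1", chunk_size=8192):
    """Read a stream in chunks and return (bytes_read, hex_digest).

    Hypothetical helper: in a real client the chunks would also be
    sent to the repository as they are read.
    """
    digest = hashlib.new(algorithm)
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        total += len(chunk)
    return total, digest.hexdigest()


def checksums_match(stream, reported_hex, algorithm="sha1"):
    """Compare the locally computed checksum with one reported elsewhere
    (for example, by the repository after the upload completes)."""
    _, local_hex = checksum_while_reading(stream, algorithm)
    return local_hex == reported_hex.lower()


if __name__ == "__main__":
    payload = b"example file content"
    # Stand-in for the checksum the repository would report back.
    reported = hashlib.sha1(payload).hexdigest()
    print(checksums_match(io.BytesIO(payload), reported))
```

The checksum is computed while the content is streamed, so the file is read only once before the two digests are compared.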

Reference

Notes

A project tracking the Fedora API in Ruby: https://github.com/fcrepo4-labs/derby

 
