Developers Meeting before OR2014 on Mon, June 9, 2014

Face-to-face developers' meeting on the topic of Fedora 4 before OR2014 in Helsinki, Finland.

Who is invited?

  • All Committers,
  • Any other interested Fedora developers or technology-savvy individuals

If you don't fall into one of the above categories, you are still welcome to attend. However, be warned that discussion will likely get very technical at times (which is why we recommend you be a developer or have a technology background).

Logistics

Possible Agenda Topics

Please feel free to add your own ideas to our agenda (which will be finalized as we get closer to OR14).

  1. One-click feature walk-through
    • Building?
    • Data loading?
  2. Technical issues
    • Resource structure in Fedora: flat vs. hierarchical (transparency?)
    • Clustering use cases
    • Authorization
    • Ease of deployment
    • Metrics
    • Transparent filesystem
    • Audit logs
  3. Dev challenge topics
    • IIIF
    • ResourceSync
    • OAI-PMH
    • If-This-Then-That
    • Sequencers
  4. ...

Attendees

Agenda

9:00am : Introductions

  • Name, Institution, How are you using Fedora 4, what are your plans
  • Assign a notetaker!

9:15am - 10:30am : ...

  • How to get involved in development
  • Use cases
  • Beta efforts

10:30am - 11:00am : Coffee Break

  • Coffee to be provided

11:00am - 12:30pm: ...

  • Installation of Beta
  • Performance and technical decisions
  • Dev challenge

Minutes

  • Introductions
    • Andrew Woods
      • Fedora Tech Lead
    • Michael Durbin
      • UVa planning to migrate F3 to F4 within the next year
    • Fulgencio Sanmartín
      • Application over Fedora and semantic database for metadata
      • Will present on Friday
      • Using Fedora 3.6.1
      • Interested in F4 because of transactions
    • Wilhelm Frank
      • Using eScidoc on F3
      • Looking forward to F4
      • Working on a search tool
      • Concerns about scalability
    • Kai Sternad
      • F3 committer
      • Asynchronously committing to F4
      • F4 documentation is great, easy to get started
    • David Wilcox
      • Fedora Product Manager
    • Chris Beer
      • F4 developer
      • Working on Hydra and Spotlight
    • Chad Mills
      • Developer
      • Using Fedora since 2005
      • On Fedora 3.6.2
      • Objects have 20+ datastreams
      • Organization upgrades versions very slowly
      • Wants to test the Beta
    • Kathryn Cassidy
      • Developer
      • Using Fedora/Hydra
      • Interested in Fedora 4
    • Peter Tiernan
      • Systems and storage engineer
      • DevOps
      • Interested in Fedora 4 preservation, federation, object interface
    • Stuart Kenney 
      • Developer from Trinity College Dublin
    • Jimmy Tang
      • Also from Trinity College Dublin
    • Eric James
      • Developer
      • Using Fedora 3 for several years
      • Hydra partner
      • Using Fedora for OAI, preservation (PREMIS)
      • Building a preservation environment, looking for collaborators
  • How to get involved in development
    • There are no dedicated Fedora developers on staff at DuraSpace
    • Only Andrew and David are hired to work on the project
    • We need a deeper development team bench
    • We work in sprints (2 week commitments)
    • Jan-June, July-Dec
    • New developers should commit to 3 sprints (at least) because the first sprint will likely be ramp-up
    • Will anyone join as a developer?
      • Some are unfamiliar with Java/Fedora
      • Some are waiting for others to migrate before they do it
      • Fedora 4.0 does not target migrations
    • Use cases
      • Transactions
        • Important for updates to millions of objects
      • Statistics
        • Unpopular objects could be moved to slower, backend storage
        • There is a metrics library now
          • Java service that intercepts a request and pushes to an external service
      • Storage back-ends
        • Cloud storage (Amazon S3)
        • Some work has gone into connecting to different back-end stores
        • F4 supports federation
        • Simple Java API for read/write operations with some out-of-the-box implementations
        • File system federation exists now
        • Others are possible
        • Synchronous back-end stores are closer to being supported
        • Asynchronous storage is blocked by a Java library (Jersey) that needs to be updated
          • Ben started this work but was unable to finish
      • Migrations
        • Migration utility that transforms RELS-EXT into a path structure (or namespaces into a path structure)
      • Differences between Fedora 3 and Fedora 4
        • Object model
          • Fedora 4 uses a hierarchy
          • Objects with millions of children could present usability/performance issues
          • Benefit: backup/access control/disk allocation all much easier with hierarchy
          • Objects still have relationships and identifiers
        • Properties
          • RDF assertions (name/value pairs)
            • Name is a term from a namespaced vocabulary (standard or arbitrary)
            • Value is typed
          • Can change a single value without updating the entire record
          • Native metadata search (rather than needing GSearch/Solr)
          • A request on a Fedora object returns RDF
            • The subject is the item
            • The predicate is the name
            • The object is the value
          • Can be serialized as RDF/XML, N3, JSON-LD
        • Content models
          • New properties and children can be added
          • This can be open, or you can define expected properties and children objects
            • Can’t currently impose restrictions
          • Can’t create invalid objects - if an object does not meet requirements it does not get created
      • Beta efforts
        • The plan is to release the 4.0 production release this year
        • Getting from beta to production will require testers/pilots
      • Beta installation
        • Everyone installs either the one-click-run or the full install
      • Testing scenarios
        • Clustering
          • Set up two servers and make sure the cluster is highly available
          • Clustering is a beta feature
            • Works well in terms of redundancy but not load balancing
          • Fedora 3 vs. Fedora 4 comparison
      • Feature discussion
        • Identifiers
          • PID-minters are available (and more can be written)
          • Right now, objects are structured based on name
            • Performance degrades when more than 3,000 objects exist in a given directory
          • Should the user create the hierarchy, or should the hierarchy be auto-generated?
            • For auto-generated hierarchies, should they be transparent to the user?
              • Should support both
              • If a user provides a PID, the hierarchy should still be generated
              • The user should only see the PID they chose, not the entire hierarchy
            • There is a translation layer that separates the identifier the user sees from the back-end storage path 
          • Current use cases
            • Internal Fedora PID generator
        • File system
          • On disk, data is stored as JSON and serialized Java objects
          • Would a more transparent file system be better?
          • Export format is JCR/XML
          • Federation is a strong preservation use case because you can maintain a preservation-friendly file system
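
Illustration of the identifier discussion above: a minimal sketch, not Fedora 4's actual implementation, of how a translation layer could keep the user-facing PID separate from an auto-generated storage hierarchy. The function names `pid_to_path` and `path_to_pid`, the use of a SHA-1 hash, and the three levels of two-character segments are all assumptions chosen for the example. The sketch also assumes the PID itself contains no "/" character.

```python
import hashlib


def pid_to_path(pid: str, levels: int = 3, width: int = 2) -> str:
    """Map a user-chosen PID to a nested storage path (hypothetical helper).

    Directory segments are taken from a hash of the PID, so objects are
    spread across many directories instead of accumulating in one.
    """
    digest = hashlib.sha1(pid.encode("utf-8")).hexdigest()
    segments = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/" + "/".join(segments + [pid])


def path_to_pid(path: str) -> str:
    """Recover the identifier the user sees from a storage path.

    Assumes the PID is the final path segment and contains no "/".
    """
    return path.rsplit("/", 1)[-1]


storage_path = pid_to_path("demo:1")
visible_pid = path_to_pid(storage_path)
```

The user only ever works with `visible_pid`; the hashed segments in `storage_path` stay internal, which matches the "should be transparent to the user" position recorded in the minutes.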

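Illustration of the properties discussion above: a small sketch, under stated assumptions, of properties modeled as subject/predicate/object assertions, where a single value can be changed without rewriting the rest of the record. The `Triple` type, the `set_property` helper, and the item URI are hypothetical and do not reflect Fedora 4's API. The predicates are Dublin Core element URIs. Value typing is omitted for brevity.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    subject: str    # the item
    predicate: str  # the property name
    obj: str        # the property value (untyped in this sketch)


def set_property(triples: Set[Triple], subject: str, predicate: str,
                 new_value: str) -> Set[Triple]:
    """Return a new set in which one property of one subject is replaced.

    All other assertions are carried over unchanged.
    """
    kept = {t for t in triples
            if not (t.subject == subject and t.predicate == predicate)}
    kept.add(Triple(subject, predicate, new_value))
    return kept


ITEM = "http://localhost:8080/rest/demo1"  # hypothetical item URI
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

record = {
    Triple(ITEM, DC_TITLE, "Old title"),
    Triple(ITEM, DC_CREATOR, "A. Author"),
}
updated = set_property(record, ITEM, DC_TITLE, "New title")
```

After the call, `updated` holds the new title alongside the untouched creator assertion, while `record` is left as it was.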
 
