Table of Contents

Purpose and Summary

...

  1. Content Modeling Architecture
  2. Module Architecture
  3. Storage
  4. Interfaces

...

Tuesday Notes

Enhanced Content Models

Asger presented an overview of the Enhanced Content Model work, and we discussed which parts made sense to fold into the core Fedora distribution. The discussion focused primarily on the extension mechanism and schema + relationship validation.

...

To drive this work forward, we identified:

Datastream Methods

We had planned on discussing Asger's proposal for adding datastream methods to Fedora, but decided to discuss this later in the interest of time.

...

Wednesday Notes

High Level Storage

Aaron presented his proposal for a high level storage interface for Fedora, describing the motivation and use cases that it enables.

...

One of the major questions that Asger's presentation provoked was whether versioning was still important to do at the datastream level, or whether it can be done at the object level. We had a follow-on discussion about this.

The idea presented was: What if Fedora no longer held information about old versions in the DigitalObject class and in the stored FOXML? In other words, these would be designed to work only with the current version of the components of the object. If a datastream changed, a new version of the entire object would be made (and the object-level version number would be incremented), and older versions of the datastreams would be retained only if storage was configured to do so.

While discussing this, one concern we landed on was that there would no longer be a manifest pointing to all versions of everything stored within an object. We cut this portion of the discussion short in the interest of time.

Action: Continue this discussion with others in the Fedora community (wiki page, mailing list, etc.).
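The object-level versioning idea can be pictured roughly as follows. This is only a minimal sketch of the concept as discussed, not a proposed design: `VersionedObject`, `Storage`, and the `retain_old_versions` flag are hypothetical names invented for illustration and do not correspond to existing Fedora classes or configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class VersionedObject:
    """Hypothetical object that only knows its current datastreams."""
    pid: str
    version: int = 1
    datastreams: Dict[str, bytes] = field(default_factory=dict)


class Storage:
    """Hypothetical storage layer; old snapshots are kept only if configured."""

    def __init__(self, retain_old_versions: bool) -> None:
        self.retain_old_versions = retain_old_versions
        self._history: Dict[Tuple[str, int], Dict[str, bytes]] = {}

    def update_datastream(self, obj: VersionedObject, dsid: str, content: bytes) -> None:
        # Optionally snapshot the outgoing version before changing anything.
        if self.retain_old_versions:
            self._history[(obj.pid, obj.version)] = dict(obj.datastreams)
        # Any datastream change yields a new version of the whole object.
        obj.datastreams[dsid] = content
        obj.version += 1

    def get_version(self, pid: str, version: int) -> Optional[Dict[str, bytes]]:
        # Returns None when the old version was never retained.
        return self._history.get((pid, version))
```

Note that in this sketch the object itself carries no list of earlier versions, which is the "no manifest" concern raised in the discussion: only the storage layer can say which older versions still exist.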

To drive the high level storage work forward, we identified:

  • Lead: (smile) Aaron
  • Contrib: (thumbs up) Asger, (thumbs up) Dan, (thumbs up) Chris
  • Possible Contrib: (question) Lee Namba (re:Caching), (question) Kai, Others at FIZ (re:Versioning)

Semantic Web and Linked Data

Steve

WebDAV

Kai

Agenda

Tuesday

...

Welcome and Introductions (1 hr)

Topic: Content Modeling Architecture (4 hrs)

Topic: Module Architecture (1-2 hrs)
  • Report (Eddie/Chris): OSGi experience & discussion of strategy
  • Dependency injection framework: Spring?

Wednesday

Topic: Storage (2-2.5 hrs)
  • Proposal (Aaron): High level storage
    • Tree of stores idea (relates to multiplexing) (Asger)
    • Should versioning of datastreams go away? (Asger)
  • Hot Topics:

Topic: Interfaces (2-2.5 hrs)

...

Getting It Done (2 hrs)

Other Storage Topics

The original agenda had several "Hot Topics" defined for storage:

  • Hierarchical Storage Support
  • Multiplexing
  • In-Place Ingest
  • Large Datastreams
  • Replication and Messaging

While there was not enough time to discuss these topics by themselves, we touched on some of them as part of the high-level storage topic (hierarchical storage support, multiplexing) and the WebDAV topic (in-place ingest).

Semantic Web and Linked Data

Steve presented his latest thoughts on improving SemWeb and Linked Data support in Fedora.

We did not have much time to discuss these with the group, but the following ideas seemed well received and worth pursuing immediately:

  • (tick) Deprecate "Lite" APIs
  • (tick) HTTP URIs for RI queries: add a new parameter, scope = local|global, where local scope works with "info:" URIs and global scope works with "http:" URIs (translated on the way in and out)
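
The scope idea for RI queries could work along these lines. This is a rough sketch only: the `BASE_URL` value and the simple prefix-swap mapping are assumptions made for illustration, and the function names are hypothetical. The notes do not specify how "info:" URIs would actually map to "http:" URIs.

```python
INFO_PREFIX = "info:fedora/"
# Hypothetical base URL; a real deployment would supply its own.
BASE_URL = "http://example.org/fedora/objects/"


def to_global(uri: str, base_url: str = BASE_URL) -> str:
    """Rewrite a local "info:" URI as an "http:" URI (assumed prefix swap)."""
    if uri.startswith(INFO_PREFIX):
        return base_url + uri[len(INFO_PREFIX):]
    return uri


def to_local(uri: str, base_url: str = BASE_URL) -> str:
    """Rewrite an "http:" URI under base_url back to its "info:" form."""
    if uri.startswith(base_url):
        return INFO_PREFIX + uri[len(base_url):]
    return uri


def translate_results(rows, scope: str):
    """Translate every term in query result rows for the requested scope."""
    if scope not in ("local", "global"):
        raise ValueError("scope must be 'local' or 'global'")
    convert = to_local if scope == "local" else to_global
    return [tuple(convert(term) for term in row) for row in rows]
```

The same two functions would be applied in opposite directions on the way in (query terms) and on the way out (results), so the index itself would only ever see one URI form.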

Steve also touched on the following ideas, which need some more experimentation/specification:

  • (info) Storing quads vs triples: Each object's triples are in a graph.  Performance? Compatibility w/multiple triplestores?
  • (info) Graph hierarchy: Does it make sense to have addressable graphs for each datastream as well?  Or just per-object.  Advantages/disadvantages.
  • (info) Declarative specification of which datastreams/triples to index.
    • Could be driven on a per-cmodel basis.
    • For base triples, system object methods might specify which triples to "generate".
  • (info) REST API for PUT/POST/DELETE of triples on a per-object basis.  Interestingly, this could allow for RDF management without the requirement that an index is present.  We discussed at which endpoint this logically fits: object/datastreams/datastream (straightforward to figure out where triples are stored), or object (can be figured out in some cases, but hard in other cases).
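
To make the quads and per-object triple management ideas a bit more concrete, here is a small in-memory sketch. It assumes one named graph per object (or per datastream, depending on which graph name the caller passes in) and maps PUT, POST, and DELETE to replace, add, and remove. The class and method names are hypothetical; nothing here reflects an agreed API or any particular triplestore.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]


class ObjectGraphStore:
    """Hypothetical store keeping each object's triples in its own graph."""

    def __init__(self) -> None:
        self._graphs: Dict[str, Set[Triple]] = defaultdict(set)

    def put(self, graph: str, triples: Iterable[Triple]) -> None:
        # PUT: replace the whole graph.
        self._graphs[graph] = set(triples)

    def post(self, graph: str, triples: Iterable[Triple]) -> None:
        # POST: add triples to the graph.
        self._graphs[graph].update(triples)

    def delete(self, graph: str, triples: Iterable[Triple]) -> None:
        # DELETE: remove the given triples from the graph.
        self._graphs[graph].difference_update(triples)

    def quads(self) -> List[Quad]:
        # Flatten to (subject, predicate, object, graph) quads, sorted.
        return sorted(
            (s, p, o, g)
            for g, triples in self._graphs.items()
            for (s, p, o) in triples
        )
```

Because the graph name is just a string key, the same sketch covers both options raised above: a graph per object, or a finer-grained graph per datastream.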

To drive this work forward, we identified:

  • Lead: (smile) Steve
  • Contrib: (thumbs up) Asger, (thumbs up) Ben
  • Possible Contrib: (question) Paul Gearon

WebDAV (or not..)

Kai led a discussion on having a WebDAV interface for Fedora.  One of the major motivating use cases was to have an easy way to ingest content into the repository.

We talked about the fact that WebDAV might not actually fit the bill here because OS-level clients (what most people would presumably be using for the "easy drag and drop case") are generally not that great.  OSX likes to do unnecessary locking for all writes.  Windows' client has been poorly supported for some time as well.

During the discussion, an alternative idea came up: If we want an easy drag-and-drop interface for the repository, how about ftp:?  OS clients' support for FTP is generally very good, and clients and servers already exist for FTP.  So the real problem for us is simplified to: How do you turn a directory full of files into Fedora objects?

This elicited a positive response from the group, especially for the simple "Drag and Drop" ingest scenario.

  • (tick) Develop a "drop box" module for the Fedora server, which scans a directory periodically and when new items come in, wraps them and ingests them as Fedora objects.  This would be a prototype initially, and wouldn't have to be developed as part of the core.
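
One pass of such a "drop box" scan might look roughly like this. This is a prototype-level sketch under stated assumptions: `scan_drop_box` and the `ingest` callback are hypothetical names, and how a file gets wrapped into a Fedora object is deliberately left to the callback, since the notes do not specify it.

```python
import os
from typing import Callable, List, Set


def scan_drop_box(
    directory: str,
    seen: Set[str],
    ingest: Callable[[str, bytes], None],
) -> List[str]:
    """Ingest files in `directory` not seen before; return their names."""
    newly_ingested = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything already handled in an earlier pass.
        if name in seen or not os.path.isfile(path):
            continue
        with open(path, "rb") as handle:
            # `ingest` is a placeholder for wrapping and ingesting the content.
            ingest(name, handle.read())
        seen.add(name)
        newly_ingested.append(name)
    return newly_ingested
```

A periodic scanner would call this in a loop with a sleep between passes, keeping the `seen` set across calls. The sketch does not guard against files that are still being uploaded when a scan runs, which a real module would need to handle.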

Another idea that came up was that of a "live box", where content doesn't actually get moved out of the directory it's placed into, but is pointed to as a kind of "in-place" ingest.  We just scratched the surface of this latter idea.

To drive this work forward, we identified:

  • Lead: (smile) Kai
  • Contrib: (thumbs up) Dan

Session lead: Thorny
Identify:

...

Attendees

  • Aaron Birkland (Cornell)
  • Andrew Woods (DuraSpace)
  • Asger Askov Blekinge (State & Univ Lib, Denmark)
  • Ben Armintor (Columbia U)
  • Bill Branan (DuraSpace)
  • Brad McLean (DuraSpace)
  • Chris Wilper (DuraSpace)
  • Dan Davis (Cornell)
  • Edwin Shin (MediaShelf)
  • Gert Pedersen (Tech Univ of Denmark)
  • Kai Strnad (FIZ Karlsruhe)
  • Paul Pound (UPEI)
  • Simon Lamb (Hull)
  • Stephen Bayliss (Acuity Unlimited)
  • Tim Donohue (DuraSpace)
  • Thorny Staples (DuraSpace)