DSpace Architectural Review

Week of 23 Oct 2006

I. attendees

mackenzie smith* mit libs
john erickson* hplabs
john mark ockerbloom* penn
richard jones* imperial college
scott phillips* texas a&m
jim downing* u cambridge
richard rodgers* mit libs
graham trigg* biomed central
Gabriela Mircea* toronto
henry jerez* cnri
rob tansley* google
Mark Diggory* mit libs

II. welcome/introduction

III. review of materials

IV. Review of Manifesto

see: http://wiki.dspace.org/index.php/ArchReviewWorkingPrinciples

1. DSpace is primarily open source software for building digital repositories.
DSpace is intended to be free and open source software for digital repositories that enables services for access, provision, stewardship and re-use of digital assets with a focus on educational and research materials; i.e. to fulfill the mission of the DSpace Foundation.

(no discussion)

2. DSpace will be usable based purely on free and open source software.
Although setups including custom and/or proprietary features and technologies will be possible, it will always be possible to deploy DSpace using only free and open source software.

3. DSpace will have a decoupled, stable, and application-neutral core.
DSpace will attempt to identify a "core" of the system that supports a wide variety of applications, whose full scope is not bounded unnecessarily. It will define stable APIs to enable diverse and innovative applications and functionality to be built on this core, without need to modify the source code of the core.

4. Whilst usable for a variety of applications, DSpace will retain useful "out of the box" functionality for common use cases.
DSpace cannot support all the variable and emerging definitions and innovations in the repository space in a single interface application. DSpace will seek to provide out-of-the-box functionality for a common set of use cases (e.g. an OA preprints application, a general content archive) that can be installed with minimum possible effort, as well as modular support for the easy construction of new applications.

5. DSpace will employ and support existing, open standards where possible.
Wherever possible, DSpace will employ and support existing terminology, open standards and profiles. This is to promote interoperability of various kinds with other systems, and support the migration of data into or out of other systems.

6. DSpace releases should be minimally disruptive.
The architecture should reinforce good behavior in making changes/customizations/improvements to future releases of the system, so that upgrades are minimally disruptive for current adopters.

7. DSpace will support an exit strategy for content.
It will be possible to export all data necessary for the future re-use and stewardship of content held in a DSpace repository, in open and/or well-documented formats, for enabling migration into other systems and/or backup.

8. DSpace will continue to evolve.
There are many unsolved problems associated with stewardship of digital materials, which will require research and experimentation (including some failures) to solve. In addition to providing a robust, stable and functional system, DSpace will enable innovation and experimentation, and will be designed with the knowledge that future development and re-architecting will inevitably be necessary.
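
As an illustration of principles 3 and 7 above, here is a minimal sketch of an application-neutral core with a small public API, plus an exporter built on top of it without touching the core's source. Every name here (`RepositoryCore`, `deposit`, `add_listener`, `export_all`) is hypothetical and chosen for illustration only; none of it is DSpace's actual API, and Python is used purely for brevity.

```python
import json
from typing import Callable, Dict, List


class RepositoryCore:
    """Hypothetical stand-in for a stable, application-neutral core."""

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}
        self._listeners: List[Callable[[str, dict], None]] = []

    def add_listener(self, listener: Callable[[str, dict], None]) -> None:
        # Extension point: applications register callbacks instead of
        # editing the core (principle 3).
        self._listeners.append(listener)

    def deposit(self, item_id: str, metadata: dict) -> None:
        # Store a copy so callers cannot mutate core state afterwards.
        self._items[item_id] = dict(metadata)
        for listener in self._listeners:
            listener(item_id, dict(metadata))

    def get(self, item_id: str) -> dict:
        return dict(self._items[item_id])

    def item_ids(self) -> List[str]:
        return sorted(self._items)


def export_all(core: RepositoryCore) -> str:
    # Application-level exporter that uses only the public core API and
    # writes an open, documented format (JSON), in the spirit of principle 7.
    snapshot = {item_id: core.get(item_id) for item_id in core.item_ids()}
    return json.dumps(snapshot, sort_keys=True, indent=2)
```

An application would call `core.add_listener(...)` to hook in its own behavior (indexing, notification, and so on) and `export_all(core)` to produce a dump that another system could ingest.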

BREAK

V. Review of Requirements

1. Use of DSpace

2. Maturity

3. Version

4. Use priorities/content in dspace

5. Metadata

6. updating

7. Customization

8. Comfortable customizing?

9. Documentation

10. customization options?

11. Core code changes

12. User interface

13. third party customizations

14. Handles

15. More important feature/function

16. Main Problems

MK: talked to LoC

jmo: some at high-end want more from data model/application features

data model vs. ui?

hj: what is the canonical dspace use case?

What are the minimal changes to the data model that are the most useful?

What about scalability?

Other server/infrastructure things that DSpace doesn't know what to do with

SP: Can we put down size that we would like node of dspace to scale to?

SP: Cambridge: topping out with ~175,000 (many things breaking)
md/rob: questions of oai implementation

we need more well-designed tests to identify true bottlenecks

Support for different "Activities"

dm: binding of permissions to items

interoperability

deposit (uk thing)

TODO: We need STATEMENTS about interoperability and scalability

VI. Issues List
added issues:
jd: distribution/release management

ms: larger issue

jmo: What form of guidelines "published" to the community?

LUNCH

VII. Dealing with Issues

jmo: having "strawman" proposals for each of these would be

See: http://wiki.dspace.org/index.php/ArchReviewIssues

1. Data Model/Information Architecture (Tuesday)

2. Interfaces and Modularity (Tuesday)

4. Information Lifecycle management/workflow (Wednesday)

3. Concrete Model/Asset Repository (Thursday)

BREAK

VIII. Scoping Goals

See also: Bakery of Half-baked Ideas

1. Scalability: DSpace should exhibit "reasonable performance" under the following conditions