Version 1.7.0

...

Tip: DSpace 1.7.0 was officially released to the public on December 17, 2010.

DSpace 1.7.0 can be downloaded immediately at either of the following locations:

Warning: Recommended to Upgrade to DSpace 1.7.1

We recommend all DSpace 1.7.0 users upgrade to DSpace 1.7.1, to receive several bug fixes along with a medium-level security fix. For more information see:


Contributors are strongly encouraged to obtain the source code using Subversion (SVN). This is very straightforward, and we've published a guide to doing so here: ContributionGuidelines

DSpace 1.7.0 is a scheduled, "time-based" release. In order to decrease delays in releasing new features and increase transparency, the DSpace Developers scheduled 1.7.0 in advance and based its features on what we were able to complete within that timeframe. Despite the fact that 1.7.0 had a much tighter timeframe than previous major releases, the developers have managed to include some significant new features, numerous bug fixes and performance improvements.

Scheduling releases benefits us all, as it should decrease the delays in releasing new features and increase the transparency of the development process. The DSpace Developers feel that these benefits will far outweigh the cost of potentially having fewer major features in a given DSpace release. We hope the DSpace Community will also realize the immediate benefits, which should allow them to receive new features more quickly, rather than potentially waiting years for the next major release of the software.

Organizational

Release Coordination

  • Release Coordinator: Peter Dietz, Ohio State University Libraries

1.7.0 Code Committing Rules:

  1. No incomplete features in "Trunk", ever. If you are in-progress on a feature, create an SVN sandbox or branch area to work on it, and pull it over to Trunk once it has been completed and is ready for testing & release. (This will hopefully help us to avoid encountering a situation where we need to revert a large, unfinished patch at the last moment.)
    • Modules which do not reside in DSpace "Trunk" should also follow this rule. If there is major rework to be done on a module, create a "branch" to do that work and migrate it back to the Module's Trunk once it is ready for release.
  2. All new features must have documentation before committing to Trunk. If a feature has no documentation, it should not be committed to Trunk until there is some minimal documentation (minimal documentation includes documenting all configuration options). This will help us all ensure that Documentation is ready by the time we get to 1.7.0 RC1, and hopefully lessen the time that any one person has to spend cleaning up or rewriting documentation.

Timeline and Proceeding

Proposed Release Timeline (all dates are tentative):

  • August 13, 2010 : Milestone 1 - "Feature Decision Day"
    • By this milestone, all major features or major architectural changes for 1.7.0 release should be approved by the DSpace Committers Group and be somewhere in SVN (they need not be fully finished, but should be moving along in their development process).
  • October 22, 2010 : Feature Freeze (all features must have initial documentation to be accepted)
    • All 1.7.0 features (major and minor) must be finished, committed to Trunk and have initial documentation. After this date, no new features will be accepted for the 1.7.0 release. Any features which are not finished or ready will need to be scheduled for the next DSpace release.
    • Modules which do not reside in Trunk (e.g. dspace-services) should also undergo a Feature Freeze on this date, so that we can work to stabilize all code used by out-of-the-box DSpace.
  • October 29, 2010 : Final Documentation "Due Date"
    • All documentation changes need to be submitted, so that they can be cleaned & packaged up in preparation for RC1. This includes Upgrade Documentation, Release Notes, etc.
      • Obviously, if errors are found in docs, they can be changed during testing process, etc. But, the goal here is to attempt to have as close to a final version of the Documentation as possible, in preparation for the RC1 release.
  • November 5, 2010 : Release Candidate 1
  • November 8-12, 2010 : 1.7 Testathon Week
  • December 3, 2010 : Release Candidate 2 (if necessary), or Final Release
  • December 6-15, 2010 : Final Testing / Bug Fixing (if necessary)
  • December 17, 2010 : Final Release

Release Process needs to proceed according to the following Maven release process: ReleaseProcedure

New Features

NOTE: The source code is still volatile as fixes and improvements are still flowing in.

  • AIP Backup / Restore (Tim Donohue/DuraSpace) – Allows for a more complete backup of DSpace into generic METS-based packages known as Archival Information Packages (AIPs). These AIPs could also be used to migrate DSpace content (Communities/Collections/Items) between DSpace and non-DSpace instances that support AIP.
  • DSpace Discovery (contributed by @mire, NV.): a Solr-based, faceted search layer that provides a deeper and more intuitive look at repository contents.
  • Unit Testing, which allows each code component to be tested to verify that it does what it is intended to do.
  • Most Used Items list, which can replace or complement the existing Recently Submitted Items list.
  • PowerPoint Text Extraction Media Filter - Allows for full-text searching of PowerPoint bitstreams

New Features in XMLUI

  • Two XMLUI themes (contributed by @mire, NV.)
    • A function library to aid xmlui theme development which refactors the existing dri2xhtml.
    • Mirage, a theme which is another new look utilizing the new xmlui function library

New Features in JSPUI

  • Added the ability to redirect to the current page after authentication

Improvements

Graham Triggs has launched a Code Quality crusade and has dug into deep and dark corners of the DSpace source code armed with nothing more than a torch, machete, and a fancy plugin called QAplug to systematically drive out evil spirits haunting the DSpace code. See: https://jira.duraspace.org/browse/DS-707

The DSpace Developers hope to continue this trend of "time based" releases with all future releases.

New features in DSpace 1.7

Mirage, a clean and professional looking theme for XMLUI.
dri2xhtml-alt, an XMLUI framework that eases theme development.
See Mirage in action at http://demo.dspace.org/xmlui. For a video demo see: http://www.youtube.com/watch?v=Pq3d_oD-4aM

Mirage was contributed by @mire, NV.

Discovery, a faceted browsing and searching interface that gives a deeper and more intuitive look at repository contents.
See Discovery in action at http://demo.dspace.org/xmlui. For more information, watch the video introduction to Discovery.

Discovery was contributed by @mire, NV.

Archival Information Package (AIP) Backup & Restore process. Allows for a backup of DSpace into a generic METS-based structure that can be used to migrate DSpace content to another system that supports AIPs (DSpace or non-DSpace). This backup and restore functionality also allows one to back up to cloud storage services like DuraCloud, though it could just as easily be used to back up to tape or a hard drive.

Added by Tim Donohue (DuraSpace)

Curation System, a framework for building and running tasks to help a Curator preserve and improve your repository contents.
Tasks can be run on communities, collections, and items through the command line for cron-tasks, or through the User Interface for admins.
The initial tasks available are:

  • Profile Bitstream Formats -- counts the number of bitstreams that share the same file format extension
  • Virus Scan -- inspects the bitstreams with a virus scanner (ClamAV) to detect if they contain viruses
  • Check for Required Metadata -- checks that item metadata has values for all fields marked as required in the input-form

Added by Richard Rodgers (MIT)

Automated Unit Testing of core code -- helps the developers ensure that DSpace is as bug free and stable as possible. Unit Testing coupled with continuous integration on our Bamboo server allows us to validate every change to the DSpace code base, letting us know immediately if a change broke another feature.

Added by Pere Villega, a product of DSpace Summer of Code 2010 (mentor Stuart Lewis).

 

Improved Google Scholar metadata exposure. Additional citation_ tags have been exposed to allow the Google Scholar crawler to better find and associate repository metadata and PDF content.

Added by Sands Fish, Richard Rodgers (MIT) and Peter Dietz (Ohio State)

 

PowerPoint text extraction, for searching within PowerPoint slides

Added by Keith Gilbertson (Georgia Tech)

 

Top 10 Most Visited items list, available for the overall site.

Performance Improvements

Performance and Scalability improvements. The code has been thoroughly analyzed by a suite of code quality tools to find blatant errors and omissions, identify more efficient ways of doing things, and apply general best practices. Additionally, substantial performance gains have been made in item ingestion and indexing speed. This was tested with a sample-data-generator, which added 400,000 items to a repository already containing 100,000 items; ingest speed improved from seconds per item to items per second. Adding so many items would previously have taken weeks or more, but the latest run was done in 10 hours on a laptop. See more at DS-707.
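
To put the reported figures in perspective, the average ingest rate can be estimated from the two numbers quoted above. This is only a back-of-the-envelope sketch: it assumes the 10-hour run covered all 400,000 added items, and the function name is purely illustrative.

```shell
#!/bin/sh
# Rough average ingest rate, in whole items per second.
# Assumption: the entire batch was ingested within the quoted wall-clock time.
ingest_rate() {
    items=$1   # number of items ingested
    hours=$2   # elapsed time in hours
    echo $(( items / (hours * 3600) ))
}

ingest_rate 400000 10   # prints 11
```

By the same arithmetic, a hypothetical rate of 3 seconds per item would stretch the same batch to roughly two weeks, which is consistent with the "seconds per item" versus "items per second" description.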

Many thanks go out to Graham Triggs (BioMed Central) for many sleepless weeks to vastly overhaul many weak links.

In addition to that, the other general improvements are:

  • Reducing the cost of browse prunes
  • Solr now uses autoCommit, which reduces resource exhaustion
  • Added solr.optimize, which can be called via the command line (e.g. from cron tasks) to essentially "defragment" the Solr index, resulting in slightly better Solr performance over time
  • Item bitstream sorting/ordering can be specified according to sequence or name
  • Moving the documentation into the Confluence wiki so that the workload can be divided and the documentation improved

Bug Fixes

Major Bug Fixes include:

  • Batch Metadata Import will now validate metadata fields in CSVs
  • Restricted items and metadata are better protected from exposure via web services (OAI)
  • File handle leak in ItemImporter closed.  Fixes issues when max_files_open exceeded on some systems.
  • Database connections released when no longer needed in xmlui BitstreamReader.  Fixes problem getting connections from the database pool while simultaneously downloading multiple large files.

Changes

For a full list of all changes (new features, improvements, and bug fixes), please visit the Changes in DSpace 1.7.0 section of the new wiki-based DSpace Documentation.

Removed / Deprecated

...

Most command line scripts that have historically resided in [dspace]/bin/ were deprecated in 1.6.x, and are now removed in 1.7.0. They have been replaced with the configurable command launcher, which eases the cross-platform development of scripts. Full details of the discussion are at: http://jira.dspace.org/jira/browse/DS-646.

Example usage of the launcher:

The old way will no longer work, as the task scripts have been removed:

Code Block

[dspace]/bin/create-administrator

That functionality is now all performed by the centralized DSpace launcher, e.g.:

Code Block

[dspace]/bin/dspace create-administrator

Unsure / Postponed for Next Release

Note: Developers, you need to contact the release coordinator and get approval to commit a feature that did not make it in by the feature freeze date. The sooner you tackle this the better the chance that your code will not cause a problem with the release schedule.

A few projects were strongly desired for DSpace 1.7; however, more work is needed to get them into the release, or they will be postponed to the next release of DSpace, likely 1.8.

Rewrite of Creative Commons licensing (MIT - ready to go) – would improve upon the features of the current CC licensing submission step

  • Currently only against XMLUI from MIT
  • Legacy problem – do we update old license to new or not? Currently MIT runs 'split version' with old licenses looking like old, and new look like new.

Google Scholar work (MIT - ready to go) – better metadata for Google Scholar (citation tags in header).

CGIProposal (Richard Rodgers/MIT -- Interface & XML serialization implementation should be ready), based on the Item type based submission patch (http://jira.dspace.org/jira/browse/DS-464) picked up by Robin Taylor (initially a GSoC project) -- would allow for type-based submission processes (e.g. Theses/Dissertations could have different submission steps than articles/papers).

  • Context Guided Ingest -- define an interface, where any submission code can write "attributes" and can retrieve those again later on. Can add any new attributes/values that you want for your submission code. Could be serialized to XML (using input-forms.xml) OR have an implementation of that service that stores in DB (recommended). JPA2?
  • seems similar to SimpleStorage Service (user centered storage of state info) -- Mark Diggory.

CurationTaskProposal (Richard Rodgers/MIT -- some will be ready) -- would allow for a more standard way to integrate curation tools (e.g. virus scanning, format identification, etc) into DSpace. Lightweight framework to attach curation tasks -- 3-4 tasks for 1.7. These could be kicked off by commandline (in batch) or Admin UI (potentially):

  1. Automated replication
  2. Streamlined Checksum Checker
  3. Virus Checker - ClamAV (Tcp socket communication)
  4. ??? Better content format identification (may not be ready for 1.7)

Could relate to Scheduler service (Spring based) in Modules area. Allows you to register & schedule events. -- Mark Diggory

Possibly one or more of the Google Summer of Code 2010 projects (if they are ready/stable enough for release)

  • REST API? -- Based on previous discussion on the developers list / IRC channel, it will most probably be released asynchronously. It is in progress, but the RC1 deadline is too short for all parts to be implemented. -- Bojan Suzic

SWORD Client for DSpace? (Robin Taylor – may be ready, Richard Jones & Stuart Lewis are interested in helping) – would allow DSpace to push/submit content to other SWORD enabled repositories

  • For closed & open access repositories – add a button to transfer content from a closed to an open repository.

Calling a command by its full classname still works by adding dsrun before the classname.

Code Block
[dspace]/bin/dspace dsrun org.dspace.administer.CreateAdministrator
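
The two launcher forms shown in this section (a named command, or dsrun followed by a full classname) can be summarized in a small helper. This is a minimal sketch, not part of DSpace: the function name and the DSPACE_DIR variable (standing in for the [dspace] installation directory) are assumptions made for illustration, and the helper only prints the command line rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the launcher command line for a given target.
# DSPACE_DIR is an assumed variable standing in for the [dspace] directory.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

launcher_command() {
    # $1 is either a launcher command name (e.g. create-administrator)
    # or a fully qualified class name, recognised here by the dots in it.
    case "$1" in
        *.*) printf '%s/bin/dspace dsrun %s\n' "$DSPACE_DIR" "$1" ;;
        *)   printf '%s/bin/dspace %s\n' "$DSPACE_DIR" "$1" ;;
    esac
}

launcher_command create-administrator
launcher_command org.dspace.administer.CreateAdministrator
```

Anyone still holding cron entries that call the removed scripts directly could use a mapping like this as a checklist when rewriting them for the launcher.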

Organizational Details

Release Coordination

  • Release Coordinator: Peter Dietz, Ohio State University Libraries
  • Release Coordinator: Tim Donohue, DuraSpace

Timeline and Proceeding

Release Timeline:

  • (tick) August 13, 2010 : Milestone 1 - "Feature Decision Day"
  • (tick) October 22, 2010 : Feature Freeze
  • (tick) October 29, 2010 : Final Documentation "Due Date"
  • (tick) November 5, 2010 : Release Candidate 1
  • (tick) November 8-19, 2010 : 1.7 Testathon
  • (tick) December 3, 2010 : Release Candidate 2
  • (tick) December 6-15, 2010 : Final Testing / Bug Fixing
  • (tick) December 17, 2010 : Final Release

Release Process needs to proceed according to the following Maven release process: Release Procedure

Postponed for a Future Release

The following projects were considered for 1.7, but were not stable enough to be included. They need further review and development from the stakeholders before they are suitable for widespread use, and may be considered for a future release of DSpace. The next release for which they will be reconsidered is 1.8.

  • REST API - uses standard web services to create, read, update, and delete (CRUD) DSpace Objects. A product of a previous GSoC.
  • SWORD Client for DSpace – (Robin Taylor, and possibly Richard Jones & Stuart Lewis)
    • would allow DSpace to push/submit content to other SWORD enabled repositories
    • For closed & open access repositories – add a button to transfer content from a closed to an open repository.
  • CGIProposal (Richard Rodgers/MIT)
    • would allow for type-based submission processes (e.g. Theses/Dissertations could have different submission steps than articles/papers).
    • based on the Item type based submission patch picked up by Robin Taylor (initially a GSoC project)
  • Context Guided Ingest – define an interface, where any submission code can write "attributes" and can retrieve those again later on. Can add any new attributes/values that you want for your submission code. Could be serialized to XML (using input-forms.xml) OR have an implementation of that service that stores in DB (recommended). JPA2?
    • seems similar to SimpleStorage Service (user centered storage of state info) – Mark Diggory.
  • Rewrite of Creative Commons licensing (MIT)
    • would improve upon the features of the current CC licensing submission step
    • Currently only against XMLUI from MIT
    • Legacy problem – do we update old license to new or not? Currently MIT runs 'split version' with old licenses looking like old, and new look like new.

...