Technical Refactoring and Architectural Proposals

This page collects possible technical refactoring proposals for moving the trunk towards greater modularity and pluggability.

Refactor DSpace API into separate modules where appropriate

Why? Because we cannot maintain modules independently of the core when they depend on a specific core release version and are then released as part of the core. It may sound convoluted, but this is what happened with dspace-stats in DSpace 1.6.0. Our initial goal was to maintain the stats packages independently, depending on a specific release of DSpace services. Individual user interfaces would still have resided in the dspace-xmlui and dspace-jspui cores. The goal was to exemplify module development occurring outside the core and being easily added into it for the release process.

Why did this fail? It failed because of circular dependencies between dspace-statistics and dspace-api. This is something that the work in DSpace 2.0 sought to eliminate by allowing the creation of APIs, separate from their implementations, that do not depend on a central core containing all the features. A successful example is dspace-discovery: once it is finished, it should depend only on dspace-services and Solr, and eventually it will not even depend on dspace-api.

The DSpace API is an entangled mass of conflated functionality. It is highly recommended that the DSpace API be split up by functionality to support greater separation between content model, applications, and utilities. The proposed areas of functionality need further analysis but look like the following:

The ultimate goal in separating out this functionality is to get the core of DSpace clearly defined as a set of core Service APIs backed by an encapsulated implementation. DSpace applications should depend only on the defined Services API and not on the actual classes implementing the backend functionality.

Outlined below are some possible directions to take once this refactoring begins.
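As a rough illustration of that separation, the sketch below shows application code compiled against an interface only, with the implementation hidden behind it. All names here (SiteConfig, MapSiteConfig, SiteBanner, the "dspace.name" key) are hypothetical and are not taken from dspace-services; the sketch assumes nothing about how the real services are wired.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical service API: the only type that applications compile against. */
interface SiteConfig {
    Optional<String> getProperty(String key);
}

/** Hypothetical encapsulated implementation; applications never name this class. */
final class MapSiteConfig implements SiteConfig {
    private final Map<String, String> values;

    MapSiteConfig(Map<String, String> values) {
        this.values = Map.copyOf(values); // immutable defensive copy
    }

    @Override
    public Optional<String> getProperty(String key) {
        return Optional.ofNullable(values.get(key));
    }
}

/** Hypothetical application code that depends on the API, not on the backend. */
final class SiteBanner {
    private final SiteConfig config;

    SiteBanner(SiteConfig config) {
        this.config = config;
    }

    String render() {
        // Falls back to a default when the property is not configured.
        return "Welcome to " + config.getProperty("dspace.name").orElse("DSpace");
    }
}
```

Because SiteBanner only sees SiteConfig, the backing implementation could be swapped (properties file, database, remote service) without recompiling the application module.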

Refactor DSpace ConfigurationManager to use Configuration Service

This proposal is centered around continuing an effort to break apart the existing core DSpace modules to support further modularity. Firstly, I will present the problem.

Refactor DSpace EventManager to use EventService

The DSpace EventService currently processes only usage events, but it can also process all other events. EventManager should be dropped in favor of EventService: all usages of EventManager should be replaced with EventService, and event Consumers should be rewritten as EventListeners attached to the EventService.
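One way such a migration could proceed is to wrap existing consumers in a listener adapter so they can be attached to a single event service before being rewritten. The sketch below is a minimal stand-in under that assumption: RepoEvent, RepoEventListener, LegacyConsumer and SimpleEventService are hypothetical names and do not reproduce the actual DSpace EventService or Consumer signatures.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical event: a name such as "item.modified" plus the affected resource. */
final class RepoEvent {
    final String name;
    final String resourceId;

    RepoEvent(String name, String resourceId) {
        this.name = name;
        this.resourceId = resourceId;
    }
}

/** Hypothetical listener contract attached to the event service. */
interface RepoEventListener {
    void receive(RepoEvent event);
}

/** Hypothetical stand-in for an old-style consumer that is not yet rewritten. */
interface LegacyConsumer {
    void consume(String eventName, String resourceId);
}

/** Minimal synchronous event service: every event goes to every listener. */
final class SimpleEventService {
    // CopyOnWriteArrayList lets listeners register while events are being fired.
    private final List<RepoEventListener> listeners = new CopyOnWriteArrayList<>();

    void register(RepoEventListener listener) {
        listeners.add(listener);
    }

    void fire(RepoEvent event) {
        for (RepoEventListener listener : listeners) {
            listener.receive(event);
        }
    }

    /** Adapter: lets a legacy consumer receive events as an ordinary listener. */
    static RepoEventListener adapt(LegacyConsumer consumer) {
        return event -> consumer.consume(event.name, event.resourceId);
    }
}
```

With an adapter of this kind, usage-event listeners and former consumers share one dispatch path, and each consumer can later be rewritten as a native listener without touching the callers.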

Add DSpace Scheduling Services: Add Quartz Job Scheduler

Using Quartz as a utility to manage asynchronous eventing in DSpace Services, we can set up a job scheduling environment in the DSpace web application that is consistent across platforms. Likewise, jobs can be managed so that they persist across Tomcat sessions and restarts, and repository administrators gain the ability to manage the scheduling and de-scheduling of activities.

See: http://jira.dspace.org/jira/browse/DSRV-5

Benefits: centralized job scheduling, managed by repository administrators.

Re-factor Harvester Multithreading to use Quartz Job Scheduler

Re-factor the OAI Harvester Thread implementation to utilize the Quartz Job Scheduler.
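To make the idea of a centralized, administrator-managed job registry concrete, here is a small sketch. It deliberately uses the JDK's ScheduledExecutorService as a stand-in rather than Quartz, so it runs without extra dependencies; JobRegistry and the job names are hypothetical. Unlike the Quartz setup proposed here, this stand-in keeps jobs in memory only and does not persist them across restarts.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical central registry of named, repeating jobs (JDK stand-in, not Quartz). */
final class JobRegistry implements AutoCloseable {
    // Daemon thread so a forgotten registry cannot keep the JVM alive.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "job-registry");
                t.setDaemon(true);
                return t;
            });
    private final Map<String, ScheduledFuture<?>> jobs = new ConcurrentHashMap<>();

    /** Schedules a named job at a fixed rate, replacing any job with the same name. */
    void schedule(String name, Runnable task, long periodMillis) {
        ScheduledFuture<?> handle =
                executor.scheduleAtFixedRate(task, 0, periodMillis, TimeUnit.MILLISECONDS);
        ScheduledFuture<?> previous = jobs.put(name, handle);
        if (previous != null) {
            previous.cancel(false); // let a running execution finish
        }
    }

    /** De-schedules a job; returns false if no job with that name was registered. */
    boolean unschedule(String name) {
        ScheduledFuture<?> handle = jobs.remove(name);
        return handle != null && handle.cancel(false);
    }

    /** Sorted job names, e.g. for listing in an administrative UI. */
    Set<String> scheduledJobs() {
        return new TreeSet<>(jobs.keySet());
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

An administrative screen would only need scheduledJobs(), schedule() and unschedule(); in a Quartz-based version the same three operations would be delegated to the scheduler, with a persistent job store covering restarts.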

Create XMLUI User Interface to view and adjust scheduling of registered Quartz Jobs

An XMLUI Aspect for administering Jobs would be accessed/listed in a "System" section of the Options.

Re-factor MediaFilter manager and DSSearch/Browse indexers to run as Quartz Jobs

Browse and Search reindexing can be scheduled to happen asynchronously after an item is updated, rather than on a fixed schedule or during the request/response cycle, making DSpace responses to users faster and more scalable.

Search/Browse job scheduling can also be used to control indexing during content import, so that large batch processes can run without browse or search indexing occurring until afterward.
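The two behaviours described above (reindexing outside the request cycle, and holding indexing back during a batch import) can be sketched with a small pending-item queue that a scheduled job drains. DeferredIndexQueue and its method names are hypothetical, and the indexer is passed in as a plain callback rather than any real DSpace search or browse class.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical queue of items awaiting reindexing, drained by a scheduled job. */
final class DeferredIndexQueue {
    // LinkedHashSet de-duplicates repeated updates while keeping arrival order.
    private final Set<String> pending = new LinkedHashSet<>();
    private final Consumer<List<String>> indexer;
    private boolean paused;

    DeferredIndexQueue(Consumer<List<String>> indexer) {
        this.indexer = indexer;
    }

    /** Called on the request thread: records the item and returns immediately. */
    synchronized void itemUpdated(String itemId) {
        pending.add(itemId);
    }

    /** Call before a large batch import to hold indexing back. */
    synchronized void pause() {
        paused = true;
    }

    /** Call after the import; the next job run indexes everything queued so far. */
    synchronized void resume() {
        paused = false;
    }

    /** Intended to be invoked by the scheduler; returns how many items were indexed. */
    int runIndexJob() {
        List<String> batch;
        synchronized (this) {
            if (paused || pending.isEmpty()) {
                return 0;
            }
            batch = new ArrayList<>(pending);
            pending.clear();
        }
        indexer.accept(batch); // index outside the lock so updates are never blocked
        return batch.size();
    }
}
```

In this arrangement an importer would call pause(), load its items, then call resume(), and the first job run afterwards indexes the whole batch in one pass.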
