This page describes the enhanced/reloadable configuration feature, based on Apache Commons Configuration, which has been submitted for possible inclusion in DSpace 6.

TESTERS NEEDED! While the basics of this functionality "work" (see PR above), this change affects how every configuration is read by DSpace, because Apache Commons Configuration has its own Properties file syntax (see below for more on that).

This means it's likely that some specific features (especially optional ones) may need to have their configuration file/settings tweaked. I've done my best to already fix the configurations of out-of-the-box features, but have not yet tested all optional features.

 

Overview

In DSpace 5 and below, DSpace used its own custom Property-based configuration scheme, along with a custom build.properties which could tweak the build/compilation process in order to "override" some pre-selected configurations in the dspace.cfg file. While this configuration scheme "worked" at a basic level, it required a lot of custom variable interpolation (i.e. filtering) to occur in both the Maven build process (mvn package) and the Ant install process (ant fresh_install or ant update). The end result was that the configuration files in your DSpace installation directory ([dspace.dir]) contained the correct settings from your build.properties file, but with every variable (${setting}) already replaced by its value. So, it was no longer possible to easily tweak certain key settings (like dspace.dir or solr.server) without either re-running the entire build process or making corrections to several files at once.

Enter Apache Commons Configuration.

The Enhanced Configuration Scheme feature uses Apache Commons Configuration (version 1.10) as the new configuration scheme for DSpace. This provides several key advantages over our old, custom configuration scheme:

  • Apache Commons Configuration is a well-established Java library whose goal is to make configuration more flexible and easier to manage.
  • It automatically interpolates all settings at runtime. This means we no longer need to replace variables (${setting}) within our configurations at build time. They are resolved at runtime based on the value of that variable within one of the configuration files. For more on variable interpolation see its Basic Features documentation.
  • It is a flexible configuration scheme. It can read configurations from several sources at once, including Properties files, XML config files and even database tables (see its Overview documentation). Currently, in the DSpace Enhanced Configuration Scheme we are still only using Properties files, similar to DSpace 5 and below. But we will now be able to easily move all or some configurations to XML configs or database config tables.
    • The locations of the configuration sources can be easily customized by DSpace administrators in a new config-definition.xml file, which configures Apache Commons Configuration for DSpace. More on that below.
    • The config-definition.xml file itself is simply a "configuration definition" file as defined by Apache Commons Configuration. See the Configuration File Documentation for more details.
  • It allows for easy overriding of configuration values from other sources. How the overrides occur is up to how you've configured Apache Commons Configuration.  For DSpace, we have a new config-definition.xml which defines the following override scheme (again, this can be easily tweaked for local needs):
    • If a setting is specified in Java System Properties (e.g. -D[setting]=[value]), it overrides the same setting found in any location listed below
    • If a setting is specified as an Environment Variable, it overrides the same setting found in any location listed below
    • If a setting is specified in the new local.cfg configuration file, it overrides the default value found in any location listed below
    • Default values for all settings are specified in the dspace.cfg and the modules/*.cfg configuration files.
  • It supports enhanced Properties files. This means our dspace.cfg, local.cfg and other configuration files can now immediately support some enhanced options, including:
    • The ability to easily include other configuration files via: "include=[config-file-location]"  (See the end of the updated dspace.cfg for examples)
    • The ability to provide lists of values to "array" configurations by specifying the setting multiple times (rather than creating a giant comma separated configuration spanning multiple lines). For example, enabling both LDAP and Password authentication can now be done via these two lines:
      • plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.LDAPAuthentication
      • plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication
    • For more information see the Commons Config Properties File documentation
  • More information/ features can also be found in the Apache Commons Configuration v1.10 User Guide
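The repeated-key behaviour described above differs from plain java.util.Properties, which keeps only the last value loaded for a key. The following is a minimal, JDK-only sketch of the idea and not how Apache Commons Configuration is actually implemented: the class name RepeatedKeySketch and its parse method are hypothetical, and the parser ignores includes, escapes and line continuations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of DSpace or Commons Configuration.
final class RepeatedKeySketch {

    // Collect every "key = value" line. A key that appears more than once
    // accumulates its values in file order instead of overwriting them.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> settings = new LinkedHashMap<>();
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not a key/value line in this simplified sketch
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            settings.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return settings;
    }

    public static void main(String[] args) {
        String key = "plugin.sequence.org.dspace.authenticate.AuthenticationMethod";
        String cfg = key + " = org.dspace.authenticate.LDAPAuthentication\n"
                   + key + " = org.dspace.authenticate.PasswordAuthentication\n";
        // Prints both authentication classes, in the order they were listed.
        System.out.println(parse(cfg).get(key));
    }
}
```

The order of the lines is preserved, which matters for settings such as the authentication sequence, where the order of the entries is meaningful.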

Building / Installing DSpace

With the Enhanced Configuration Scheme, the DSpace build process is slightly changed. The build.properties file no longer exists and therefore has no effect on the build process.

Here are the basics of building/installing DSpace:

  • Download DSpace (as normal)
  • cd [dspace-source]
  • Create your own initial local.cfg configuration file
    • cp local.cfg.EXAMPLE local.cfg
  • The following fields MUST be specified in your local.cfg in order to install DSpace:
    • dspace.dir
    • database connection information (db.url, etc.)
  • Build/Compile/Install as normal
    • mvn clean package
    • ant fresh_install (or ant update)
  • Once DSpace is installed, your local.cfg will be copied over to your [dspace.dir]/config/ location.  At that time you can optionally tweak it further (see local.cfg documentation below)

Unlike the old build.properties, the new local.cfg has NO effect on the Maven build process.

It is ONLY used by Ant to determine the location where DSpace should be installed/updated (using dspace.dir), and also to initialize/update the database (using db.* settings).

local.cfg

The [dspace.dir]/config/local.cfg file is the new way to customize your DSpace configuration based on your local needs.

There are a few key things to note about this configuration file:

  • Any setting in your local.cfg will automatically OVERRIDE a setting of the same name in the dspace.cfg or any modules/*.cfg file.  This also means that you can copy ANY configuration (from dspace.cfg or any modules/*.cfg file) into your local.cfg to specify a new value.
    • For example, specifying dspace.url in local.cfg will override the default value of dspace.url in dspace.cfg.
    • Also, specifying oai.solr.url in local.cfg will override the default value of oai.solr.url in config/modules/oai.cfg
  • The local.cfg file is an Apache Commons Configuration Property file. For more information see the Commons Config Properties File documentation
    • This means it has enhanced features like the ability to include other config files (via "include=" statements).
  • As needed, you also are able to OVERRIDE settings in your local.cfg by specifying them as System Properties or Environment Variables.
    • For example, if you wanted to change your dspace.dir in development/staging environment, you could specify it as a System Property (e.g. -Ddspace.dir=[new-location]). This new value will override any value in both local.cfg and dspace.cfg.

An example local.cfg is provided at [dspace-source]/local.cfg.EXAMPLE. The example only provides a few key configurations which all DSpace sites are likely to need to customize. However, you may add (or remove) any other configuration to your local.cfg to customize it as you see fit.

config-definition.xml

The [dspace.dir]/config/config-definition.xml file defines the Apache Commons Configuration settings that DSpace utilizes by default. It is a valid "configuration definition" file as defined by Apache Commons Configuration. See the Configuration File Documentation for more details.

Link to config-definition.xml on Tim's DS-2654 branch: https://github.com/tdonohue/DSpace/blob/DS-2654-common-config/local.cfg.EXAMPLE

You are welcome to modify config-definition.xml to tailor your local configuration scheme as you see fit. Any changes to this file will require restarting your servlet container (e.g. Tomcat).

By default, the DSpace config-definition.xml file defines the following configuration:

  • All DSpace configurations are loaded via Properties files
    • Note: Apache Commons Configuration does support other configuration sources, such as XML configurations or database configurations (see its Overview documentation)
  • Configuration Files/Sources: By default, only two configuration files are loaded into Apache Commons Configuration:
    • local.cfg (see documentation on local.cfg above)
    • dspace.cfg (NOTE, however, that all modules/*.cfg files are loaded by dspace.cfg via "include=" statements at the end of that configuration file)
  • Configuration Override Scheme: The configuration override scheme is defined as follows. Configurations specified in earlier locations will automatically override any later values:
    • System Properties (-D[setting]=[value]) override all other options
    • Environment Variables
    • local.cfg
    • dspace.cfg (and all modules/*.cfg files) contain the default values for all settings
  • Configuration Auto-Reload: By default, all configuration files are automatically checked each minute for changes. If they have changed, they are automatically reloaded.
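The override scheme above amounts to "the first source that defines a setting wins". The following is a minimal, JDK-only sketch of that lookup order, assuming each source can be modelled as a simple map. The class name OverrideOrderSketch, the resolve method and all values are hypothetical illustrations; the real behaviour comes from Apache Commons Configuration as configured in config-definition.xml.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of DSpace or Commons Configuration.
final class OverrideOrderSketch {

    // Return the value from the first source that defines the key.
    // Sources must be listed from highest to lowest priority.
    static String resolve(String key, List<Map<String, String>> sources) {
        for (Map<String, String> source : sources) {
            String value = source.get(key);
            if (value != null) {
                return value;
            }
        }
        return null; // not defined in any source
    }

    public static void main(String[] args) {
        // Illustrative values only. A real setup would read the JVM's system
        // properties, the process environment, local.cfg and dspace.cfg.
        Map<String, String> systemProperties = new LinkedHashMap<>();
        Map<String, String> environment = new LinkedHashMap<>();
        Map<String, String> localCfg = new LinkedHashMap<>();
        Map<String, String> dspaceCfg = new LinkedHashMap<>();

        dspaceCfg.put("dspace.dir", "/dspace");
        dspaceCfg.put("dspace.url", "http://localhost:8080/xmlui");
        localCfg.put("dspace.dir", "/opt/dspace");
        systemProperties.put("dspace.dir", "/tmp/dspace-staging");

        List<Map<String, String>> sources =
                Arrays.asList(systemProperties, environment, localCfg, dspaceCfg);

        // dspace.dir is taken from the system properties map,
        // dspace.url falls through to the dspace.cfg default.
        System.out.println(resolve("dspace.dir", sources));
        System.out.println(resolve("dspace.url", sources));
    }
}
```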

Configuration Reloading and Caching

As noted above, by default, DSpace will now automatically reload any modified configuration file (local.cfg, dspace.cfg or modules/*.cfg) within one minute.

While the new values are immediately available within the DSpace ConfigurationService, some configurations may still be "cached" within UI-specific code. This often occurs when a UI (or API) loads a configuration value into a static variable, or otherwise implements/provides its own object caching mechanism.

The Enhanced Configuration Scheme codebase does NOT attempt to correct all these instances of caching within UIs or APIs. This would require individual configurations to be tested and any caching mechanisms to be removed.
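The caching problem described above can be shown with a small, JDK-only sketch. Everything in it is hypothetical: LIVE_CONFIG stands in for a reloadable configuration store, and example.page.size is an invented setting name. It is not DSpace code, only an illustration of why a value copied into a static variable does not see a later reload.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class; not part of DSpace.
final class CachingPitfallSketch {

    // Stand-in for a live configuration store whose values may change at runtime.
    static final Map<String, String> LIVE_CONFIG = new ConcurrentHashMap<>();

    static {
        LIVE_CONFIG.put("example.page.size", "20");
    }

    // Anti-pattern: the value is copied once, when this class is initialized.
    static final class CachedReader {
        private static final String PAGE_SIZE = LIVE_CONFIG.get("example.page.size");

        static String pageSize() {
            return PAGE_SIZE;
        }
    }

    // Reload-friendly: the value is looked up on every call.
    static final class LiveReader {
        static String pageSize() {
            return LIVE_CONFIG.get("example.page.size");
        }
    }

    public static void main(String[] args) {
        System.out.println("cached before: " + CachedReader.pageSize());
        LIVE_CONFIG.put("example.page.size", "50"); // simulates a reloaded config file
        System.out.println("cached after:  " + CachedReader.pageSize()); // still the old value
        System.out.println("live after:    " + LiveReader.pageSize());   // sees the new value
    }
}
```

Code written like CachedReader would need a servlet container restart (or its own refresh logic) to pick up a changed setting, whereas code written like LiveReader benefits from automatic reloading.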

Advanced Topics

Configuration Interpolation

This is less important to normal users of DSpace, but may be of high interest to developers and some system administrators.

It's important to be aware of the fact that variables within the following types of configurations are now AUTOMATICALLY interpolated at runtime using Apache Commons Configuration (and our ConfigurationService). This means that variables (${setting}) are no longer filtered by Maven or Ant for any of the following configuration types:

  • Configuration files (namely local.cfg, dspace.cfg and modules/*.cfg)
  • Log4j settings (namely log4j.properties)
  • Spring XML configs (namely [dspace.dir]/config/spring/api/*.xml)

There is only one remaining file type which still requires its configurations/settings to be filtered/interpolated manually:

  • All web.xml files unfortunately still need to have their ${dspace.dir} variable filtered (by Ant). This is because the dspace.dir context parameter in these web.xml files is used to initialize the DSpace Kernel (and tell the webapp where the DSpace home directory is). Unfortunately, there's no way to interpolate this value at runtime as the dspace.dir value does not exist until the Kernel and the ConfigurationService have initialized.
    • The only way we'd get around this problem would be to REQUIRE a dspace.dir ALWAYS be specified to the servlet container (as a Context parameter and/or system property).
    • In other words, the DSpace webapps cannot function/initialize without a dspace.dir.  We either need to filter a value for it (during ant update/fresh_install), or we need to REQUIRE that it be specified by other means.
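To make "interpolated at runtime" concrete, here is a minimal, JDK-only sketch of resolving ${setting} references at lookup time rather than at build time. The class name InterpolationSketch, its methods and the recursion limit are assumptions made for illustration; Apache Commons Configuration has its own, more capable implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of DSpace or Commons Configuration.
final class InterpolationSketch {

    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Look up a key and expand any ${other.key} references in its value.
    static String get(Map<String, String> raw, String key) {
        String value = raw.get(key);
        return value == null ? null : expand(raw, value, 0);
    }

    private static String expand(Map<String, String> raw, String value, int depth) {
        if (depth > 10) {
            throw new IllegalStateException("Possible circular reference in: " + value);
        }
        Matcher m = VARIABLE.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String referenced = raw.get(m.group(1));
            String replacement = referenced == null
                    ? m.group(0)                          // leave unknown ${...} untouched
                    : expand(raw, referenced, depth + 1); // referenced values may nest
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative settings; the stored values keep their ${...} placeholders.
        Map<String, String> raw = new HashMap<>();
        raw.put("dspace.dir", "/dspace");
        raw.put("log.dir", "${dspace.dir}/log");

        System.out.println(get(raw, "log.dir"));

        // Changing dspace.dir changes every setting derived from it,
        // without rebuilding or editing any other entry.
        raw.put("dspace.dir", "/opt/dspace");
        System.out.println(get(raw, "log.dir"));
    }
}
```

Because the stored value still contains the placeholder, changing one key setting is enough, which is the property the old build-time filtering lacked.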

Java API Changes

ConfigurationManager vs ConfigurationService

In the DSpace 5 Java API, we had two types of Configuration objects: org.dspace.core.ConfigurationManager and org.dspace.services.ConfigurationService.

While the ConfigurationManager still exists in the API (and is still called by some areas of the codebase), it is now a "wrapper" object. It simply wraps calls to the configured ConfigurationService.

As before, the default ConfigurationService is the org.dspace.servicemanager.config.DSpaceConfigurationService (in dspace-services).

The DSpaceConfigurationService has been updated/enhanced to utilize Apache Commons Configuration, and to better align its methods with the old ConfigurationManager class. It also adds a new reloadConfig() method, which can be called on demand to reload all configurations.
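The "wrapper" relationship described above can be pictured with the following JDK-only sketch. All names in it (ConfigService, InMemoryConfigService, LegacyConfigFacade, example.setting) are hypothetical stand-ins and are NOT the real DSpace classes or their signatures. It only illustrates the pattern of a static, legacy-style entry point that holds no configuration state and forwards every call to a service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class; not part of DSpace.
final class WrapperSketch {

    // Hypothetical, minimal service interface (not the real ConfigurationService).
    interface ConfigService {
        String getProperty(String key);
    }

    // Hypothetical in-memory implementation standing in for the default service.
    static final class InMemoryConfigService implements ConfigService {
        private final Map<String, String> values = new ConcurrentHashMap<>();

        void set(String key, String value) {
            values.put(key, value);
        }

        @Override
        public String getProperty(String key) {
            return values.get(key);
        }
    }

    // Legacy-style static facade: it stores no settings itself, so callers
    // always see whatever the underlying service currently returns.
    static final class LegacyConfigFacade {
        private static volatile ConfigService service;

        static void setService(ConfigService configured) {
            service = configured;
        }

        static String getProperty(String key) {
            return service.getProperty(key);
        }
    }

    public static void main(String[] args) {
        InMemoryConfigService service = new InMemoryConfigService();
        LegacyConfigFacade.setService(service);

        service.set("example.setting", "first");
        System.out.println(LegacyConfigFacade.getProperty("example.setting"));

        // A change in the service is visible through the facade immediately.
        service.set("example.setting", "second");
        System.out.println(LegacyConfigFacade.getProperty("example.setting"));
    }
}
```

Under this pattern, older code that still calls the static entry point picks up reloaded values without modification, since the facade never keeps its own copy.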

PluginManager vs PluginService

In DSpace 5, the org.dspace.core.PluginManager class managed all DSpace "plugin" definitions (i.e. plugin.* settings in dspace.cfg). (SIDENOTE: these DSpace "plugin" definitions are simply Java interfaces, which are then mapped to classes which implement that plugin interface).

While this concept still exists, the PluginManager itself has been entirely replaced by a new org.dspace.core.service.PluginService.

The default PluginService is a new org.dspace.core.LegacyPluginServiceImpl class, which implements the functionality of the old PluginManager.

 
