...

DSpace is implemented as a set of Maven 2 projects with a parent-child relationship and a module that manages the final assembly of the project. Due to the specifics of DSpace (database dependency, file system dependency) we need to set up the test environment before running any tests. While splitting DSpace into projects is good design and desirable, it means we have to replicate the configuration settings for each project, which makes the setup much less maintainable.

There is another issue related to the way Maven works. Maven defines a packaging type for each project (jar, war or pom). Pom projects can contain subprojects, but their lifecycle skips all the test steps. This means that even if a pom project would be the ideal place for the tests, they would not be run, and we can't force Maven to run them by any means.

...

We found that some methods and/or classes have a dependency on the file system of DSpace. This means we have to replicate the file system, including some configuration files, before being able to run the tests.

...

The solution has been to duplicate the file system in the dspace-test project. This replica is copied to the temp folder (a folder designated by the tester via a configuration file) before launching the tests. Once the tests finish, the files are removed. This is not an ideal solution, as it requires the tester to duplicate files, but it is a workaround while we find a definitive solution.
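The copy-then-remove cycle described above can be sketched with the standard java.nio.file API. This is only an illustration under assumptions: the class name TestFileSystem and its method names are hypothetical and are not taken from the dspace-test code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper: copies a replica tree into a test folder and removes it afterwards. */
final class TestFileSystem {

    private TestFileSystem() { }

    /** Recursively copies every directory and file under source into target. */
    static void copyTree(Path source, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(from -> {
                Path to = target.resolve(source.relativize(from).toString());
                try {
                    if (Files.isDirectory(from)) {
                        Files.createDirectories(to);
                    } else {
                        Path parent = to.getParent();
                        if (parent != null) {
                            Files.createDirectories(parent);
                        }
                        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Deletes the tree rooted at root, visiting children before their parents. */
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

In a test class, copyTree would typically be called once before the tests start and deleteTree once after they finish, so that a failed run does not leave stale files for the next one.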

...

DSpace depends heavily on the file dspace.cfg for its configuration. For testing purposes we have crafted a test version of this file, available in the resources folder of dspace-test. This file is loaded during setup instead of the default dspace.cfg so the environment is set for unit testing.

As the assembly process is run later than the test goal in Maven, we can't use external profiles to replace values. This means the values in this file are hard-coded and might need to be changed for a specific system. The way it is set up by default, it should work on all *nix systems, as it sets the /tmp/ folder as the temporary test folder. If this has to be changed, the file test-config.properties will also need to be updated.
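As an illustration of how a test could pick up the folder from test-config.properties, here is a minimal sketch using java.util.Properties. The class name TestConfig and the property key test.folder are assumptions made for this example; check the actual file for the real key names.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

/** Hypothetical reader for the test settings. */
final class TestConfig {

    /** Assumed property key; the real key in test-config.properties may differ. */
    static final String TEST_FOLDER_KEY = "test.folder";

    /** Default folder mentioned in the text for *nix systems. */
    static final String DEFAULT_TEST_FOLDER = "/tmp/";

    private final Properties properties = new Properties();

    TestConfig(Reader source) throws IOException {
        properties.load(source);
    }

    /** Returns the configured test folder, or /tmp/ when the key is absent. */
    String testFolder() {
        return properties.getProperty(TEST_FOLDER_KEY, DEFAULT_TEST_FOLDER).trim();
    }
}
```

Falling back to /tmp/ when the key is missing mirrors the default behaviour described above, while still letting a tester on another system override the folder in one place.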

...

We found that many classes have a direct dependency on the database. Specifically, there is strong coupling with both Oracle and PostgreSQL, with many pieces of code being executed or not depending on the database used. Mocking the connections is not easy due to the heavy use of default-access constructors and relations between classes that do not follow the Law of Demeter. This means we need a database to run the tests, something not really desirable but required.

While the perfect solution would be to migrate DSpace to an ORM like Hibernate, there is no time to do so in this project, and it would be too much of a change to add to the source. The decision has been made to use an in-memory database where the DSpace database will be recreated for the purpose of unit testing. Once the tests are finished, the database will be wiped and closed.

...

  • removal of the function that provides the next value of a sequence
  • removal of the clause "WITH TIME ZONE" from TIMESTAMP values
  • removal of DEFAULT NEXTVAL('<seq>') constructs due to incompatibility. DatabaseManager has been changed to add the proper ID to the column. It has been proposed to change the affected values to IDENTITY values, which include autoincrement.
  • removal of UNIQUE constructs due to incompatibility. Tests will need to verify uniqueness
  • replacement of BIGSERIAL by BIGINT
  • replacement of getnextid by NEXTVAL on an INSERT for epersongroup
  • due to the parsing process some spaces have been added at the start of some lines to avoid syntax errors
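The text does not say how these changes were applied; they may well have been made by hand. Purely as an illustration, some of the rewrites listed above could be expressed as regular-expression substitutions. The class SchemaRewriter is hypothetical, and the UNIQUE rule only handles inline column constraints (UNIQUE directly followed by a comma or closing parenthesis), not table-level constraints.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: applies some of the listed schema changes to a DDL string. */
final class SchemaRewriter {

    private static final int CI = Pattern.CASE_INSENSITIVE;

    private static final Pattern WITH_TIME_ZONE =
            Pattern.compile("\\s+WITH\\s+TIME\\s+ZONE", CI);
    private static final Pattern DEFAULT_NEXTVAL =
            Pattern.compile("\\s+DEFAULT\\s+NEXTVAL\\s*\\(\\s*'[^']*'\\s*\\)", CI);
    private static final Pattern BIGSERIAL =
            Pattern.compile("\\bBIGSERIAL\\b", CI);
    // Matches only an inline UNIQUE that ends a column definition.
    private static final Pattern INLINE_UNIQUE =
            Pattern.compile("\\s+UNIQUE(?=\\s*[,)])", CI);

    private SchemaRewriter() { }

    static String rewrite(String ddl) {
        String out = WITH_TIME_ZONE.matcher(ddl).replaceAll("");
        out = DEFAULT_NEXTVAL.matcher(out).replaceAll("");
        out = BIGSERIAL.matcher(out).replaceAll("BIGINT");
        out = INLINE_UNIQUE.matcher(out).replaceAll("");
        return out;
    }
}
```

A rewrite like this only covers the mechanical part of the list. The changes that move logic elsewhere, such as DatabaseManager assigning IDs, still have to be made in the code.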

Because H2 requires column names in capital letters, the database is defined as an Oracle database for DSpace (db.name) and the Oracle compatibility mode of H2 has been enabled.

...

As a note, the use of a DDL language via DDLUtils was tested and discarded due to multiple issues. The code base of DDLUtils is ancient and not compatible with H2. This required us to use HSQLDB, which in turn required us to change some table definitions due to syntax incompatibilities. We also discovered that DDLUtils wasn't generating a correct DDL file: it did not provide relevant meta-information such as which column of a table was a primary key, or the existing sequences. Because DatabaseManager relies on this meta-information, some methods were broken and returned wrong values. It seems that more recent code is available from the project SVN, but this code can't be retrieved from Maven repositories. That would make unit testing in DSpace much more cumbersome, as the developer would have to download the code, compile it and store it in the local repository before being able to run a test. A lot of effort has been put into using the DDL, but in the end we feel using the database_schema.sql file is better.

...

In this project we want to enable unit tests, integration tests and functional tests for DSpace. Maven 2 has a non-modifiable lifecycle that doesn't allow us to run tests once the project has been packaged. This same lifecycle doesn't allow us to launch an embedded server like Jetty to run the functional tests.

...

This option is not optimal, but due to the limitations imposed by the DSpace system and Maven we have not found a better solution. Any proposals are appreciated.

...

We can consider two types of classes when developing the unit tests: classes which have a dependency on the database and classes that don't. The classes that don't can be tested easily, using standard procedures and tests. Our main problem is the classes tightly coupled with the database and its helper objects, like BitstreamFormat or the classes that inherit from DSpaceObject. To run the unit tests we need a database, but we don't want to set up a standard PostgreSQL instance. Our decision is to use an in-memory database to emulate PostgreSQL.

To achieve this we mock DatabaseManager and replace the connector to point to our in-memory database. In this class we also initialise the replica with the proper data.

...

Regarding the implementation, several objects only offer a hidden constructor and a factory method to create an instance of the object. This means we effectively have to create them using the available factory methods. Other specifics have been discussed above, like:

  • Usage of a temporary file system
  • Usage of an in-memory database (H2)
  • Mocking the DatabaseManager class
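The hidden-constructor pattern mentioned above can be sketched as follows. FakeContext and SampleObject are hypothetical stand-ins invented for this example, not DSpace classes; they only show why a test has to go through the factory method.

```java
/** Hypothetical stand-in for the object that hands out identifiers. */
final class FakeContext {
    private int nextId = 1;

    int allocateId() {
        return nextId++;
    }
}

/** Hypothetical domain object whose constructor is hidden from callers. */
final class SampleObject {
    private final int id;

    // Not reachable from tests: instances must come from create().
    private SampleObject(int id) {
        this.id = id;
    }

    /** Factory method: the only way to obtain an instance. */
    static SampleObject create(FakeContext context) {
        return new SampleObject(context.allocateId());
    }

    int getId() {
        return id;
    }
}
```

Because the constructor is private, a test cannot build an instance with arbitrary state. It has to call create() and then work with whatever the factory produces, which is why the factory's own dependencies must be available in the test environment.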

...

During development the following issues have been detected in the code, which make unit testing harder and impact the maintainability of the code:

* Hidden dependencies. Many objects require other objects (like DatabaseManager) but the dependency is never explicitly declared or set. These dependencies should be supplied as parameters in the constructors or factory methods.
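To illustrate the recommendation about hidden dependencies, here is a minimal sketch of a class that receives its collaborator through the constructor. NameLookup and FormatDescriber are hypothetical names invented for this example and do not exist in DSpace.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical abstraction over whatever performs the lookup (a database in production). */
interface NameLookup {
    Optional<String> findName(int id);
}

/** Hypothetical class whose dependency is declared explicitly in the constructor. */
final class FormatDescriber {
    private final NameLookup lookup;

    FormatDescriber(NameLookup lookup) {
        this.lookup = Objects.requireNonNull(lookup, "lookup");
    }

    /** Builds a short label for the given id, using the injected lookup. */
    String describe(int id) {
        return lookup.findName(id)
                .map(name -> id + ": " + name)
                .orElse(id + ": unknown");
    }
}
```

With the dependency visible in the constructor, a test can pass a stub (for example a lambda that returns a fixed name) and exercise describe() without any database.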

...

Integration tests use the same structure as unit tests. A class called AbstractIntegrationTest has been created that inherits from AbstractUnitTest. This provides the integration tests with the same temporary file system and in-memory database as the unit tests. AbstractIntegrationTest exists in case we need some extra scaffolding for these tests. All integration tests should inherit from it, both to distinguish themselves from unit tests and in case we require specific changes for them.

...