You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 13 Next »

Introduction

This document describes the technical aspects of the testing integrated into Dspace. In it we describe the tools used as well as how to use them and the solutions applied to some issues found during development. It's intended to serve as a reference for the community so more test cases can be created.

Note the document is a work in progress and will change as the implementation evolves.

Issues

During implementation we found several issues, which are described in this section, along the solution implemented to work around them. 

Multiple Maven projects

DSpace is implemented as a set of Maven 2 projects with a parent-child relationship and a module that manages the final assembly of the project. Due to the specifics of DSpace (database dependency, file system dependency) we need to set up the test environment before running any tests. While the fragmentation in projects of DSpace is good design and desirable, that means we have to replicate the configuration settings for each project, making it much less maintainable.

There is another issue related to the way Maven works. Maven defines a type of package for each project (jar, war or pom). Pom projects can contain subprojects, but their lifecycle skips all the test steps. This means that even if a Pom project would be ideal to place the tests, they would be not be run and we can't force Maven to run them by any means.

The perfect solution would be to run the tests in the Assembly project, but due to the mentioned limitations (it is a Pom project) this can't work.

To solve this issue we have created a new project, dspace-test, which will contain only unit tests. It has dependencies on the projects being tested and all the settings required to initialize DSpace. All tests must be added to this project, as it is the only one with the proper dependencies.

File system dependency

We found that some methods and/or classes have a dependency on the file system of DSpace. This means we have to replicate the file system, including some configuration files, before being able to run the tests.

The main issue is that the existing files we need are in the Assembly project, but we can't run the tests there. Also, we can't use an assembly to copy the files as the test phase runs before any packaging or install phase in Maven.

The solution has been to duplicate the file system in the dspace-test project. This replica is copied to the temp folder (a folder designed by the tester via a configuration file) before launching the tests. Once the tests finish, the files are removed. This is not an ideal solution as requires tester to duplicate files, but is a workaround while we find a definite solution.

The files are stored under the folder dspaceFolder in the resources folder. All the contents of this folder will be copied to the temporal area. The file test-config.properties contains settings to specify the temporal folder and other test values.

File configuration dependency

Dspace heavily depends on the file dspace.cfg for its configuration. For testing purposes we have crafted a test version of this file, available in the resources folder of dspace-test. This file is loaded during setup instead of the default dspace.cfg so the environment is set for unit testing.

As the assembly process is run later than the test goal in Maven, we can't use external profiles to replace values. This means the values in this file are hard-coded and might need to be changed for a specific system. The way it is set up by default, it should work on all *nix systems as it sets the /tmp/ folder as the temporal test folder. If this has to be changed, the file test-config.properties will also need to be updated.

Database dependency

We found that many classes have a direct dependency to the database. Specifically there is a huge coupling with both Oracle and PostgreSQL, with many pieces of code being executed or not depending on the database used. Mocking the connections is not easy due to the heavy use of default-access constructors and relations between classes that are not following Demeter's Law.This means we need a database to run the tests, something not really desirable but required.

While the perfect solution would be to migrate DSpace to an ORM like Hibernate, there is not time to do so in this project, and this would be too much of a change to add to the source. The decision has been made to use an in-memory database where the DSpace database will be recreated for the purpose of unit testing. Once the tests are finished, the database will be wiped and closed.

Taking in account the coupling with Oracle/PostgreSQL and the need to recreate the database, the solution has been to create a mock of the DatabaseManager class. This mock implements the exact same functionality of the base class (being a complete copy) and adds some initialization code to reconstruct the tables and default contents.

The database used is H2 . The information is stored in a copy of the database_schema.sql file for PostgreSQL with the following modifications:

  • removal of the function that provides the next value of a sequence
  • removal of clause "WITH TIME ZONE" from TIMESTAMP values
  • removal of DEFAULT NEXTVAL('<seq>') constructs due to incompatibility. DatabaseManager has been changed to add the proper ID to the column. Proposed to change the affected valued to IDENTITY values, that include autoincrement.
  • removal of UNIQUE constructs due to incompatibility. Tests will need to verify uniqueness
  • replaced BIGSERIAL by BIGINT
  • replacing getnextid for NEXTVAL on an INSERT for epersongroup
  • due to the parsing process some spaces have been added at the start of some lines to avoid syntax errors

Due to H2 requiring the column names in capital letters the database is defined as an Oracle database for DSpace (db.name) and the Oracle compatibility mode for H2 has been enabled.

The code in the mock DatabaseManager has been changed so the queries are compatible with H2. The changes are minimal as H2 is mostly compatible with PostgreSQL and Oracle.

As a note, the usage of a DDL language via DDLUtils has been tested and discarded due to multiple issues. The code base of DDL Utils is ancient, and not compatible with H2. This required us to use HSQLDB, which in turn required us to change some tables definitions due to syntax incompatibilities. Also, we discovered DDL Utils wasn't generating a correct DDL file, not providing relevant meta-information like which column of a table was a primary key or the existing sequences. Due to the reliance of DatabaseManager on this meta-information, some methods were broken, giving wrong values. It seems that more recent code is available from the project SVN, but this code can't be recovered from Maven repositories, which would make much more cumbersome the usage of unit testing in DSpace as the developer would be required to download the code, compile it and store it in the local repository before being able to do a test. A lot of effort has been put to use the DDL, but in the end we feel using the database_schema.sql file is better.

Different tests

In this project we want to enable unit tests, integration tests and functional tests for DSpace. Maven 2 has a non-modifiable life-cycle that doesn't allow us to run tests once the project has been packaged.This same life-cycle doesn't allow us to launch an embedded server like Jetty to run the functional tests.

The solution to this is to create 2 infrastructures, one for the unit and integration tests and one for the functional tests. Unit and integration tests will be run by Maven using the Surefire plugin. Functional tests will be run once the program has been build from an Ant task. This will allow us to launch an embedded server to run the tests.

This option is not optimal, but due to the limitations imposed by DSpace system and Maven we have not find a better solution. Any proposals are appreciated.

The unit and integration tests solution has been implemented in the dspace-test project.

The functional tests implementation is in process.

Common Tools

There is a set of tools used by all the tests. These tools will be described in this section.

Maven

The build tool for DSpace, Maven, will also be used to run the tests. For this we will use the Surefire plugin, which allows us to launch automatically tests included in the "test" folder of the project. We also include the Surefire-reports plugin in case you are not using a Continous Integration environment that can read the output and generate the reports.

The plugin has been configured to ignore test files whose name starts with "Abstract", that way we can create a hierarchy of classes and group common elements to various tests (like certain mocks or configuration settings) in a parent class.

Tests in Maven are usually added into src/test, like in src/test/java/<package> with resources at src/test/resources.

To run the tests execute:

mvn test

The tests will also be run during a normal Maven build cycle. To skip the tests, run Maven like:

mvn package -Dmaven.test.skip=true

By default we will disable running the tests, as they might slow the compilation cycle for developers. They can be activated using the command

mvn package -Dmaven.test.skip=true

or by changing the property "activeByDefault" at the corresponding profile (skiptests) in the main pom.xml file, at the root of the project.

JUnit

JUnit is a testing framework for Java applications. It was one of the first testing frameworks for Java and it's a widespread use in the community. The framework simplifies the development of unit tests and the current IDE's make even easier building those tests from existing classes and running them.

Junit 4.8.1 is added as a dependency in the parent project. The dependency needs to be propagated to the subprojects that contain tests to be run.

As of JUnit 4.4, Harmcrest is included. Harmcrest is a library of matcher objects that facilitate the validation of conditions in the tests.

JMockit

JMockit is popular and powerful mocking framework. Unlike other mocking frameworks it can mock final classes and methods, static methods, constructors and other code fragments that can't be mocked using other frameworks.

JMockit 0.998 has been added to the project to provide a mocking framework to the tests.

Unit Tests Implementation

These are tests which test just how one object works. Typically test each method on an object for expected output in several situations. They are executed exclusively at the API level.

We can consider two types of classes when developing the unit tests: classes which have a dependency on the database and classes that don't. The classes that don't can be tested easily, using standard procedures and tests. Our main problem are classes tightly coupled with the database and its helper objects, like BitstreamFormat or the classes that inherit from DSpaceObject. To run the unit tests we need a database but we don't want to set up a standard PostgreSQL instance. Our decision is to use an in-memory database that will be used to emulate PostgreSQL.

To achieve this we mock DatabaseManager and we replace the connector to point to our in-memory database. In this class we also initialise the replica with the proper data.

Structure

There is a base class called "AbstractUnitTest". This class contains a series of mocks and references which are necessary to run the tests in DSpace, like mocks of the DatabaseManager object. All Unit Tests should inherit this class, located under the package "org.dspace" in the test folder of DSpace-test.

About the implementation, several objects only offer a hidden constructor and a factory method to create an instance of the object. This means we effectively have to create them using available factory methods. Other specifics have been commented above, like:

  • Usage of a temporal file system
  • Usage of an in-memory database (h2)
  • Mocking the DatabaseManager class

Code Issues

During the development the following issues have been detected in the code, which make Unit Testing harder and impact the maintainability of the code:

* Hidden dependencies. Many objects require other objects (like DatabaseManager) but the dependency is never explicitly declared or set. These dependencies should be fulfilled as parameters in the constructors or factory methods.

* Hidden constructors. It would be advisable to have public constructors in the objects and provide a Factory class that manages the instantiation of all required entities. This would facilitate testing and provide a separation of concerns, as the inclusion of the factory methods inside objects usually adds hidden dependencies (see previous point).

Refactoring would be required to fix these issues.

Integration Tests

These tests work at the API level and test the interaction of components within the system. Some examples, placing an item into a collection, or creating a new metadata schema and adding some fields. Primarily these tests operate at the API level ignoring the interface components above it.

Integration tests use the same structure as Unit tests. A class has been created, called AbstractIntegrationTest, that inherits from AbstractUnitTest. This provides the integration tests with the same temporal file system and in-memory database as the unit tests. The class AbstractIntegrationTest is created just in case we may need some extra scaffolding for these tests. All integration tests should inherit from it to both distinguish themselves from unit tests and in case we require specific changes for them.

The main difference between these and the unit tests is in the test implemented, not in the infrastructure required, as these tests will use several classes at once to emulate a user action.

Functional Tests

These are tests which come from user-based use cases. Such as a user wants to search DSpace to find material and download a pdf. Or something more complex like a user wants to submit their thesis to DSpace and follow it through the approval process. These are stories about how users would preform tasks and cover a wide array of components within Dspace.

Thanks

This page has been created with help from Stuart Lewis, Scott Phillips and Gareth Waller. I want to thank them all for their comments. Some information has been taken from Wikipedia to make the text more complete. I'm to blame for errors in the text.

Feel free to contribute to this page!

  • No labels