Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: fix typo

...

DSpace uses a relational database to store all information about the organization of content, metadata about the content, information about e-people and authorization, and the state of currently-running workflows. The DSpace system also uses the relational database in order to maintain indices that users can browse.

Warning
title1.8 Schema

This Database schema is not fully up to date with DSpace 4.0. 2 additional table to store item level versioning information were added to DSpace 3 are currently not represented in this diagram.

DSpace 5 database schema (Postgres). Click on the thumbnail, then right-click the image and choose "Save as" to save in full resolution. Instructions on updating this schema diagram are in How to update database schema diagram.

DSpace 5 database schemaImage AddedImage Removed

Most of the functionality that DSpace uses can be offered by any standard SQL database that supports transactions. However at this time, DSpace APIS use some features specific to PostgreSQL and Oracle, so some modification to the code would be needed before DSpace would function fully with an alternative database back-end.

...

The database schema used by DSpace is initialized and upgraded automatically using Flyway DB. The DatabaseUtils class manages all Flyway API calls, and executes the SQL migrations under the org.dspace.storage.rdbms.sqlmigration package and the Java migrations under the org.dspace.storage.rdbms.migration package.  While Flyway is automatically initialized and executed during the initialization of DatabaseManager, various Database Utilities are also available on the command line.

New in DSpace 5.

...

x: Metadata for all DSpace objects

There were several changes between the DSpace 4 and 5 database schema, related to the new "Metadata for all Dspace objects" improvements. Full detail can be found here:

Metadata for all DSpace objects 

Maintenance and Backup

When using PostgreSQL, it'

Maintenance and Backup

When using PostgreSQL, it's a good idea to perform regular 'vacuuming' of the database to optimize performance. By default, PostgreSQL performs automatic vacuuming on your behalf.  However, if you have this feature disabled, then we recommend scheduling the vacuumdb command to run on a regular basis.

...

The database manager is configured with the following properties in dspace.cfg:

...

db.url

...

The JDBC URL to use for accessing the database. This should not point to a connection pool, since DSpace already implements a connection pool.

...

db.driver

...

JDBC driver class name. Since presently, DSpace uses PostgreSQL-specific features, this should be org.postgresql.Driver.

...

db.username

...

Username to use when accessing the database.

the following properties in dspace.cfg:

db.url

The JDBC URL to use for accessing the database. This should not point to a connection pool, since DSpace already implements a connection pool.

db.driver

JDBC driver class name. Since presently, DSpace uses PostgreSQL-specific features, this should be org.postgresql.Driver.

db.username

Username to use when accessing the database.

db.password

Corresponding password ot use when accessing the database.

Custom RDBMS tables, colums or views

When at all possible, we recommend creating custom database tables or views within a separate schema from the DSpace database tables. Since the DSpace database is initialized and upgraded automatically using Flyway DB, the upgrade process may stumble or throw errors if you've directly modified the DSpace database schema, views or tables.  Flyway itself assumes it has full control over the DSpace database schema, and it is not "smart" enough to know what to do when it encounters a locally customized database.

That being said, if you absolutely need to customize your database tables, columns or views, it is possible to create custom Flyway migration scripts, which should make your customizations easier to manage in future upgrades.  (Keep in mind though, that you may still need to maintain/update your custom Flyway migration scripts if they ever conflict directly with future DSpace database changes. The only way to "future proof" your local database changes is to try and make them as independent as possible, and avoid directly modifying the DSpace database schema as much as possible.)

If you wish to add custom Flyway migrations, they may be added to the following locations:

  • Custom Flyway SQL migrations may be added anywhere under the org.dspace.storage.rdbms.sqlmigration package (e.g. [src]/dspace-api/src/main/resources/org/dspace/storage/rdbms/sqlmigration or subdirectories)
  • Custom Flyway Java migrations may be added anywhere under the org.dspace.storage.rdbms.migration package (e.g. [src]/dspace-api/src/main/java/org/dspace/storage/rdbms/migration/ or subdirectories)
  • Additionally, for backwards support, custom SQL migrations may also be placed in the [dspace]/etc/[db-type]/ folder (e.g. [dspace]/etc/postgres/ for a PostgreSQL specific migration script)

Adding Flyway migrations to any of the above location will cause Flyway to auto-discover the migration. It will be run in the order in which it is named. Our DSpace Flyway script naming convention follows Flyway best practices and is as follows:

  •  SQL script names: V[version]_[date]__[description].sql
    • E.g. V5.0_2014.09.26__DS-1582_Metadata_For_All_Objects.sql is a SQL migration script created for DSpace 5.x (V5.0) on Sept 26, 2014 (2014_09_24).  Its purpose was to fulfill the needs of ticket DS-1582, which was to migrate the database in order to support adding metadata on all objects.
    • More examples can be found under the org.dspace.storage.rdbms.sqlmigration package
  • Java migration script naming convention: V[version]_[date]__[description].java
    • E.g. V5_0_2014_09_25__DS_1582_Metadata_For_All_Objects_drop_constraint.java is a Java migration created for DSpace 5.x (V5_0) on Sept 25, 2014 (2014_09_25).  Its purpose was to fulfill the needs of ticket DS-1582, specifically to drop a few constraints.
    • More examples can be found under the org.dspace.storage.rdbms.migration package
  • Flyway will execute migrations in order, based on their Version and Date.  So, V1.x (or V1_x) scripts are executed first, followed by V3.0 (or V3_0), etc.  If two migrations have the same version number, the date is used to determine ordering (earlier dates are run first)

...

db.password

...

  • .

Bitstream Store

DSpace offers two means for storing content.

...

The BitstreamStorageManager provides low-level access to bitstreams stored in the system. In general, it should not be used directly; instead, use the Bitstream object in the content management API since that encapsulated encapsulates authorization and other metadata to do with a bitstream that are not maintained by the BitstreamStorageManager.

...