Contribute to the DSpace Development Fund

The newly established DSpace Development Fund supports the development of new features prioritized by DSpace Governance. For a list of planned features see the fund wiki page.


Many organisational DSpace users run reports on DSpace not from within DSpace itself, but from within an external reporting system (Business Intelligence, JasperReports, etc) which connect to the DSpace database via ODBC. Reports may be run on an external reporting system rather than DSpace to allow integrated reporting across a number of systems, to allow integration with other organisational systems, to exploit existing reporting skills, or for other reasons. Typically the connection is read-only, this allows both better security (since it limits the damage that an incompetent or malicious user can do) and better performance (since read-only accesses to tables typically require less locking or resource-use in most databases).

Note that all steps here assume that you have configured DSpace, the database and the reporting tool to use UTF-8.

Setting up External Reporting

There are typically four steps to connecting DSpace to an external reporting system. Creating a database user with appropriate rights, allowing database connections from the remote system, opening firewall ports and creating the connection string.

Creating the user

Postgres

code to create a new user
CREATE ROLE reporting LOGIN PASSWORD 'SECRET';
GRANT CONNECT ON DATABASE dspace to reporting;
GRANT USAGE ON SCHEMA public TO reporting;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporting

Oracle

TODO

(this may help: DSpace with Oracle DB test instance)

Enabling remote database connections

Postgres

The trick is to edit pg_hba.conf to access from the smallest possible range of IP addresses:

line to add to pg_hba.conf
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
host    dspace      reporting   10.123.0.0/64         md5

You also need to make sure the Postgres daemon is listening on an IP address on the Postgres server that can be connected to from wherever you want to run your SQL client. Make sure that your "localhost" address is also included if you run postgres on the same machine as DSpace. The '*' value will make Postgres listen on all local IP addresses:

edit listen_addresses in postgresql.conf
listen_addresses = '*'

Oracle

TODO

Opening firewall ports

In a complex environment, there are likely to be firewalls between the machine hosting the database for DSpace and the machine(s) hosting the reporting software. These firewalls need to be opened on port 5432 (assuming the default postgres port).

Creating the connection string

Most reporting systems use URL-like "connection strings" to connect to databases.

Postgres

connection string for postgres
jdbc:postgresql://HOSTNAME:5432/dspace

Oracle

connection string for oracle
jdbc:oracle:thin:@//HOSTNAME:5432/dspace


Creating reports

ODBC-based reporting tools work by running one or more SQL query against the underlying DSpace database. You can find some examples in Unable to locate Jira server for this macro. It may be due to Application Link configuration. which illustrate the kinds of SQL joins that you'll be looking at using. Note the use of item.in_archive = 't' and item.withdrawn = 'f' to check whether the item has been deleted or withdrawn respectively. A diagram of the DSpace database schema is included in the Storage Layer section of the DSpace documentation. The SQL itself is stored in database-specific subdirectories of https://github.com/DSpace/DSpace/tree/master/dspace/etc

Note that some data (such as access stats) aren't held in the SQL database but in solr. Access details for these can be found at Solr#AccessingSolr.

Reusing reports

Do you have some reports that you might be willing to share? if so please log into upload them.


This is based on on Unable to locate Jira server for this macro. It may be due to Application Link configuration. "support for connecting to institutional reporting tools"

  • No labels