Problem Summary

Solr statistics records created before DSpace 6.x are still indexed with their legacy id.

The following PR will migrate legacy statistics records to use DSpace 6.x UUID values: https://github.com/DSpace/DSpace/pull/1774

Because this PR requires a Solr schema change, we have deferred this fix until DSpace 7.x.

Interim Solution for DSpace 6.x

DSpace 6.0 included code that searches for either a UUID or a legacy id when displaying usage records, but this logic was incomplete.

The following PR addresses that issue: https://github.com/DSpace/DSpace/pull/1782
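As a rough illustration of the "UUID or legacy id" lookup described above, the sketch below builds a Solr query string that matches a record under either identifier. It is not the code from the PR: the class `EitherIdQuery`, the method `build`, and the field names `type` and `id` are assumptions made for this example.

```java
import java.util.UUID;

// Hypothetical helper, not part of DSpace.
class EitherIdQuery {
    // Builds a query that matches statistics records for one object,
    // whether they were indexed under its UUID or its legacy integer id.
    // The Solr field names "type" and "id" are assumed for illustration.
    static String build(int type, UUID uuid, Integer legacyId) {
        StringBuilder q = new StringBuilder();
        q.append("type:").append(type).append(" AND (id:").append(uuid);
        if (legacyId != null) {
            // Objects created before DSpace 6.x also carry a legacy id.
            q.append(" OR id:").append(legacyId);
        }
        return q.append(")").toString();
    }
}
```

Objects created in DSpace 6.x or later have no legacy id, so passing `null` yields a query on the UUID alone.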

Unfortunately, when statistics records are retrieved by facet query, separate counts are returned for records indexed by legacyId and records indexed by UUID.
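To make the split counts concrete, here is a minimal sketch of one way the two facet buckets for the same object could be combined after the query returns. This is only an illustration, not what DSpace 6.x does. The class `FacetCountMerger` and the `legacyToUuid` lookup map are hypothetical; a real implementation would have to resolve legacy ids through the DSpace object services.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of DSpace.
class FacetCountMerger {
    // facetCounts: facet value (UUID string or legacy id string) -> hit count.
    // legacyToUuid: assumed lookup from legacy id string to UUID string.
    static Map<String, Long> merge(Map<String, Long> facetCounts,
                                   Map<String, String> legacyToUuid) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : facetCounts.entrySet()) {
            // Map a legacy id to its UUID; leave UUID keys unchanged.
            String key = legacyToUuid.getOrDefault(e.getKey(), e.getKey());
            // Sum the counts that resolve to the same object.
            merged.merge(key, e.getValue(), Long::sum);
        }
        return merged;
    }
}
```

With a bucket of 5 views under legacy id `42` and a bucket of 3 views under the matching UUID, the merged map holds a single entry of 8 views for that UUID.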

Item Page

In the following record, note that only one bitstream is present.

Usage Statistics

The resulting usage information lists the item twice and the bitstream twice. A modification has been made to annotate the legacy statistics records.

Item Statistics

Bitstream Statistics

A fix for this bug would require changes to the following code. Much of this code is quite old and appears to need significant refactoring.

https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/statistics/content/StatisticsDataVisits.java#L204-L399

I recommend that we create a new ticket to refactor this code in DSpace 7.x. Perhaps the changes for DS-3602 should be merged in the meantime.
