Each time a page or file is requested, the request is logged. Logging happens on the server side and, unlike Google Analytics, does not require JavaScript to provide usage data.
The fields to be stored are defined in the file dspace/solr/statistics/conf/schema.xml.
The fields stored in a usage event by default are described below.
DSpace 1.6 and newer
Since the introduction of Solr statistics logging in DSpace 1.6, every pageview and file download is logged in a dedicated Solr statistics core.
DSpace 3.0 and newer
In addition to the existing logging of pageviews and downloads, DSpace 3.0 also logs workflow events and the search queries that users enter in the DSpace search dialog.
Warning | ||
---|---|---|
| ||
Because Discovery was only very recently added to the JSPUI for search and faceted browsing, Discovery search queries are not yet logged there. Regular (non-Discovery) search queries are logged in the JSPUI. |
Warning | ||
---|---|---|
| ||
Only workflow events initiated and executed by a physical user are logged. Automated workflow steps and ingest procedures are currently not logged by the workflow events logger. |
Although they are stored in the same index, the stored fields for views, search queries and workflow events are different. A new field, statistics_type, determines which kind of usage event you are dealing with. The three possible values for this field are view, search and workflow.
Code Block | ||
---|---|---|
| ||
<field name="statistics_type" type="string" indexed="true" stored="true" required="true" default="view" /> |
Code Block | ||
---|---|---|
| ||
<field name="type" type="integer" indexed="true" stored="true" required="true" />
<field name="id" type="integer" indexed="true" stored="true" required="true" />
<field name="ip" type="string" indexed="true" stored="true" required="false" />
<field name="time" type="date" indexed="true" stored="true" required="true" />
<field name="epersonid" type="integer" indexed="true" stored="true" required="false" />
<field name="continent" type="string" indexed="true" stored="true" required="false"/>
<field name="country" type="string" indexed="true" stored="true" required="false"/>
<field name="countryCode" type="string" indexed="true" stored="true" required="false"/>
<field name="city" type="string" indexed="true" stored="true" required="false"/>
<field name="longitude" type="float" indexed="true" stored="true" required="false"/>
<field name="latitude" type="float" indexed="true" stored="true" required="false"/>
<field name="owningComm" type="integer" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="owningColl" type="integer" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="owningItem" type="integer" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="dns" type="string" indexed="true" stored="true" required="false"/>
<field name="userAgent" type="string" indexed="true" stored="true" required="false"/>
<field name="isBot" type="boolean" indexed="true" stored="true" required="false"/>
<field name="referrer" type="string" indexed="true" stored="true" required="false"/>
<field name="uid" type="uuid" indexed="true" stored="true" default="NEW" />
<field name="statistics_type" type="string" indexed="true" stored="true" required="true" default="view" /> |
The combination of type and id determines which resource (either community, collection, item page or file download) has been requested.
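As a rough illustration of how type, id and statistics_type work together, the sketch below builds the query string for a Solr select request that matches the view events of a single resource. The helper name, the type constant 2 for items and the id 123 are assumptions made for this example and are not taken from this page; check the constants used by your DSpace version.

```python
from urllib.parse import urlencode

# Assumed resource type constant for items; verify against your DSpace version.
ITEM_TYPE = 2


def build_view_query(resource_type: int, resource_id: int) -> str:
    """Return the query string of a Solr select request that matches
    the view events of one resource."""
    params = [
        ("q", "*:*"),
        ("fq", "statistics_type:view"),   # only views, no search or workflow events
        ("fq", f"type:{resource_type}"),  # kind of resource
        ("fq", f"id:{resource_id}"),      # which resource of that kind
        ("rows", "0"),                    # only the hit count is needed
    ]
    return urlencode(params)


if __name__ == "__main__":
    # Append the output to "<statistics core URL>/select?" to run the query.
    print(build_view_query(ITEM_TYPE, 123))
```

Keeping each condition in its own fq parameter keeps the main query trivial and lets Solr cache the individual filters.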
Code Block | ||
---|---|---|
| ||
<field name="bundleName" type="string" indexed="true" stored="true" required="false" multiValued="true" /> |
Code Block | ||
---|---|---|
| ||
<field name="query" type="string" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="scopeType" type="integer" indexed="true" stored="true" required="false" />
<field name="scopeId" type="integer" indexed="true" stored="true" required="false" />
<field name="rpp" type="integer" indexed="true" stored="true" required="false" />
<field name="sortBy" type="string" indexed="true" stored="true" required="false" />
<field name="sortOrder" type="string" indexed="true" stored="true" required="false" />
<field name="page" type="integer" indexed="true" stored="true" required="false" /> |
Code Block | ||
---|---|---|
| ||
<field name="workflowStep" type="string" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="previousWorkflowStep" type="string" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="owner" type="string" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="submitter" type="integer" indexed="true" stored="true" required="false" />
<field name="actor" type="integer" indexed="true" stored="true" required="false" />
<field name="workflowItemId" type="integer" indexed="true" stored="true" required="false" /> |
In the XMLUI, pageview and download statistics can be accessed from the lower end of the navigation menu. In the JSPUI, a view statistics button appears at the bottom of pages for which statistics are available.
If you are not seeing these links or buttons, it's likely that they are only enabled for administrators in your installation. Change the configuration parameter "authorization.admin.usage" in usage-statistics.cfg to false in order to make statistics visible for all repository visitors.
Starting from the repository homepage, the statistics page displays the top 10 most popular items of the entire repository.
The following statistics are available for the community home pages:
The following statistics are available for the collection home pages:
The following statistics are available for the item home pages:
In the XMLUI, search query statistics can be accessed from the lower end of the navigation menu.
If you are not seeing the link labelled "search statistics", it is likely that it is only enabled for administrators in your installation. Change the configuration parameter "authorization.admin.search" in usage-statistics.cfg to false in order to make search statistics visible for all repository visitors.
The dropdown on top of the page allows you to modify the time frame for the displayed statistics.
The Pageviews/Search column tracks the number of pages visited after a particular search term. A zero in this column therefore means that, after searching for that keyword, no user has clicked any result in the list.
If you are using Discovery, note that clicking the facets also counts as a search, because clicking a facet sends a search query to the Discovery index.
In the XMLUI, workflow statistics can be accessed from the lower end of the navigation menu.
If you are not seeing the link labelled "Workflow statistics", it is likely that it is only enabled for administrators in your installation. Change the configuration parameter "authorization.admin.workflow" in usage-statistics.cfg to false in order to make workflow statistics visible for all repository visitors.
The dropdown on top of the page allows you to modify the time frame for the displayed statistics.
The DSpace Statistics Implementation is a client/server architecture based on Solr for collecting usage events in the JSPUI and XMLUI user interface applications of DSpace. Solr runs as a separate web application, and an instance of Apache HttpClient is used to allow parallel requests to log statistics events into this Solr instance.
...
In the {dspace.dir}/config/modules/solr-statistics.cfg file, review the following fields to make sure they are uncommented:
Property: | server | ||
Example Value: | server = http://127.0.0.1/solr/statistics | ||
Informational Note: | Is used by the SolrLogger Client class to connect to the Solr server over http and perform updates and queries. In most cases, this can (and should) be set to localhost (or 127.0.0.1).
Assuming you get an HTTP 200 OK response, then you should set | ||
Property: | query.filter.bundles | ||
Example Value: | query.filter.bundles=ORIGINAL | ||
Informational Note: | A comma separated list of the bundles for which file statistics will be displayed. | ||
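To show how such a comma separated value could be interpreted, here is a small sketch that turns the setting into a filter clause on the bundleName field. The function name and the exact clause syntax are assumptions for illustration, not the code DSpace runs.

```python
def bundle_filter(setting: str) -> str:
    """Turn a comma separated bundle list into a filter clause on bundleName.

    Returns an empty string when no bundle is configured.
    """
    names = [name.strip() for name in setting.split(",") if name.strip()]
    if not names:
        return ""
    return "bundleName:(" + " OR ".join(names) + ")"


if __name__ == "__main__":
    print(bundle_filter("ORIGINAL"))
    print(bundle_filter("ORIGINAL, TEXT"))
```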
Property: | solr.statistics.query.filter.spiderIp | ||
Example Value: | solr.statistics.query.filter.spiderIp = false | ||
Informational Note: | If true, statistics queries will filter out spider IPs -- use with caution, as this often results in extremely long query strings. | ||
Property: | solr.statistics.query.filter.isBot | ||
Example Value: | solr.statistics.query.filter.isBot = true | ||
Informational Note: | If true, statistics queries will filter out events flagged with the "isBot" field. This is the recommended method of filtering spiders from statistics. | ||
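The difference between the two filtering options can be sketched as follows. This is a minimal illustration of why filtering on isBot stays short while filtering on spider IPs can produce very long query strings. The function name, the clause syntax and the sample addresses are assumptions, not DSpace's own implementation.

```python
from typing import Iterable, List


def build_spider_filters(filter_is_bot: bool,
                         filter_spider_ip: bool,
                         spider_ips: Iterable[str] = ()) -> List[str]:
    """Return filter clauses that exclude spider traffic from a query."""
    filters: List[str] = []
    if filter_is_bot:
        # A single clause, however many spiders are known.
        filters.append("-isBot:true")
    if filter_spider_ip:
        # One clause per known spider IP: the query grows with the list.
        filters.extend(f"-ip:{ip}" for ip in spider_ips)
    return filters


if __name__ == "__main__":
    print(build_spider_filters(True, False))
    print(build_spider_filters(False, True, ["192.0.2.1", "192.0.2.2"]))
```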
Property: | spiderips.urls | ||
Example Value: | spiderips.urls =
| ||
Informational Note: | List of URLs to download spiders files into [dspace]/config/spiders. These files contain lists of known spider IPs and are utilized by the SolrLogger to flag usage events with an "isBot" field, or ignore them entirely.
from your [dspace]/bin directory |
...
Property: | dbfile |
Example Value: | dbfile = ${dspace.dir}/config/GeoLiteCity.dat |
Informational Note: | This refers to the GeoLiteCity database file used by LocationUtils to calculate the location of client requests based on IP address. During the Ant build process (both fresh_install and update) this file will be downloaded from http://www.maxmind.com/app/geolitecity if a new version has been published or if it is absent from your [dspace]/config directory. |
Property: | resolver.timeout |
Example Value: | resolver.timeout = 200 |
Informational Note: | Timeout in milliseconds for DNS resolution of origin hosts/IPs. Setting this value too high may result in solr exhausting your connection pool. |
Property: | useProxies |
Example Value: | useProxies = true |
Informational Note: | Causes statistics logging to look at the X-Forwarded-For header to detect the IP address of clients that access DSpace through a proxy service (e.g. Apache mod_proxy). [Note: This setting is found in the DSpace Logging section of dspace.cfg] |
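The idea behind useProxies can be sketched as below: when the setting is on, the address recorded for a request is taken from a proxy header instead of the connection itself. The header name X-Forwarded-For, the function name and the choice of the first entry in the header are assumptions for this example; DSpace's own selection logic may differ.

```python
from typing import Mapping


def client_ip(headers: Mapping[str, str], remote_addr: str, use_proxies: bool) -> str:
    """Pick the address to record for a request."""
    if use_proxies:
        forwarded = headers.get("X-Forwarded-For", "")
        # The header may hold a comma separated chain of addresses;
        # the first entry is conventionally the original client.
        first = forwarded.split(",")[0].strip()
        if first:
            return first
    # No proxy handling, or no usable header: use the peer address.
    return remote_addr


if __name__ == "__main__":
    print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.1"}, "10.0.0.1", True))
```

Without this handling, every request that passes through the proxy would be recorded with the proxy's own address.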
Property: | authorization.admin.usage |
Example Value: | authorization.admin.usage = true |
Informational Note: | When set to true, only general administrators and collection and community administrators are able to access the pageview and download statistics from the web user interface. As a result, the links to access statistics are hidden from users who are not logged in as administrators. Setting this property to "false" displays the links to anyone, making the statistics publicly available. |
Property: | authorization.admin.search |
Example Value: | authorization.admin.search = true |
Informational Note: | When set to true, only system, collection or community administrators are able to access statistics on search queries. |
Property: | authorization.admin.workflow |
Example Value: | authorization.admin.workflow = true |
Informational Note: | When set to true, only system, collection or community administrators are able to access statistics on workflow events. |
Property: | logBots |
Example Value: | logBots = true |
Informational Note: | When this property is set to false and an IP is detected as a spider, the event is not logged. |
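The interaction between spider detection and logBots can be summarised in a few lines. The following is a simplified sketch of the decision described above, using a plain set of spider IPs. The function name and the event layout are assumptions and do not mirror the SolrLogger code.

```python
from typing import Dict, Optional, Set


def prepare_event(ip: str, spider_ips: Set[str], log_bots: bool) -> Optional[Dict[str, object]]:
    """Return the event to store, or None when it should be dropped."""
    is_spider = ip in spider_ips
    if is_spider and not log_bots:
        # logBots = false: spider traffic is not logged at all.
        return None
    # Otherwise the event is kept and carries the isBot flag.
    return {"ip": ip, "isBot": is_spider}


if __name__ == "__main__":
    known_spiders = {"192.0.2.1"}
    print(prepare_event("192.0.2.1", known_spiders, log_bots=False))
    print(prepare_event("192.0.2.1", known_spiders, log_bots=True))
```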
Older versions of DSpace featured static reports generated from the log files. They still persist in DSpace today but are completely independent from the SOLR based statistics.
The following configuration parameters applicable to these reports can be found in dspace.cfg.
Code Block |
---|
###### Statistical Report Configuration Settings ######
# should the stats be publicly available? should be set to false if you only
# want administrators to access the stats, or you do not intend to generate
# any
report.public = false
# directory where live reports are stored
report.dir = ${dspace.dir}/reports/
|
These fields are not used by the new 1.6 Statistics; they relate only to the statistics from previous DSpace releases.
Example of rebuilding and redeploying DSpace (only if you have configured your distribution in this manner)
First approach: the traditional DSpace build process for updating
Code Block |
---|
cd [dspace-source]/dspace
mvn package
cd [dspace-source]/dspace/target/dspace-<version>-build.dir
ant -Dconfig=[dspace]/config/dspace.cfg update
cp -R [dspace]/webapps/* [TOMCAT]/webapps
|
The last step is only used if you do not follow the recommended practice of configuring [dspace]/webapps as location for webapps in your servlet container (Tomcat, Resin or Jetty). If you only need to build the statistics, and don't make any changes to other web applications, you can replace the copy step above with:
Code Block |
---|
cp -R [dspace]/webapps/solr [TOMCAT]/webapps
|
Again, this applies only if you are not mounting [dspace]/webapps directly into your Tomcat, Resin or Jetty host (the recommended practice).
Restart your webapps (Tomcat/Jetty/Resin).
...
The command line interface (CLI) scripts can be used to remove additional spider traffic from the usage database and to perform other maintenance tasks. In DSpace 3.0, a script has been added to split up the monolithic Solr core into individual cores, each containing one year of statistics.
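Conceptually, splitting by year means assigning each usage event to a core based on its time field. The sketch below shows only that grouping step on in-memory data. The core naming pattern statistics-YYYY, the function name and the sample events are assumptions for illustration; the actual script works on the Solr cores themselves.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List


def group_by_year(events: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group usage events by the year of their 'time' value."""
    cores: Dict[str, List[dict]] = defaultdict(list)
    for event in events:
        # Solr dates look like 2012-03-01T10:15:00Z.
        year = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ").year
        cores[f"statistics-{year}"].append(event)  # hypothetical core name
    return dict(cores)


if __name__ == "__main__":
    sample = [
        {"time": "2011-12-31T23:59:59Z"},
        {"time": "2012-01-01T00:00:00Z"},
    ]
    for core, items in sorted(group_by_year(sample).items()):
        print(core, len(items))
```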
...
Modify line 205 in the StatisticsTransformer.java file
https://github.com/DSpace/DSpace/blob/dspace-3_x/dspace-xmlui/dspace-xmlui-api/src/main/java/org/dspace/app/xmlui/aspect/statistics/StatisticsTransformer.java#L205
-6 is the default setting, displaying the past 6 months of statistics. Changing it to a value closer to zero (for example -3) displays fewer months.
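To make the effect of the offset concrete, the sketch below computes the first month that falls inside the window for a given negative offset. It only illustrates the arithmetic under the assumption that the offset counts whole calendar months back from the current month; the function name is hypothetical and this is not the code in StatisticsTransformer.java.

```python
from datetime import date


def window_start(today: date, offset: int) -> date:
    """Return the first day of the month that lies `offset` months
    from `today` (offset is negative for the past, e.g. -6)."""
    # Count months from year 0 so that year boundaries need no special case.
    index = today.year * 12 + (today.month - 1) + offset
    return date(index // 12, index % 12 + 1, 1)


if __name__ == "__main__":
    print(window_start(date(2013, 3, 15), -6))
    print(window_start(date(2013, 3, 15), -3))
```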
...