DSpace Documentation


Comment: Mentioned using an externally-provided GeoLite database

...

In the {dspace.dir}/config/modules/solr-statistics.cfg file review the following fields. These fields can be edited in place, or overridden in your own local.cfg config file (see Configuration Reference).

Property:

solr-statistics.server

Example Values:

solr-statistics.server = http://127.0.0.1/solr/statistics
solr-statistics.server = ${solr.server}/statistics

Informational Note:

Used by the SolrLogger client class to connect to the Solr server over HTTP and perform updates and queries. In most cases, this can (and should) be set to localhost (or 127.0.0.1).

To determine the correct path, you can use a tool like wget to see where Solr is responding on your server. For example, you'd want to send a query to Solr like the following:

Code Block
wget "http://127.0.0.1/solr/statistics/select?q=*:*"

Assuming you get an HTTP 200 OK response, you should set solr-statistics.server to the '/statistics' URL, 'http://127.0.0.1/solr/statistics' (essentially removing the "/select?q=*:*" query from the end of the responding URL).
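The trimming step described above can be sketched in Python. This is only an illustration: `server_from_select_url` is a hypothetical helper, not part of DSpace, and it assumes the responding URL ends in the `/select` handler.

```python
from urllib.parse import urlsplit, urlunsplit


def server_from_select_url(select_url):
    """Strip the trailing '/select' handler and the query string from a Solr URL."""
    parts = urlsplit(select_url)
    path = parts.path
    if path.endswith("/select"):
        path = path[: -len("/select")]
    # Drop the query and fragment; keep the scheme, host and core path.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


# The URL that answered with HTTP 200 OK in the wget check.
print(server_from_select_url("http://127.0.0.1/solr/statistics/select?q=*:*"))
# prints http://127.0.0.1/solr/statistics
```

The printed value is what would go into solr-statistics.server.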

  

Property:

solr-statistics.query.filter.bundles

Example Value:

solr-statistics.query.filter.bundles=ORIGINAL

Informational Note:

A comma-separated list of the bundles for which file statistics will be displayed.

  

Property:

solr-statistics.query.filter.spiderIp

Example Value:

solr-statistics.query.filter.spiderIp = false

Informational Note:

If true, statistics queries will filter out spider IPs -- use with caution, as this often results in extremely long query strings.

  

Property:

solr-statistics.query.filter.isBot

Example Value:

solr-statistics.query.filter.isBot = true

Informational Note:

If true, statistics queries will filter out events flagged with the "isBot" field. This is the recommended method of filtering spiders from statistics.
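To see why filtering by spider IP tends to produce very long query strings while the "isBot" flag does not, consider the rough sketch below. The filter syntax and the `ip` field name are assumptions chosen for illustration, and both helper functions are hypothetical; DSpace may build its queries differently.

```python
def spider_ip_filter(spider_ips):
    """Exclude events whose (assumed) 'ip' field matches any known spider IP."""
    return "-ip:(" + " OR ".join(spider_ips) + ")"


def is_bot_filter():
    """Exclude events that were already flagged with the isBot field."""
    return "-isBot:true"


# 254 made-up addresses from a documentation-only range.
ips = ["192.0.2.%d" % i for i in range(1, 255)]

# The IP-based filter grows with every listed address;
# the isBot filter stays the same length however many spiders are known.
print(len(spider_ip_filter(ips)), len(is_bot_filter()))
```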

  

Property:

solr-statistics.spiderips.urls

Example Value:

Code Block
solr-statistics.spiderips.urls = http://iplists.com/google.txt, \
                                 http://iplists.com/inktomi.txt, \
                                 http://iplists.com/lycos.txt, \
                                 http://iplists.com/infoseek.txt, \
                                 http://iplists.com/altavista.txt, \
                                 http://iplists.com/excite.txt, \
                                 http://iplists.com/misc.txt


Informational Note:

List of URLs from which spider files are downloaded into [dspace]/config/spiders. These files contain lists of known spider IPs and are used by the SolrLogger to flag usage events with an "isBot" field, or to ignore them entirely.

The "stats-util" command can be used to force an update of spider files, regenerate "isBot" fields on indexed events, and delete spiders from the index. For usage, run the following from your [dspace]/bin directory:

Code Block
dspace stats-util -h

 

In the {dspace.dir}/config/modules/usage-statistics.cfg file review the following fields. These fields can be edited in place, or overridden in your own local.cfg config file (see Configuration Reference).

Property:

usage-statistics.dbfile

Example Value:

usage-statistics.dbfile = ${dspace.dir}/config/GeoLiteCity.dat

Informational Note:

This is the GeoLiteCity database file used by LocationUtils to determine the location of client requests from their IP addresses. During the Ant build process (both fresh_install and update), this file is downloaded from http://www.maxmind.com/app/geolitecity if a new version has been published or if it is absent from your [dspace]/config directory.

  

Property:

usage-statistics.resolver.timeout

Example Value:

usage-statistics.resolver.timeout = 200

Informational Note:

Timeout in milliseconds for DNS resolution of origin hosts/IPs. Setting this value too high may result in Solr exhausting your connection pool.
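The effect of such a timeout can be illustrated with a small Python sketch. It is not DSpace's resolver: `reverse_lookup` is a hypothetical helper, and the `resolver` parameter exists only so the lookup can be swapped out. The idea is that a lookup which takes longer than the configured number of milliseconds is abandoned rather than left to hold up the caller.

```python
import concurrent.futures
import socket


def reverse_lookup(ip, timeout_ms=200, resolver=socket.gethostbyaddr):
    """Return the host name for ip, or None if resolution fails or times out."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolver, ip)
    try:
        # gethostbyaddr returns (hostname, aliases, addresses); keep the name.
        return future.result(timeout=timeout_ms / 1000.0)[0]
    except (concurrent.futures.TimeoutError, OSError):
        # Timed out, or the address could not be resolved.
        return None
    finally:
        # Do not block the caller while a slow lookup finishes.
        executor.shutdown(wait=False)
```

A call such as `reverse_lookup("192.0.2.1", timeout_ms=200)` waits at most about 200 ms before giving up. A larger value lets each slow lookup occupy its caller for longer.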

  

Property:

useProxies  (Set in dspace.cfg)

Example Value:

useProxies = true

Informational Note:

Causes statistics logging to look for the X-Forwarded-For header in order to detect the IP address of clients that access DSpace through a proxy service (e.g. Apache mod_proxy). [Note: This setting is found in the DSpace Logging section of dspace.cfg]
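As a rough sketch of what proxy-aware client detection involves, the Python below picks a client IP from a request. It assumes the header in question is the standard X-Forwarded-For header and takes the left-most entry as the client, which is a common convention. `client_ip` is a hypothetical function, and DSpace's own selection rule may differ.

```python
def client_ip(remote_addr, headers, use_proxies):
    """Pick the client IP, honouring X-Forwarded-For only when use_proxies is set."""
    forwarded = headers.get("X-Forwarded-For")
    if use_proxies and forwarded:
        # The header is a comma-separated list; by convention the
        # left-most entry is the original client (an assumption here).
        first = forwarded.split(",")[0].strip()
        if first:
            return first
    # Without proxy support, only the directly connected address is known.
    return remote_addr


headers = {"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}
print(client_ip("10.0.0.1", headers, use_proxies=True))   # prints 203.0.113.7
print(client_ip("10.0.0.1", headers, use_proxies=False))  # prints 10.0.0.1
```

With useProxies disabled, every request arriving through the proxy would be attributed to the proxy's own address.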

  

Property:

usage-statistics.authorization.admin.usage

Example Value:

usage-statistics.authorization.admin.usage = true

Informational Note:

When set to true, only general administrators and collection and community administrators can access the pageview and download statistics from the web user interface. As a result, the links to the statistics are hidden from anyone who is not logged in as an administrator. Setting this property to "false" displays the links to everyone, making the statistics publicly available.

  

Property:

usage-statistics.authorization.admin.search

Example Value:

usage-statistics.authorization.admin.search = true

Informational Note:

When set to true, only system, collection or community administrators are able to access statistics on search queries. 
  

Property:

usage-statistics.authorization.admin.workflow

Example Value:

usage-statistics.authorization.admin.workflow = true

Informational Note:

When set to true, only system, collection or community administrators are able to access statistics on workflow events.
  

Property:

usage-statistics.logBots

Example Value:

usage-statistics.logBots = true

Informational Note:

When this property is set to false and an IP is detected as a spider, the event is not logged.
When this property is set to true, the event will be logged with the "isBot" field set to true.
(see solr-statistics.query.filter.* for query filter options)
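The decision described above can be summarised in a few lines of Python. This is a hedged sketch rather than DSpace code: `build_event` and the event dictionary are hypothetical, and setting "isBot" to false for non-spider traffic is an assumption.

```python
def build_event(ip, spider_ips, log_bots):
    """Return the event to index, or None if the event should be dropped."""
    is_bot = ip in spider_ips
    if is_bot and not log_bots:
        # logBots = false: spider traffic is not logged at all.
        return None
    # logBots = true: spider traffic is kept but flagged, so that
    # queries can later filter on the isBot field.
    return {"ip": ip, "isBot": is_bot}


spiders = {"192.0.2.10"}
print(build_event("192.0.2.10", spiders, log_bots=True))   # flagged event
print(build_event("192.0.2.10", spiders, log_bots=False))  # None
```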

Pre-1.6 Statistics settings

...

Code Block
<lst name="facet_counts">
    <lst name="facet_fields">
        <lst name="epersonid">
            <int name="66">1167</int>
            <int name="117">251</int>
            <int name="52">42</int>
            <int name="19">36</int>
            <int name="88">20</int>
            <int name="112">18</int>
            <int name="110">9</int>
            <int name="96">0</int>
        </lst>
    </lst>
</lst>
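A response fragment like the one above can be turned into a simple mapping with Python's standard XML parser. The sketch below uses a shortened copy of the sample. `facet_counts` is a hypothetical helper, and it assumes the fragment is available as a well-formed XML string.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<lst name="facet_counts">
    <lst name="facet_fields">
        <lst name="epersonid">
            <int name="66">1167</int>
            <int name="117">251</int>
            <int name="96">0</int>
        </lst>
    </lst>
</lst>
"""


def facet_counts(xml_text, field):
    """Map each facet value of the given field to its hit count."""
    root = ET.fromstring(xml_text)
    counts = {}
    for lst in root.iter("lst"):
        if lst.get("name") == field:
            for item in lst.findall("int"):
                counts[item.get("name")] = int(item.text)
    return counts


print(facet_counts(SAMPLE, "epersonid"))
# prints {'66': 1167, '117': 251, '96': 0}
```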

...

Managing the GeoLite Database File

The GeoLite Database file (at [dspace]/config/GeoLiteCity.dat) is used by the Statistics engine to generate location/country based reports. (Note: If you are not using DSpace Statistics, this file is not needed.)

This file can be installed automatically when you run ant fresh_install. However, if the file cannot be downloaded and installed automatically, you may need to install it manually.

Alternatively, DSpace can be configured to use a GeoLite City database file that you already have and maintain by other means. Edit [dspace]/config/local.cfg (or [dspace]/config/modules/usage-statistics.cfg) and change the usage-statistics.dbfile path to point to a shared copy of the database.

MaxMind.com periodically updates this file, so you may wish to refresh your copy on occasion. At the time of writing, the database is updated monthly.

You have three options to install/update this file:

...