Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Dedocumented old, removed spider list file

...

Property:

server

Example Values:

server = http://127.0.0.1/solr/statistics
server = ${solr.server}/statistics

Informational Note:

Is used by the SolrLogger Client class to connect to the Solr server over http and perform updates and queries. In most cases, this can (and should) be set to localhost (or 127.0.0.1).

To determine the correct path, you can use a tool like wget to see where Solr is responding on your server. For example, you'd want to send a query to Solr like the following:

Code Block
wget http://127.0.0.1/solr/statistics/select?q=*:*

Assuming you get an HTTP 200 OK response, then you should set solr.log.server to the '/statistics' URL of 'http://127.0.0.1/solr/statistics' (essentially removing the "/select?q=:" query off the end of the responding URL.)

  

Property:

query.filter.bundles

Example
Value:

query.filter.bundles=ORIGINAL

Informational
Note:

A comma seperated list that contains the bundles for which the file statistics will be displayed.

  

Property:

solr.statistics.query.filter.spiderIp

Example Value:

solr.statistics.query.filter.spiderIp = false

Informational Note:

If true, statistics queries will filter out spider IPs -- use with caution, as this often results in extremely long query strings.

  

Property:

solr.statistics.query.filter.isBot

Example Value:

solr.statistics.query.filter.isBot = true

Informational Note:

If true, statistics queries will filter out events flagged with the "isBot" field. This is the recommended method of filtering spiders from statistics.

  

Property:

spiderips.urls

Example Value:

spiderips.urls =

Code Block
http://iplists.com/google.txt, \
http://iplists.com/inktomi.txt, \
http://iplists.com/lycos.txt, \
http://iplists.com/infoseek.txt, \
http://iplists.com/altavista.txt, \
http://iplists.com/excite.txt, \
http://iplists.com/misc.txt, \
http://iplists.com/non_engines.txt

Informational Note:

List of URLs to download spiders files into [dspace]/config/spiders. These files contain lists of known spider IPs and are utilized by the SolrLogger to flag usage events with an "isBot" field, or ignore them entirely.

The "stats-util" command can be used to force an update of spider files, regenerate "isBot" fields on indexed events, and delete spiders from the index. For usage, run:

Code Block
dspace stats-util -h

from your [dspace]/bin directory

...