...
Watch the DSpace Discovery introduction video
Info |
---|
Since DSpace 6.0, Discovery is the only out-of-the-box Search and Browse infrastructure provided in DSpace. |
From the user perspective, faceted search (also called faceted navigation, guided navigation, or parametric search) breaks up search results into multiple categories, typically showing counts for each, and allows the user to "drill down" or further restrict their search results based on those facets.
...
This is a classic "tag cloud" facet in a DSpace repository.
The legacy search engine (based on Apache Lucene) and the legacy Browse system (based on database tables) have been removed in DSpace 6.0 and above. DSpace now uses only Discovery for all Search/Browse capabilities.
In addition, to support the new Configuration options, all of the Discovery configurations in discovery.cfg have been prefixed with "discovery." (see configuration below).
The new JSPUI-only tag cloud facet feature is disabled by default. In order to enable it, you will need to set up the corresponding processor that the PluginManager will load to actually perform the tag cloud query on the relevant pages. This is configured in the dspace.cfg configuration file using the following properties:
...
...
Because Discovery was adopted as the default infrastructure for search and browse in DSpace 4, no manual steps are required to enable Discovery. If you want to enable Discovery on older versions of DSpace, please refer to the DSpace documentation for that particular version.
If you have upgraded from an older version of DSpace, your database may still include outdated "bi_*" tables (where "bi" = "browse index"). When Discovery is enabled, these tables are no longer necessary, as Discovery takes over this browse index function.
To clean up all these old "bi_*" tables, simply run:
Code Block |
---|
[dspace]/bin/dspace index-db-browse -f -d |
The configuration for Discovery is located in two separate files:
- The discovery.cfg file is located in the [dspace-install-dir]/config/modules directory.
- The discovery.xml file is located in the [dspace-install-dir]/config/spring/api/ directory.
config/modules/discovery.cfg
The discovery.cfg file is located in the [dspace]/config/modules directory and contains the following properties. Any of these properties may be overridden in your local.cfg (see Configuration Reference):
Property: | discovery.search.server | ||
Example Value: |
| ||
Informational Note: | Discovery relies on a Solr index for storage and retrieval of its information. This parameter determines the location of the Solr index. If you are uncertain whether this property is set correctly, you can use a command-line tool like "wget" to perform a query against the Solr index (and ensure Solr responds). For example, the below query searches the Solr index for "test" and returns the response on standard out:
| ||
Property: | discovery.index.authority.ignore[.field] | ||
Example Value: |
| ||
Informational Note: | By default, Discovery will use the authority information in the metadata to disambiguate homonyms. Setting this property to false will make the indexing process behave as if the metadata did not include authority information. The configuration can differ on a per-field (<schema>.<element>.<qualifier>) basis; the property without a field sets the default value. | ||
Property: | discovery.index.authority.ignore-prefered[.field] | ||
Example Value: |
| ||
Informational Note: | By default, Discovery will use the authority information in the metadata to query the authority for the preferred label. Setting this property to false will make the indexing process behave as if the metadata did not include authority information (i.e. the preferred form is the one recorded in the metadata value). The configuration can differ on a per-field (<schema>.<element>.<qualifier>) basis; the property without a field sets the default value. If the authority is a remote service, disabling this feature can greatly improve performance. | ||
Property: | discovery.index.authority.ignore-variants[.field] | ||
Example Value: |
| ||
Informational Note: | By default, Discovery will use the authority information in the metadata to query the authority for variants. Setting this property to false will make the indexing process behave as if the metadata did not include authority information. The configuration can differ on a per-field (<schema>.<element>.<qualifier>) basis; the property without a field sets the default value. If the authority is a remote service, disabling this feature can greatly improve performance. |
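The Solr check described for the search server property above can be sketched as a small shell helper. This is only an illustration: the base URL (a Solr search core served at localhost:8080) and the function name solr_query_url are assumptions, so substitute the Solr location from your own configuration.

```shell
#!/bin/sh
# Assumption: the Discovery Solr core is reachable at this base URL.
# Replace it with the Solr server value from your discovery.cfg or local.cfg.
SOLR_SERVER="${SOLR_SERVER:-http://localhost:8080/solr/search}"

# Build the URL of a Solr "select" query for a single search term.
# (Hypothetical helper name; the term is used as-is, without URL encoding.)
solr_query_url() {
    printf '%s/select?q=%s\n' "$SOLR_SERVER" "$1"
}

# Show the URL that searches the index for "test".
solr_query_url test
```

To run the query, pass the printed URL to wget, for example wget -q -O - "$(solr_query_url test)". Any response document printed on standard out indicates that Solr is answering at that location.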
...
The discovery.xml file is located in the [dspace-install-dir]/config/spring/api directory.
...
The hit highlighting configuration element contains all settings necessary to display search snippets & enable hit highlighting.
Warning |
---|
This section only applies to XMLUI. JSPUI does not currently support "highlighting & search snippets". |
Info |
---|
You can disable hit highlighting / search snippets by commenting out the entire <property name="hitHighlightingConfiguration"> definition in discovery.xml. PLEASE BE AWARE there are two sections where this <property> definition exists. You should comment out both. One is under the defaultConfiguration bean and the other is under the homepageConfiguration bean. Alternatively, you may choose to tweak which fields are shown in hit highlighting, or modify the number of matching words shown (snippets) and/or the number of characters shown around the matching word (maxSize). For this change to take effect in the User Interface, you will need to restart Tomcat. |
Note |
---|
Changes made to the configuration will not automatically be displayed in the user interface. By default, only the following fields are displayed: dc.title, dc.contributor.author, dc.creator, dc.contributor, dc.date.issued, dc.publisher, dc.description.abstract and fulltext. If additional fields are required, look for the "itemSummaryList" template. |
Below is an example configuration of hit highlighting.
Code Block | ||
---|---|---|
| ||
<property name="hitHighlightingConfiguration">
    <bean class="org.dspace.discovery.configuration.DiscoveryHitHighlightingConfiguration">
        <property name="metadataFields">
            <list>
                <bean class="org.dspace.discovery.configuration.DiscoveryHitHighlightFieldConfiguration">
                    <property name="field" value="dc.title"/>
                    <property name="snippets" value="5"/>
                </bean>
                <bean class="org.dspace.discovery.configuration.DiscoveryHitHighlightFieldConfiguration">
                    <property name="field" value="dc.contributor.author"/>
                    <property name="snippets" value="5"/>
                </bean>
                <bean class="org.dspace.discovery.configuration.DiscoveryHitHighlightFieldConfiguration">
                    <property name="field" value="dc.subject"/>
                    <property name="snippets" value="5"/>
                </bean>
                <bean class="org.dspace.discovery.configuration.DiscoveryHitHighlightFieldConfiguration">
                    <property name="field" value="dc.description.abstract"/>
                    <!-- Max number of characters to display around the matching word (Warning setting to 0 returns entire field) -->
                    <property name="maxSize" value="250"/>
                    <!-- Max number of snippets (matching words) to show -->
                    <property name="snippets" value="2"/>
                </bean>
                <bean class="org.dspace.discovery.configuration.DiscoveryHitHighlightFieldConfiguration">
                    <!-- Displays snippets from indexed full text of document (for supported formats) -->
                    <property name="field" value="fulltext"/>
                    <!-- Max number of characters to display around the matching word (Warning setting to 0 returns entire field) -->
                    <property name="maxSize" value="250"/>
                    <!-- Max number of snippets (matching words) to show -->
                    <property name="snippets" value="2"/>
                </bean>
            </list>
        </property>
    </bean>
</property> |
The property name & the bean class are mandatory. The property field names are:
- field: the name of the metadata field to highlight (use * if all the metadata fields should be highlighted).
...
Code Block | ||
---|---|---|
| ||
<bean id="tagCloudConfiguration" class="org.dspace.discovery.configuration.TagCloudConfiguration">
    <!-- Should display the score of each tag next to it? Default: false -->
    <property name="displayScore" value="true"/>
    <!-- Should display the tag as center aligned in the page or left aligned? Possible values: true | false. Default: true -->
    <property name="shouldCenter" value="true"/>
    <!-- How many tags will be shown. Value -1 means all of them. Default: -1 -->
    <property name="totalTags" value="-1"/>
    <!-- The letter case of the tags.
         Possible values: Case.LOWER | Case.UPPER | Case.CAPITALIZATION | Case.PRESERVE_CASE | Case.CASE_SENSITIVE
         Default: Case.PRESERVE_CASE -->
    <property name="cloudCase" value="Case.PRESERVE_CASE"/>
    <!-- Whether the 3 CSS classes of the tag cloud should be independent of score (random=yes) or based on the score.
         Possible values: true | false. Default: true -->
    <property name="randomColors" value="true"/>
    <!-- The font size (in em) for the tag with the lowest score. Possible values: any decimal. Default: 1.1 -->
    <property name="fontFrom" value="1.1"/>
    <!-- The font size (in em) for the tag with the highest score. Possible values: any decimal. Default: 3.2 -->
    <property name="fontTo" value="3.2"/>
    <!-- Tags with a score lower than this value will not appear in the tag cloud.
         Possible values: any integer from 1 to infinity. Default: 0 -->
    <property name="cuttingLevel" value="0"/>
    <!-- The distance (in px) between the tags. Default: 5 -->
    <property name="marginRight" value="5"/>
    <!-- The ordering of the tags (based either on the name or the score of the tag).
         Possible values: Tag.NameComparatorAsc | Tag.NameComparatorDesc | Tag.ScoreComparatorAsc | Tag.ScoreComparatorDesc
         Default: Tag.NameComparatorAsc -->
    <property name="ordering" value="Tag.NameComparatorAsc"/>
</bean> |
When the tag cloud is rendered, you can change its appearance through the following CSS classes.
Class | Note |
---|---|
tagcloud | General class for the whole tagcloud |
tagcloud_1 | Specific tag class for tag of type 1 (based on score) |
tagcloud_2 | Specific tag class for tag of type 2 (based on score) |
tagcloud_3 | Specific tag class for tag of type 3 (based on score) |
Edit [dspace]/config/spring/api/discovery.xml to remove the following line from the defaultConfiguration and homepageConfiguration beans (in the sidebarFacets property):
Code Block | ||
---|---|---|
| ||
<ref bean="searchFilterContentInOriginalBundle"/> |
Then restart your servlet container.
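Before editing, it can help to list every place where the facet is referenced, since the line appears in more than one bean. A minimal sketch, assuming DSpace is installed under /dspace (override with DSPACE_DIR); the function name find_bean_refs is hypothetical.

```shell
#!/bin/sh
# Assumption: DSpace install directory; adjust to your own [dspace] location.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"
DISCOVERY_XML="$DSPACE_DIR/config/spring/api/discovery.xml"

# Print "line-number:line" for every <ref bean="NAME"/> found in a file.
# Usage: find_bean_refs <bean-name> <file>
find_bean_refs() {
    grep -nF -- "<ref bean=\"$1\"/>" "$2"
}

# Only search when the configuration file actually exists at the assumed path.
if [ -f "$DISCOVERY_XML" ]; then
    find_bean_refs searchFilterContentInOriginalBundle "$DISCOVERY_XML"
fi
```

Each reported line number is a candidate for removal; check that it sits inside the sidebarFacets property of the bean you intend to change.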
...
Command used: |
|
Java class: | org.dspace.discovery.IndexClient |
Arguments (short and long forms): | Description |
| called without any options, will update/clean an existing index |
| (re)build index, wiping out current one if it exists |
| clean existing index removing any documents that no longer exist in the db |
| if updating existing index, force each handle to be reindexed even if up to date |
| print this help message |
-i <object handle> | Reindex an individual object (and any child objects). When run on an Item, it just reindexes that single Item. When run on a Collection, it reindexes the Collection itself and all Items in that Collection. When run on a Community, it reindexes the Community itself and all sub-Communities, contained Collections and contained Items. |
| optimize search core |
| remove an Item, Collection or Community from index based on its handle |
-s | Rebuild the spellchecker, can be combined with -b and -f. |
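Reindexing a single object with -i can be sketched as below. This is a dry-run helper: it prints the command line instead of running it. The install directory /dspace, the launcher command name index-discovery, and the handle 123456789/42 are assumptions for illustration, since the table above does not show the command name.

```shell
#!/bin/sh
# Assumptions (not taken from the table above): install directory and the
# launcher command name for org.dspace.discovery.IndexClient.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"
INDEX_COMMAND="${INDEX_COMMAND:-index-discovery}"

# Print the command line that reindexes one object (and any child objects)
# identified by its handle, using the -i option.
reindex_object_cmd() {
    printf '%s/bin/dspace %s -i %s\n' "$DSPACE_DIR" "$INDEX_COMMAND" "$1"
}

# Example with a hypothetical handle.
reindex_object_cmd 123456789/42
```

Printing the command first makes it easy to review what would run; once the values match your installation, the printed line can be executed directly in a shell.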
It is recommended to run maintenance on the Discovery Solr index occasionally (from crontab or your system's scheduler), to prevent your servlet container from running out of memory:
...
Discovery currently has its own messages.xml file, located at dspace-xmlui/src/main/resources/aspects/Discovery/i18n/messages.xml. To add your own labels for new fields and facets in a Maven overlay, copy this file to dspace/modules/xmlui/src/main/resources/aspects/Discovery/i18n/messages.xml and modify the copy. Alternatively, you may add them to the main messages.xml file. The same goes for translations: you are encouraged to submit a single messages_XX.xml file including messages from all the separate messages.xml files in DSpace.
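The advanced search related keys follow a fixed naming pattern per field, with "author" swapped for the desired field name. A minimal sketch that prints the three key names for a given field; the function name discovery_i18n_keys and the example field "subject" are hypothetical.

```shell
#!/bin/sh
# Print the three message keys used for a Discovery search field:
# filter name, facet heading, and "Filter by" page heading.
# Usage: discovery_i18n_keys <field>
discovery_i18n_keys() {
    printf '%s\n' \
        "xmlui.ArtifactBrowser.SimpleSearch.filter.$1" \
        "xmlui.ArtifactBrowser.AdvancedSearch.type_$1" \
        "xmlui.Discovery.AbstractSearch.type_$1"
}

# Example: keys to define when adding labels for a "subject" field.
discovery_i18n_keys subject
```

Each printed key still needs its own label entry in the messages.xml file you maintain.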
Advanced search related keys (change "author" to desired field)
Filter name | xmlui.ArtifactBrowser.SimpleSearch.filter.author |
Facet heading | xmlui.ArtifactBrowser.AdvancedSearch.type_author |
"Filter by" page heading | xmlui.Discovery.AbstractSearch.type_author |