Overview

To support the differing needs for sophisticated, rich searching, Fedora 4 comes with a standard mechanism and integration point for indexing content in an external service. This could be a general search service such as Apache Solr.

Define Indexing Namespace and Mixin in CND of fcrepo4

External indexing requires that each object you wish to have indexed carries the indexing:indexable mixin type. The mixin can be defined using the REST interface or via fedora-node-types.cnd.

Definition using the REST interface

Go to http://localhost:8080/rest/fcr:namespaces and in the Register Namespace form add:

     Prefix: indexing

     Namespace: http://fedora.info/definitions/v4/indexing#

Go to http://localhost:8080/rest/fcr:nodetypes and in the Update CND form add:  

[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder

Definition via fedora-node-types.cnd

Make sure your node definitions contain the following:

fcrepo-kernel/src/main/resources/fedora-node-types.cnd:
<indexing = 'http://fedora.info/definitions/v4/indexing#'>

[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder
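
As a quick sanity check, the declarations above can be verified with a small script. This is only a sketch: the function name `check_cnd` and the regular expressions are illustrative assumptions rather than part of fcrepo4, and the script does not parse CND syntax in full.

```python
import re

INDEXING_NS = "http://fedora.info/definitions/v4/indexing#"


def check_cnd(cnd_text):
    """Return the names of required indexing declarations missing from cnd_text."""
    missing = []
    # Namespace declaration, e.g. <indexing = 'http://...#'>
    ns_pattern = r"<\s*indexing\s*=\s*['\"]" + re.escape(INDEXING_NS) + r"['\"]\s*>"
    if not re.search(ns_pattern, cnd_text):
        missing.append("namespace")
    # Mixin declaration, e.g. [indexing:indexable] mixin
    if not re.search(r"\[indexing:indexable\]\s+mixin\b", cnd_text):
        missing.append("mixin")
    # Property declaration belonging to the mixin
    if not re.search(r"^-\s*indexing:hasIndexingTransformation\b", cnd_text, re.MULTILINE):
        missing.append("property")
    return missing


# The definition shown above, used here as sample input.
SAMPLE = """<indexing = 'http://fedora.info/definitions/v4/indexing#'>

[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder
"""

if __name__ == "__main__":
    print(check_cnd(SAMPLE))
```

To check a real file, read its contents and pass the text to `check_cnd`; an empty list means all three declarations were found.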

The standard configuration chain is as follows:

...

or a standalone triplestore such as Sesame or Fuseki.

 

Table of Contents

Install and configure

...

Configure fcrepo4 messaging

fcrepo-webapp/src/main/resources/spring/jms.xml contains the bean definitions used by the repository for messaging. Currently, the DefaultMessageFactory is used to implement messaging:

jms.xml:
<bean class="org.fcrepo.jms.headers.DefaultMessageFactory"/>

...

standalone search applications

fcrepo-jms-indexer-pluggable currently supports the following triplestores:

...

Install and configure fcrepo-jms-indexer-pluggable

The fcrepo-jms-indexer-pluggable project includes software for a web service that sits between your Fedora 4 repository and an external search service. As its name implies, it is a generic framework that can be extended to integrate unanticipated or proprietary search services with the Fedora 4 repository. There are proof-of-concept implementations for Jena Fuseki, Sesame and Apache Solr.

The following GitHub page has detailed instructions for setting up fcrepo-jms-indexer-pluggable. This standalone application listens to messages produced by fcrepo4 and invokes the search applications as configured:

https://github.com/futures/fcrepo-jms-indexer-pluggable

Load an LDPATH program

The following is an example of loading an LDPATH program called "custom".

...

Mark a node as indexable and assign an appropriate indexing transformation

For a node to be indexed it must:

  1. have the rdf type "http://fedora.info/definitions/v4/indexing#indexable"
  2. have the property "http://fedora.info/definitions/v4/indexing#hasIndexingTransformation" set to a registered index transformation

Tip: Indexing Transformations

A default indexing transformation exists that maps the appropriate properties to the field names "title", "uuid" and "id".  To meet your needs, you can write and register custom indexing transformations.

 

Create new

Note that for Solr indexing, the field names (such as id, title, and uuid) must match fields that are defined in the Solr schema.xml (see the Solr documentation: https://cwiki.apache.org/confluence/display/solr/Solr+Field+Types). One recommended schema.xml is provided by hydra-jetty (https://github.com/projecthydra/hydra-jetty/blob/master/solr/development-core/conf/schema.xml), which has a robust set of default dynamic fields.
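
The matching rule can be illustrated with a short script that checks a list of field names against a schema.xml. This is a sketch under stated assumptions: the sample schema below is a hypothetical, trimmed-down fragment (not the hydra-jetty schema), the helper name `unmatched_fields` is made up for illustration, and dynamic field patterns are assumed to use a simple leading or trailing `*` wildcard.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema.xml used only for illustration.
SAMPLE_SCHEMA = """<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text_general" indexed="true" stored="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
</schema>"""


def unmatched_fields(schema_xml, field_names):
    """Return the field names that match no <field> or <dynamicField> in the schema."""
    root = ET.fromstring(schema_xml)
    explicit = {f.get("name") for f in root.iter("field")}
    patterns = [f.get("name") for f in root.iter("dynamicField")]
    result = []
    for name in field_names:
        if name in explicit:
            continue  # declared as an explicit field
        if any(fnmatch.fnmatchcase(name, p) for p in patterns):
            continue  # covered by a dynamic field pattern such as *_s
        result.append(name)
    return result


if __name__ == "__main__":
    # "uuid" is neither declared nor covered by a dynamic field in the sample.
    print(unmatched_fields(SAMPLE_SCHEMA, ["id", "title", "uuid"]))
```

With the sample schema, "uuid" is reported as unmatched, whereas a name such as "uuid_s" would be accepted through the `*_s` dynamic field. This is why a schema with a broad set of dynamic fields is convenient for custom transformations.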

...

objects with indexing properties

For an object to be indexed, it must have an rdf:type of indexing:indexable, and optionally an indexing:hasIndexingTransformation corresponding to an LDPATH program.
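
One way to set these properties is a SPARQL Update sent to the object's URL. The sketch below only builds the update text and an unsent request. It assumes that the repository accepts HTTP PATCH with the content type application/sparql-update, and the object URL and helper names (`build_indexable_update`, `build_request`) are hypothetical.

```python
import urllib.request

INDEXING_NS = "http://fedora.info/definitions/v4/indexing#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def build_indexable_update(transformation=None):
    """Build a SPARQL Update that marks the target resource (<>) as indexable."""
    lines = [
        "PREFIX rdf: <%s>" % RDF_NS,
        "PREFIX indexing: <%s>" % INDEXING_NS,
        "INSERT {",
        "  <> rdf:type indexing:indexable .",
    ]
    if transformation is not None:
        # Escape backslashes and quotes so the name is a valid string literal.
        escaped = transformation.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('  <> indexing:hasIndexingTransformation "%s" .' % escaped)
    lines += ["}", "WHERE { }"]
    return "\n".join(lines)


def build_request(object_url, transformation=None):
    """Prepare (but do not send) a PATCH request carrying the update.

    Assumption: the repository accepts application/sparql-update via PATCH.
    """
    body = build_indexable_update(transformation).encode("utf-8")
    return urllib.request.Request(
        object_url,
        data=body,
        headers={"Content-Type": "application/sparql-update"},
        method="PATCH",
    )


if __name__ == "__main__":
    # Hypothetical object URL; "custom" is the LDPATH program name used above.
    request = build_request("http://localhost:8080/rest/objects/example", "custom")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it. Omitting the transformation argument produces an update that adds only the rdf:type.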

...