The Resource Index Search Service (RISearch) is a web service that exposes the contents of a repository's Resource Index for outside use. This document introduces the service through its web browser interface, then describes how to access it programmatically.
When your Fedora server is running, the RISearch service will be available under /fedora/risearch. For example:
The user interface consists of three tabs: Find Tuples, Find Triples, and Show Aliases. A detailed description of each of these tabs follows.
The "Find Tuples" tab shown below is used to run tuple queries against the resource index. A tuple query is one that returns a list of named values.
When you enter a query and click "Launch", a new browser window will display the results.
To get an idea of how it works, try the following iTQL query, which asks for information about all Service Definition objects in the repository:
In response, you should see something like this:
This is a list of comma-separated values, each row representing the URI and modified date of the objects that matched the query.
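A CSV response like this is easy to process in a script. The following is a minimal Python sketch, assuming the response starts with a header row naming the query's variables. The column names and values in `SAMPLE_CSV` are invented for illustration and do not come from a real repository.

```python
import csv
import io

# Illustrative response body only: the column names and values are made up
# for this sketch and are not taken from a real repository.
SAMPLE_CSV = (
    '"object","modified"\n'
    '"info:fedora/demo:1","2008-07-02T05:09:42.015Z"\n'
    '"info:fedora/demo:2","2008-07-02T05:09:43.234Z"\n'
)


def parse_tuples_csv(text):
    """Return one dict per result row, keyed by the names in the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_tuples_csv(SAMPLE_CSV)
for row in rows:
    print(row["object"], row["modified"])
```

Keying each row by variable name keeps the script independent of column order.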
Above the query text box, you can alter several settings for a query. These settings are described below.
The language setting indicates the query language to use. Currently, the options are SPARQL and iTQL (a full-featured RDF query language supported by Mulgara).
The format setting indicates the desired response format. Valid options include:
- CSV – Comma-separated values
- Simple – A simple easy-to-read text format that shows datatype information, when present
- Sparql – The W3C standard SPARQL query results format
- TSV – Tab-separated values
- Count – A count of the items returned by the query
The limit setting is the maximum number of results to return. It is useful to set this low when testing queries.
- Force Distinct – Whether to force duplicate results to be dropped. Note: iTQL never returns duplicates.
- Fake Media-Types – Whether to send incorrect Content-Type HTTP response headers with the responses (to trick browsers into displaying the results instead of popping up a "Save As/Open With" window).
- Stream Immediately – Whether to stream the results right away (faster), or to save them to a temporary file before sending them to the client. The default behavior (to save the results before streaming) will give a more informative error message if a query fails.
The "Find Triples" tab shown below is used to run triple queries against the resource index. A triple query is one that returns a list of RDF statements (aka triples).
This tab works in much the same way as the "Find Tuples" tab, but supports different response formats and provides a means to convert tuple query results to triples. It also exposes another query language: SPO.
This is a very simple RDF query language, where queries consist of a specific subject (or an asterisk, indicating "any"), a specific predicate (or an asterisk), and a specific object (or an asterisk). The easiest way to learn SPO is by example:
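As a rough illustration of that three-position structure, the Python sketch below assembles an SPO query string from its parts. The helper name `spo_query` is hypothetical. The sketch assumes the three positions are separated by spaces and that terms are written in angle brackets, as in the alias form `<fedora-model:state>` used elsewhere in this document.

```python
def spo_query(subject="*", predicate="*", obj="*"):
    """Join the three positions of an SPO query with single spaces.

    Each argument is either "*" (meaning "any") or an already-formatted
    term such as "<fedora-model:state>". No validation is performed.
    """
    return " ".join((subject, predicate, obj))


# Any subject and any object, but a specific predicate.
state_query = spo_query(predicate="<fedora-model:state>")
print(state_query)  # prints: * <fedora-model:state> *
```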
A variety of RDF formats are supported:
- N-Triples – A subset of Notation 3 defined in the RDF Test Cases document
- Notation 3 – The original RDF text format, defined by Tim Berners-Lee in An RDF language for the Semantic Web
- RDF/XML – The "RDF/XML" format, defined in the RDF/XML Syntax Specification
- Turtle – A newer subset of Notation 3, defined in Dave Beckett's Turtle - Terse RDF Triple Language
- count – A count of the items returned by the query
Templates are used to convert tuple query results to triples. A template consists of one or more triple binding patterns that reference the binding variables in an iTQL query.
The easiest way to understand how this works is by example.
In this example, we'll show how to extract a subgraph from the resource index using iTQL. Enter the following query text:
This query by itself returns all relationships between data objects in a repository. The binding variables are $a, $r, and $b. Now enter the following in the template text box:
When you launch the query, you'll see a list of triples: the subgraph of all object-to-object relationships in the repository. (If you don't see anything, ingest the demo objects, which include some sample relationships.)
Now try the following template instead. This demonstrates how to derive new statements from those in the resource index:
Running the query will now show two statements for every object-to-object relationship in the resource index graph.
Note: When using templates to transform tuples to triples, some duplicates may be returned. These can be avoided by checking "Force Distinct".
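To make the template mechanism concrete, here is a minimal Python sketch that models it locally. It is not the service's implementation. The function `apply_template`, the sample bindings, and the `urn:example:` predicates are all hypothetical. Each template pattern is applied to every tuple row, with `$`-prefixed terms replaced by the row's binding. The optional `distinct` flag mimics what "Force Distinct" is described as doing.

```python
def apply_template(rows, template, distinct=False):
    """Expand each tuple row into triples using the template patterns.

    rows     -- list of dicts mapping variable names (without "$") to values
    template -- list of (subject, predicate, object) patterns; a term that
                starts with "$" is replaced by that variable's value
    distinct -- if True, drop repeated triples, keeping first-seen order
    """
    def resolve(term, row):
        return row[term[1:]] if term.startswith("$") else term

    triples = []
    seen = set()
    for row in rows:
        for pattern in template:
            triple = tuple(resolve(term, row) for term in pattern)
            if distinct and triple in seen:
                continue
            seen.add(triple)
            triples.append(triple)
    return triples


# Hypothetical bindings for $a, $r, and $b (two objects related to the same target).
rows = [
    {"a": "<info:fedora/demo:1>", "r": "<urn:example:isPartOf>", "b": "<info:fedora/demo:2>"},
    {"a": "<info:fedora/demo:3>", "r": "<urn:example:isPartOf>", "b": "<info:fedora/demo:2>"},
]

# Two patterns: the original statement, plus a derived statement about $b.
template = [
    ("$a", "$r", "$b"),
    ("$b", "<urn:example:isTargetOf>", "$r"),
]

all_triples = apply_template(rows, template)
unique_triples = apply_template(rows, template, distinct=True)
print(len(all_triples), len(unique_triples))  # prints: 4 3
```

Both rows share the same `$b` and `$r`, so the derived pattern produces the same triple twice. That is the kind of duplicate the note above refers to.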
This tab shows the aliases that can be used in queries and what URI prefixes they map to.
Aliases are just shortcuts that help make queries easier to write. For example, in a query you can write
<fedora-model:state> instead of
The RISearch service can be programmatically accessed via HTTP GET or POST. To avoid character encoding issues, POST should always be used when the query is passed in by value and contains non-ASCII characters.
As with the user interface, it can be invoked to retrieve tuples or triples. The syntax is described below.
- Square brackets ( "[" and "]" ) indicate that the parameter is optional.
- As with all HTTP parameters, unsafe URI characters should be URI-escaped. For readability purposes, URI escaping is not shown below.
- The query and template parameters optionally take the value by reference – that is, a URL to a query or template can be given instead of the actual text.
- The flush parameter tells the resource index to ensure that any recently added, modified, or deleted triples are flushed to the triplestore before executing the query. This option can be desirable in certain scenarios but, for performance reasons, should be used sparingly when a process is making many API-M calls to Fedora in a short period of time. We have found that Mulgara generally achieves a much better overall update rate with large batches of triples.
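The notes above can be sketched as a small Python helper that prepares a POST request, with the query passed by value and URI-escaped as UTF-8. Only the `query`, `template`, and `flush` parameter names appear in the text above. The names `type`, `lang`, `format`, and `limit`, their values, and the host and port are assumptions based on the user interface settings, so check them against your server. The helper builds the request without sending it.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_risearch_request(base_url, query, result_type="tuples", lang="itql",
                           fmt="CSV", limit=None, flush=False, template=None):
    """Build (but do not send) a POST request for the RISearch service.

    Parameter names other than query, template, and flush are assumptions.
    """
    params = {"type": result_type, "lang": lang, "format": fmt, "query": query}
    if limit is not None:
        params["limit"] = str(limit)
    if flush:
        params["flush"] = "true"  # assumed value; use sparingly (see note above)
    if template is not None:
        params["template"] = template
    # urlencode percent-escapes unsafe characters, encoding text as UTF-8.
    body = urlencode(params).encode("ascii")
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


# Placeholder query text containing a non-ASCII character; not a real query.
query_text = "PLACEHOLDER QUERY TEXT \u00e9"
request = build_risearch_request(
    "http://localhost:8080/fedora/risearch",  # assumed host and port
    query_text,
    limit=10,
)
fields = parse_qs(request.data.decode("ascii"))
print(request.get_method(), fields["query"][0] == query_text)
```

To send the request, pass it to `urllib.request.urlopen` and read the response body. Using POST here follows the advice above for queries passed by value that contain non-ASCII characters.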