...

However, there has been interest in providing at least some level of built-in search functionality to address basic discovery scenarios. To guide planning and development, please provide concrete use cases your repository applications have for search that are not well-served by external search options.

List Repository URIs Based on Last Modification Time

Kevin Ford - 6 Feb 2017

There should be a way to request a list of repository resources based on last modification time. The use case is as follows:

...

[2] https://wiki.duraspace.org/display/FEDORA4x/Setup+Camel+Message+Integrations

List Repository URIs Based on Path

Kevin Ford - 6 Feb 2017

There should be a way to request a list of resource URIs under a specific repository path, whether it is the root path or a sub-path. Use case follows:

One overcast day in Chicago, we wished to check how each and every resource under a specific container path responded to specific HTTP methods, one possible response being a 404. (As above, the weather had no bearing on our action. In fact, we can’t even be certain it was overcast, but it was definitely winter and odds are, therefore, it was overcast. But I digress.)

Starting from anything other than a set of Fedora-reported URIs (relying on the data copied to a triplestore, for example) would have been a problem because we needed to know whether Fedora had knowledge of each URI before we tested it. We therefore had to start with Fedora.

Because this was a custom, one-off exercise, we quickly wrote a Python script for this purpose. Naturally, it required logic to ‘crawl’ the Fedora repository starting at a specific container path. This wasn’t difficult, but it took some care on three fronts: ensuring efficient memory usage for what we knew would be a healthy list of URIs, fixing a surprise recursion error (easily corrected, but still), and using an RDF library to parse the RDF from Fedora (admittedly, we could have achieved the same end by operating solely on a JSON or XML serialization with ordinary JSON or XML parsing methods).
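For illustration only, here is a minimal sketch of such a crawl. It is not the script described above. It assumes that Fedora will return JSON-LD in expanded form when asked for application/ld+json, that child resources are listed under the ldp:contains predicate, and that anything answering with a different content type (a binary, say) has no children to follow. The function names are hypothetical. An explicit stack stands in for recursion, and a generator hands back one URI at a time so the full list never has to sit in memory.

```python
import json
from urllib.request import Request, urlopen

# Predicate assumed to link a container to its children.
LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains"


def fetch_children(uri):
    """Return child URIs reported for `uri` (assumes expanded JSON-LD)."""
    req = Request(uri, headers={"Accept": "application/ld+json"})
    with urlopen(req) as resp:
        # Assumption: non-JSON-LD responses (e.g. binaries) have no children.
        if resp.headers.get_content_type() != "application/ld+json":
            return []
        nodes = json.load(resp)
    if isinstance(nodes, dict):
        nodes = nodes.get("@graph", [nodes])
    children = []
    for node in nodes:
        for value in node.get(LDP_CONTAINS, []):
            if isinstance(value, dict) and "@id" in value:
                children.append(value["@id"])
    return children


def crawl(root, get_children=fetch_children):
    """Yield `root` and every URI below it, without recursion."""
    stack = [root]
    while stack:
        uri = stack.pop()
        yield uri
        # Only URIs still waiting to be visited are held in memory.
        stack.extend(get_children(uri))
```

Looping over crawl("http://localhost:8080/rest/some/path") (a made-up address) would then produce each URI in turn, ready to be tested against whatever HTTP methods are of interest. The get_children parameter exists so the traversal can be exercised against an in-memory tree without a running repository.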

This strategy – crawling a repository starting from a specific path – has been implemented in Java at least twice [1, 2], and probably more often than that. It is a basic enough need that it is reasonable to think someone has also written Ruby or PHP code to perform the same action.

A list of URIs under a specific path is needed frequently enough for operational or administrative purposes that multiple developers are (re)creating code for it in a variety of languages. That alone is a good indication of the value of such a feature. Implementing it directly in Fedora would also save downstream developers time.

[1] https://github.com/awoods/fcrepo-java-client-etc

[2] https://github.com/fcrepo4-exts/fcrepo-camel-toolbox/tree/master/fcrepo-reindexing
