Modeshape's federation overview provides more background on how federation works and the underlying concepts.
Note: The term "projection" is sometimes used as a synonym for the "federation" feature.
Filesystem Federation
Filesystem federation maps a node in the repository to a directory on disk. This allows files on disk to be served and updated by Fedora 4 as though they were in the repository. Filesystem federation avoids having to transfer files over HTTP, which can improve performance significantly when files are large or when many files are being processed. If you are ingesting a large number of multi-gigabyte files, we recommend you consider filesystem federation.
Another use for filesystem federation is interoperability with another system. If you have files on disk managed by another application or workflow, you can use filesystem federation to serve them with Fedora 4 without having to ingest them using the REST API or create another copy of the files.
Configuration
An example filesystem federation configuration to include in your Modeshape repository.json :
"externalSources" : {
  "federated-directory" : {
    "classname" : "org.fcrepo.connector.file.FedoraFileSystemConnector",
    "directoryPath" : "/path/to/files",
    "projections" : [ "default:/federated => /" ],
    "contentBasedSha1" : "false",
    "readonly" : true,
    "extraPropertiesStorage" : "none"
  }
},
directoryPath - base directory for all files shared with the repository
projections - lists one or more mappings from the repository to the filesystem. The format is "{workspace}:{repository path} => {path relative to directoryPath}". See Multiple Directories below for how to handle multiple mappings.
contentBasedSha1 - controls how internal identifiers are computed for files. By default (contentBasedSha1 = true), Modeshape computes the SHA-1 checksum of a file's content every time the file is accessed. For small files this creates a modest overhead. For large files, however, this dramatically reduces performance, since generating the checksum can take several seconds per gigabyte of data. For this reason, we recommend setting contentBasedSha1 to false when serving files larger than 100MB.
readonly - controls whether the contents of the files are read-only
extraPropertiesStorage - sets the format for storing "extra" properties (properties that can't be set using filesystem attributes). Recommended values are "json" for the current JSON properties format, or "none" for disabling extra properties.
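To decide whether the contentBasedSha1 recommendation applies to a given directory, it can help to check for files above the 100MB mark before writing the configuration. The following is a minimal sketch, not part of Fedora 4 or Modeshape: the helper names find_large_files and recommend_content_based_sha1 are hypothetical, and it assumes "100MB" means 100 * 1024 * 1024 bytes.

```python
import os

# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes; adjust if you prefer 10**8.
LARGE_FILE_THRESHOLD = 100 * 1024 * 1024


def find_large_files(directory_path, threshold=LARGE_FILE_THRESHOLD):
    """Return sorted (path, size) pairs for files under directory_path
    whose size in bytes is strictly greater than threshold."""
    large = []
    for root, _dirs, files in os.walk(directory_path):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that disappear or cannot be stat'ed during the scan.
                continue
            if size > threshold:
                large.append((path, size))
    return sorted(large)


def recommend_content_based_sha1(directory_path, threshold=LARGE_FILE_THRESHOLD):
    """Hypothetical helper: suggest a contentBasedSha1 value as a string,
    "false" if any file exceeds the threshold, otherwise "true"."""
    return "false" if find_large_files(directory_path, threshold) else "true"
```

Calling recommend_content_based_sha1("/path/to/files") returns the string to place in the contentBasedSha1 setting. It is only a suggestion based on file sizes and does not measure actual checksum cost.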
Modeshape's FileSystemConnector documentation and configuration provide additional information about configuring the filesystem connector.
Multiple Directories
If you want to map multiple directories, the first entry in the projections array should map the parent directory (i.e., the directory in directoryPath). Subsequent entries can map subdirectories to other repository paths. For example, suppose you have a directory /pub/ that contains two directories (/pub/project1/ and /pub/project2/), and you want to map the project1 and project2 directories to the top level of the repository:
"externalSources" : {
  "federated-1" : {
    "classname" : "org.fcrepo.connector.file.FedoraFileSystemConnector",
    "directoryPath" : "/pub",
    "projections" : [
      "default:/pub => /",
      "default:/project1 => /project1",
      "default:/project2 => /project2"
    ],
    "contentBasedSha1" : "false",
    "readonly" : true,
    "extraPropertiesStorage" : "none"
  }
},
This configuration would provide the following mappings:
Repository URL | Filesystem Path |
---|---|
http://localhost:8080/rest/pub/ | /pub/ |
http://localhost:8080/rest/project1/ | /pub/project1/ |
http://localhost:8080/rest/project2/ | /pub/project2/ |
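The mappings in the table can be derived mechanically from the projections format described earlier ("{workspace}:{repository path} => {path relative to directoryPath}"). The sketch below is an illustration only and is not how Modeshape itself resolves projections; parse_projection, filesystem_path and mapping_table are hypothetical helper names, and the regular expression assumes that paths contain no whitespace.

```python
import posixpath
import re

# Assumed syntax: "{workspace}:{repository path} => {path relative to directoryPath}"
# Paths containing whitespace are not handled by this simplified pattern.
PROJECTION_RE = re.compile(
    r"^\s*(?P<workspace>[^:\s]+):(?P<repo_path>\S+)\s*=>\s*(?P<external_path>\S+)\s*$"
)


def parse_projection(projection):
    """Split a projection string into (workspace, repository path, external path)."""
    match = PROJECTION_RE.match(projection)
    if match is None:
        raise ValueError("not a valid projection: %r" % projection)
    return (
        match.group("workspace"),
        match.group("repo_path"),
        match.group("external_path"),
    )


def filesystem_path(directory_path, projection):
    """Join directoryPath with the projection's external path."""
    _workspace, _repo_path, external_path = parse_projection(projection)
    # Strip the leading slash so the external path is treated as relative.
    joined = posixpath.join(directory_path, external_path.lstrip("/"))
    return posixpath.normpath(joined)


def mapping_table(directory_path, projections):
    """Return (repository path, filesystem path) pairs, one per projection."""
    return [
        (parse_projection(p)[1], filesystem_path(directory_path, p))
        for p in projections
    ]
```

For the configuration above, mapping_table("/pub", [...]) with the three projections yields the repository paths /pub, /project1 and /project2 paired with /pub, /pub/project1 and /pub/project2, which matches the table apart from trailing slashes.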
Other Connectors
In addition to the filesystem connector, Modeshape includes several other connectors, for federating content from Git, CMIS repositories, and relational databases.
Custom connectors can also be developed to support other systems. The filesystem connector is a good reference implementation, particularly for file-based resources.