Learning Outcomes

  • Understand the purpose of a repository
  • Learn what Fedora can do for you
  • Understand the key capabilities of the software

Course Outline

Introduction to Fedora 4

What is a Repository?

  • Secure software that stores, preserves, and provides access to digital materials
  • Supports complex semantic relationships between objects both within and outside the repository
  • Supports millions of objects, both large and small
  • Capable of interoperating with other applications and services

Fedora 4 Guiding Principles

  • Improved performance, enhanced vertical and horizontal scalability
  • More flexible storage options
  • Features to accommodate research data management
  • Better capabilities for participating in the world of linked open data
  • An improved platform for developers that is easier to work with and will attract a larger core of contributors

Exposing and Connecting Content with Fedora 4

  • Flexible, extensible object modelling
  • Atomic objects with semantic connections using standard ontologies
  • RDF-based metadata using Linked Data
  • RESTful API with native RDF response format

Core Components

Durable Storage

One of the core components of Fedora 4 is its long-term storage and preservation capability. A number of features support this capability; they have been grouped here under the notion of Durable Storage.

Fixity

  • Over time, digital objects can become corrupt and unusable through bit rot and other preservation hazards
  • Fixity checks help preserve digital objects by verifying their integrity using techniques such as checksumming
  • On content ingest, Fedora can verify a user-provided checksum against the calculated value
  • A checksum can be recalculated and compared at any time via a REST API request
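
The ingest-time check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Fedora's implementation: the function names sha1_of and verify_fixity are hypothetical, and SHA-1 is used only as an example digest algorithm.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def verify_fixity(data: bytes, expected_checksum: str) -> bool:
    """Compare a user-provided checksum with the freshly calculated one."""
    return sha1_of(data) == expected_checksum.lower()


content = b"example datastream content"
provided = sha1_of(content)  # stands in for the checksum supplied at ingest

# The stored bytes still match the checksum recorded at ingest.
assert verify_fixity(content, provided)

# A single changed byte (simulated bit rot) makes the check fail.
assert not verify_fixity(b"exbmple datastream content", provided)
```

Re-running the same comparison later against the stored bytes is the essence of a periodic fixity check.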

Backup and Restore

  • A full backup, including all Datastreams as well as a compact serialization of all objects, can be performed at any time
  • A full restore from a repository backup can be performed at any time

Export and Import

  • A specific Fedora object, its child objects, and associated Datastreams can be exported
    • The serialization of the Fedora object is more portable than the compact form found in the backup/restore feature
    • Exported objects are serialized in a standard JCR/XML format
  • An exported object or hierarchy of objects can be imported at any time

Versioning

Versions can be created across the entire repository or on particular API calls.

Policy-Driven Storage

Policies can route different types of files to different back-end storage locations on ingest.
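
To make the routing idea concrete, here is a minimal sketch that picks a storage location from a file's MIME type. Everything in it is an assumption made for illustration: the StoragePolicy class, the route function, and the store names are hypothetical and do not reflect Fedora's actual policy configuration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical policy: MIME types starting with mime_prefix go to store."""
    mime_prefix: str
    store: str


def route(mime_type: str,
          policies: Sequence[StoragePolicy],
          default: str = "default-store") -> str:
    """Return the store of the first matching policy, or the default store."""
    for policy in policies:
        if mime_type.startswith(policy.mime_prefix):
            return policy.store
    return default


# More specific policies are listed first because the first match wins.
policies = [
    StoragePolicy("image/tiff", "tape-archive"),
    StoragePolicy("image/", "fast-disk"),
]

assert route("image/tiff", policies) == "tape-archive"
assert route("image/png", policies) == "fast-disk"
assert route("text/plain", policies) == "default-store"
```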

Data Modelling

Content Models

Content can be modelled using Compact Node Definitions (CNDs).

A CND can define a number of properties and be assigned to objects as a mixin.

Any number of mixins can be assigned to an object; they are cumulative.
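
The cumulative behaviour of mixins can be pictured as a set union of the properties each mixin defines. The sketch below is only a conceptual model: it does not use CND syntax, and the mixin and property names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical mixin definitions: mixin name -> properties it contributes.
MIXIN_DEFINITIONS: Dict[str, Set[str]] = {
    "example:describable": {"title", "description"},
    "example:image": {"width", "height"},
    "example:rights": {"license"},
}


def effective_properties(mixins: Iterable[str]) -> Set[str]:
    """Union of the properties contributed by every mixin on an object."""
    properties: Set[str] = set()
    for name in mixins:
        properties |= MIXIN_DEFINITIONS[name]
    return properties


# Adding a second mixin extends the available properties; nothing is lost.
assert effective_properties(["example:describable"]) == {"title", "description"}
assert effective_properties(["example:describable", "example:image"]) == {
    "title", "description", "width", "height",
}
```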

Linked Data

Compliance with the W3C Linked Data Platform (LDP) 1.0 specification.

Node properties are RDF triples.

Metadata can be represented as RDF triples.

Many possibilities for exposing, importing, and sharing resources with other web applications.
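
As a rough illustration of what "properties are RDF triples" means, the sketch below writes two statements about a resource in N-Triples form. The helper names and the example resource URIs are assumptions made for this example; only the Dublin Core predicate URIs are real vocabulary terms.

```python
def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples requires inside a string literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def literal_triple(subject: str, predicate: str, value: str) -> str:
    """Serialize a triple whose object is a plain string literal."""
    return f'<{subject}> <{predicate}> "{escape_literal(value)}" .'


def resource_triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize a triple whose object is another resource (a URI)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical resource URIs; the predicates are Dublin Core terms.
book = "http://example.org/rest/book1"
print(literal_triple(book, "http://purl.org/dc/elements/1.1/title", "Field Notes"))
print(resource_triple(book, "http://purl.org/dc/terms/isPartOf",
                      "http://example.org/rest/collection1"))
```

The second statement links one resource to another by URI, which is the kind of semantic connection that lets other web applications follow and reuse the data.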

User Interface

Administrative Console

Tour of the HTML administrative interface.

Internal Search

Simple property search and limited SPARQL endpoint.

External Components

Indexing

Triplestore

As with external search (described below), an external triplestore can be plugged into Fedora 4. The same kind of JMS message consumer relays repository events to the triplestore (e.g. Fuseki, Sesame).

External Search

Repository events generate JMS messages, which can be consumed by a JMS message consumer and relayed to an external search index (e.g. Solr).
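
The relay pattern can be sketched without any messaging infrastructure. In the example below a standard-library queue stands in for the JMS topic and a dictionary stands in for the external index; the event fields ("type", "id", "properties") and the event type names are hypothetical and are not Fedora's actual message format.

```python
import queue
from typing import Dict


def relay_events(events: "queue.Queue", index: Dict[str, dict]) -> int:
    """Drain pending events and apply each one to the index.

    Returns the number of events handled.
    """
    handled = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return handled
        if event["type"] == "removed":
            # Removal events delete the entry; a missing entry is ignored.
            index.pop(event["id"], None)
        else:
            # Creation and modification events both (re)index the resource.
            index[event["id"]] = event["properties"]
        handled += 1


events: "queue.Queue" = queue.Queue()
events.put({"type": "created", "id": "/obj1", "properties": {"title": "First"}})
events.put({"type": "modified", "id": "/obj1", "properties": {"title": "Renamed"}})
events.put({"type": "created", "id": "/obj2", "properties": {"title": "Second"}})
events.put({"type": "removed", "id": "/obj2"})

search_index: Dict[str, dict] = {}
assert relay_events(events, search_index) == 4
assert search_index == {"/obj1": {"title": "Renamed"}}
```

Because the consumer only reacts to events, the repository itself needs no knowledge of which index is listening, which is why the same approach serves both a search index and a triplestore.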

Authorization

Pluggable authorization framework.

Basic Authorization

Role-based authorization.

XACML Authorization

XACML enforcement implementation.

Performance

Transactions

Multiple actions can be bundled together into a single repository event (transaction).

Transactions offer performance benefits by cutting down on the number of times data is written to the repository filesystem (which tends to be the slowest action).
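
The performance argument can be illustrated with a small in-memory model. The classes below are hypothetical and are not Fedora's transaction API: CountingStore stands in for the repository filesystem and counts write calls, while Transaction buffers changes and flushes them in a single write on commit.

```python
from typing import Dict


class CountingStore:
    """In-memory stand-in for the repository filesystem that counts writes."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.write_calls = 0

    def write(self, changes: Dict[str, bytes]) -> None:
        self.data.update(changes)
        self.write_calls += 1


class Transaction:
    """Buffers changes in memory and writes them together on commit."""

    def __init__(self, store: CountingStore) -> None:
        self._store = store
        self._pending: Dict[str, bytes] = {}

    def put(self, path: str, content: bytes) -> None:
        self._pending[path] = content

    def commit(self) -> None:
        if self._pending:
            self._store.write(self._pending)
            self._pending = {}

    def rollback(self) -> None:
        # Discard buffered changes without touching the store.
        self._pending = {}


store = CountingStore()
tx = Transaction(store)
tx.put("/obj1", b"a")
tx.put("/obj2", b"b")
tx.put("/obj3", b"c")
assert store.write_calls == 0  # nothing has been written yet

tx.commit()
assert store.write_calls == 1  # three changes, one write
assert len(store.data) == 3
```

Without the transaction, the same three changes would have cost three separate writes in this model.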

Clustering
