
...

The way data is organized in DSpace is intended to reflect the structure of the organization using the DSpace system. Each DSpace site is divided into communities, which can be further divided into sub-communities reflecting the typical university structure of college, department, research center, or laboratory.

...

Items are further subdivided into named bundles of bitstreams. Bitstreams are, as the name suggests, streams of bits, usually ordinary computer files. Bitstreams that are somehow closely related, for example HTML files and images that compose a single HTML document, are organized into bundles.

In practice, most items tend to have these named bundles:

...

Supported

The format is recognized, and the hosting institution is confident it can make bitstreams of this format usable in the future, using whatever combination of techniques (such as migration, emulation, etc.) is appropriate given the context of need.

Known

The format is recognized, and the hosting institution will promise to preserve the bitstream as-is, and allow it to be retrieved. The hosting institution will attempt to obtain enough information to enable the format to be upgraded to the 'supported' level.

Unsupported

The format is unrecognized, but the hosting institution will undertake to preserve the bitstream as-is and allow it to be retrieved.

...

Package ingesters and package disseminators are each a type of named plugin (see Plugin Manager), so it is easy to add new packagers specific to the needs of your site. You do not have to supply both an ingester and disseminator for each format; it is perfectly acceptable to just implement one of them.

Most packager plugins call upon Crosswalk Plugins to translate the metadata between DSpace's object model and the package format.

...

Crosswalk plugins are named plugins (see Plugin Manager), so it is easy to add new crosswalks. You do not have to supply both an ingester and disseminator for each format; it is perfectly acceptable to just implement one of them.

...

However, an application session can be assigned membership in a group without being identified as an E-Person. For example, some sites use this feature to identify users of a local network so they can read restricted materials not open to the whole world. Sessions originating from the local network are given membership in the "LocalUsers" group and gain the corresponding privileges.
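The session-based group membership described here can be illustrated with a minimal sketch. It is not DSpace code: the network range, the `session_groups` function, and its parameters are all assumptions made for illustration; only the "LocalUsers" group name comes from the text.

```python
import ipaddress

# Hypothetical local network range, for illustration only.
LOCAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
LOCAL_GROUP = "LocalUsers"


def session_groups(client_ip, eperson_groups=()):
    """Return the set of groups a session belongs to.

    eperson_groups holds groups granted through an identified E-Person;
    it is empty for a session that is not tied to any E-Person.
    """
    groups = set(eperson_groups)
    address = ipaddress.ip_address(client_ip)
    # Membership is granted by where the session originates,
    # not by who the user is.
    if any(address in network for network in LOCAL_NETWORKS):
        groups.add(LOCAL_GROUP)
    return groups
```

In this sketch, an unidentified session from inside the assumed range receives "LocalUsers" on its own, while a session from elsewhere receives only the groups of its E-Person, if any.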

Administrators can also use groups as "roles" to manage the granting of privileges more efficiently.

...

  • Assigns an accession date
  • Adds a "date.available" value to the Dublin Core metadata record of the item
  • Adds an issue date if none already present
  • Adds a provenance message (including bitstream checksums)
  • Assigns a Handle persistent identifier
  • Adds the item to the target collection, and adds appropriate authorization policies
  • Adds the new item to the search and browse index
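The installation steps listed above can be sketched roughly as follows. This is an illustrative model only, not the DSpace implementation: the item is a plain dictionary, the collection and index are plain lists, the handle is passed in by the caller, and the field names other than "date.available" as well as the use of MD5 for checksums are assumptions. Authorization policies are left out of the sketch.

```python
import hashlib
from datetime import datetime, timezone


def install_item(item, collection, handle, index):
    """Apply the installation steps to a plain-dict item (hypothetical model)."""
    now = datetime.now(timezone.utc).isoformat()
    metadata = item.setdefault("metadata", {})

    metadata["date.accessioned"] = now        # accession date (assumed field name)
    metadata["date.available"] = now          # field named in the text
    metadata.setdefault("date.issued", now)   # only added if none is present

    # Provenance message including one checksum per bitstream (MD5 assumed).
    checksums = ", ".join(
        "%s: %s" % (name, hashlib.md5(data).hexdigest())
        for name, data in sorted(item.get("bitstreams", {}).items())
    )
    metadata["description.provenance"] = (
        "Made available on %s. Checksums: %s" % (now, checksums)
    )

    item["handle"] = handle     # persistent identifier supplied by the caller
    collection.append(item)     # add the item to the target collection
    index.append(handle)        # make the item findable via search and browse
    return item
```

Note that `setdefault` keeps an issue date the submitter already supplied, matching the "if none already present" condition in the list.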

Workflow Steps

A collection's workflow can have up to three steps. Each collection may have an associated e-person group for performing each step; if no group is associated with a certain step, that step is skipped. If a collection has no e-person groups associated with any step, submissions to that collection are installed straight into the main archive.
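The skipping rule can be expressed as a short sketch. The function names and the list-of-groups representation are assumptions for illustration and do not reflect DSpace's internal API.

```python
def active_steps(step_groups):
    """Return the step numbers (1 to 3) that have an e-person group assigned.

    step_groups is a list of up to three entries, one per workflow step;
    an entry of None means no group is associated with that step.
    """
    if len(step_groups) > 3:
        raise ValueError("a collection's workflow has at most three steps")
    return [number for number, group in enumerate(step_groups, start=1) if group]


def route_submission(step_groups):
    """Say where a new submission goes first."""
    steps = active_steps(step_groups)
    if not steps:
        # No groups on any step: straight into the main archive.
        return "installed in archive"
    return "workflow step %d" % steps[0]
```

For example, a collection with groups on steps 1 and 3 only would send a submission through those two steps and skip step 2.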

...

The reason for this apparently arbitrary design is that it was the simplest case that covered the needs of the early adopter communities at MIT. The functionality of the workflow system will no doubt be extended in the future.

...

Similar to handles for DSpace items, bitstreams also have 'Persistent' identifiers. They are more volatile than Handles, since if the content is moved to a different server or organization, they will no longer work (hence the quotes around 'persistent'). However, they are more easily persisted than the simple URLs based on database primary key previously used. This means that external systems can more reliably refer to specific bitstreams stored in a DSpace instance.

...

  • Web pages tend to consist of several files – one or more HTML files that contain references to each other, and stylesheets and image files that are referenced by the HTML files.
  • Web pages also link to or include content from other sites, often imperceptibly to the end-user. Thus, in a few years' time, when someone views the preserved Web site, they will probably find that many links are now broken or refer to other sites that are now out of context. In fact, it may be unclear to an end-user when they are viewing content stored in DSpace and when they are seeing content included from another site, or have navigated to a page that is not stored in DSpace. This problem can manifest when a submitter uploads some HTML content. For example, the HTML document may include an image from an external Web site, or even their local hard drive. When the submitter views the HTML in DSpace, their browser is able to use the reference in the HTML to retrieve the appropriate image, and so to the submitter, the whole HTML document appears to have been deposited correctly. However, later on, when another user tries to view that HTML, their browser might not be able to retrieve the included image since it may have been removed from the external server. Hence the HTML will seem broken.
  • Often Web pages are produced dynamically by software running on the Web server, and represent the state of a changing database underneath it.
    Dealing with these issues is the topic of much active research. Currently, DSpace bites off a small, tractable chunk of this problem. DSpace can store and provide on-line browsing capability for self-contained, non-dynamic HTML documents. In practical terms, this means:
  • No dynamic content (CGI scripts and so forth)
  • All links to preserved content must be relative links that do not refer to 'parents' above the 'root' of the HTML document/site:
    • diagram.gif is OK
    • image/foo.gif is OK
    • ../index.html is only OK in a file that is at least a directory deep in the HTML document/site hierarchy
    • /stylesheet.css is not OK (the link will break)
    • http://somedomain.com/content.html is not OK (the link will continue to link to the external site which may change or disappear)
  • Any 'absolute links' (e.g. http://somedomain.com/content.html) are stored 'as is', and will continue to link to the external content (as opposed to relative links, which will link to the copy of the content stored in DSpace). Thus, over time, the content referred to by the absolute link may change or disappear.
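The link rules above can be checked mechanically. The following is a minimal sketch of such a check, not something DSpace is stated to provide; the function name `link_is_preservable` and its parameters are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def link_is_preservable(link, containing_file):
    """Return True if link stays inside the HTML document/site.

    containing_file is the path of the HTML file holding the link,
    relative to the root of the document (e.g. "docs/page.html").
    """
    parts = urlsplit(link)
    if parts.scheme or parts.netloc:
        return False  # absolute link: keeps pointing at the external site
    if parts.path.startswith("/"):
        return False  # server-root-relative link: it will break
    base_dir = posixpath.dirname(containing_file)
    resolved = posixpath.normpath(posixpath.join(base_dir, parts.path))
    # A path that still starts with ".." climbs above the document root.
    return not (resolved == ".." or resolved.startswith("../"))
```

The second argument matters because the same link, such as ../index.html, is acceptable in a file one directory deep but not in a file at the root of the document.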

...

DSpace supports the OpenURL protocol from SFX, in a rather simple fashion. If your institution has an SFX server, DSpace will display an OpenURL link on every item page, automatically using the Dublin Core metadata. Additionally, DSpace can respond to incoming OpenURLs. Presently it simply passes the information in the OpenURL to the search subsystem. A list of results is then displayed, which usually gives the relevant item (if it is in DSpace) at the top of the list.

Creative Commons Support

DSpace provides support for Creative Commons licenses to be attached to items in the repository. They represent an alternative to traditional copyright. To learn more about Creative Commons, visit their website. Support for the licenses is controlled by a site-wide configuration option, and since license selection involves redirection to the Creative Commons website, additional parameters may be configured to work with a proxy server. If the option is enabled, users may select a Creative Commons license during the submission process, or elect to skip Creative Commons licensing. If a selection is made, a copy of the license text and RDF metadata is stored along with the item in the repository. There is also an indication (text and a Creative Commons icon) in the item display page of the web user interface when an item is licensed under Creative Commons.

...

Various statistical reports about the contents and use of your system can be automatically generated by the system. These are generated by analyzing DSpace's log files. Statistics can be broken down monthly.

The report includes the following sections:

  • A customizable general overview of activities in the archive, by default including:
    • Number of items archived
    • Number of bitstream views
    • Number of item page views
    • Number of collection page views
    • Number of community page views
    • Number of user logins
    • Number of searches performed
    • Number of license rejections
    • Number of OAI Requests
  • Customizable summary of archive contents
  • Broken-down list of item viewings
  • A full break-down of all performed actions
  • User logins
  • Most popular searches
  • Log Level Information
  • Processing information
    The results of statistical analysis can be presented on a by-month and an in-total report, and are available via the user interface. The reports can also either be made public or restricted to administrator access only.

...

*File Downloads information is only displayed for item-level statistics. Note that downloads from separate bitstreams are also recorded and represented separately. DSpace is able to capture and store File Download information, even when the bitstream was downloaded from a direct link on an external website.

...

This is a configurable framework that lets you define plug-in classes to control the choice of values for a given DSpace metadata field. It also lets you configure fields to include "authority" values along with the textual metadata value. The choice-control system includes a user interface in both the Configurable Submission UI and the Admin UI (edit Item pages) that assists the user in choosing metadata values.

...