Table of Contents


Overview

This page explores design issues for an efficient ORE ontology gem implementation based off the ActiveTriples framework.

Ontologies

The following is a list of all ontologies used by the Triple Examples.

Ontology NamePrefixURLDetailsComments
RDF
rdf
http://www.w3.org/1999/02/22-rdf-syntax-ns#
specification 
RDF Schemardfs
http://www.w3.org/2000/01/rdf-schema#
specification 
Dublin Coredc
http://purl.org/dc/elements/1.1/
specification 
OREore
http://www.openarchives.org/ore/terms/
specificationRepresents both ordered and unordered items using the Aggregation class.
IANAiana
http://www.iana.org/assignments/relation/
specificationORE uses this ontology for first, last, next, and prev predicates.
Friend of a Friendfoaf
http://xmlns.com/foaf/0.1
specificationUses ld4l-foaf_rdf gem.

 

ORE Triples Examples

@prefix ore:     <http://www.openarchives.org/ore/terms/> .
@prefix iana:    <http://www.iana.org/assignments/relation/> .
@prefix dc:      <http://purl.org/dc/elements/1.1/> .

<http://localhost:3000/individual/vc155> a ore:Aggregation ;
  ore:aggregates <http://da-rdf.library.cornell.edu/individual/b3652730> ;
  ore:aggregates <http://da-rdf.library.cornell.edu/individual/b3652234> ;
  ore:aggregates <http://da-rdf.library.cornell.edu/individual/b3652543> ;
  iana:first <http://localhost:3000/individual/vci162> ;
  iana:last <http://localhost:3000/individual/vci164> ;
  dc:title "My list" ;
  dc:description "This is my list of references." .
 
<http://localhost:3000/individual/vci162> a ore:Proxy ;
  ore:proxyFor <http://da-rdf.library.cornell.edu/individual/b3652730> ;
  ore:proxyIn <http://localhost:3000/individual/vc155> ;
  iana:next <http://localhost:3000/individual/vci163> .
 
<http://localhost:3000/individual/vci163> a ore:Proxy ;
  ore:proxyFor <http://da-rdf.library.cornell.edu/individual/b3652234> ;
  ore:proxyIn <http://localhost:3000/individual/vc155> ;
  iana:prev <http://localhost:3000/individual/vci162> ;
  iana:next <http://localhost:3000/individual/vci164> .

 <http://localhost:3000/individual/vci164> a ore:Proxy ;
  ore:proxyFor <http://da-rdf.library.cornell.edu/individual/b3652543> ;
  ore:proxyIn <http://localhost:3000/individual/vc155> ;
  iana:prev <http://localhost:3000/individual/vci163> . 

 


Data Structure Approaches

The ORE implementation in the triple store will maintain a doubly linked list using IANA ontologies first and last predicates on the list and next and prev on the proxy.  Once loaded into memory, the ORE triples will be converted into a List Header Info hash data structure with an array of items each holding a List Item Info hash data structure.  Testing was executed to analyze the efficiency of working with arrays of List Item Info hash data structures.

Limit testing for array in-memory data structure

Tests descriptions:

Environments for testing:

Target production system:

Test code:  https://gist.github.com/elrayle/f5f559f8c10243600dc6

 

Results:

Max Itemsarray_createarray_movelist_createlist_movelist_insertlist_appendlist_find_lastlist_find_randomComments
 LaptopDEV VMLaptopDEV VMLaptop
(16Gb)
DEV VM
(2Gb)
LaptopDEV VMLaptopDEV VMLaptopDEV VMLaptopDEV VMLaptopDEV VM 
1,0000000000000000000 
10,0000000000000000000 
100,0000000000000000000 
500,0000000120000000000 
1,000,0000000450000000000 
2,000,000000011130000000000 
4,000,000000035OOM0OOM0OOM0OOM1OOM0OOM 
8,000,0001100115 0 0 0 2 0  
10,000,0001100167 0 0 0 2 0  
20,000,0001200OOM OOM OOM OOM OOM OOM  
40,000,0004400             
80,000,0006800             
100,000,00081111             

* Time measured in seconds

* OOM - Out of Memory

 

Analysis:

Converting triples data into an array of items is constrained by memory and the execution time of retrieving data from the triple store and constructing the list.  A modest 2Gb of RAM will support holding all data up to approximately 100,000 items and maintain processing efficiency.  Implementations supporting a list of greater than 100,000 items or many concurrent lists of smaller item counts will likely require a more sophisticated approach to retrieving and storing items in-memory.

 

Alternate Approaches:

 


Use Cases:

Use Case: Fetch the first N items in the list and display to user; Fetch next N and display; Fetch prev N and display; Fetch page X and display

Scenario:

User is browsing the list starting from the beginning, displaying N items at a time with the ability to move forward and back through the display of N items.

Operations:
Analysis:

 

Use Case: Add additional information to ListItemInfo before display

Scenario:

Display information is not part of ORE triples.  For example, proxyFor is a URI to a book which has a DC.title, DC.description, etc.   Would like to be able to add this information to each ListItemInfo structure and pass that to the UI code for display.

Operations:
Analysis:

 

Use Case: Move position of an item in the list

Scenario:

User is browsing the list and drags an item to a new position in the list.

Operations:
Analysis:

 

Use Case: Add (append) item to end of list

Scenario:

User is browsing the catalog and decides to add one or more bibliographic references to the list.  Add items defaults to append to end of list.

Operations:
Analysis:

 

Use Case: Sort list items by value not stored in list.

Scenario:

User chooses to sort list items by a display value that is not stored as part of the triples associated with the ORE list triples.  For example, the list is aggregating bibliographic references.  The properties of the bibliographic references may be stored in a separate triplestore.  Example sort criteria are title, publication date,

Operations:
Analysis:

 


List Header Info Structure

The list header info will be a hash.  It holds information about the list as a whole and the currently loaded range of items as a sub-structure.  The values for the list information stored depends on the type of the loaded items sub-structure.  The loaded items sub-structure holds items as a group are in-memory using one of four possible grouping methods: doubly linked lists, doubly linked lists + order index, array, or hash with proxy URI as the key.  NOTE: Order index is global to the entire list.  The array and hash implementations may also have this global order index information stored with the item info.

 

  example value typesComments
keydescription of valuedoubly
linked
doubly
linked
+ order
arrayhash 
firstpointer to first item in listURI of first proxyURI of first proxyURI of first proxyURI of first proxy 
lastpointer to last item in listURI of last proxyURI of last proxyURI of last proxyURI of last proxy 
countcount of all items in listXlast.indexlast.indexlast.indexCount for full list is available for array and hash when order index is stored in triplestore.
first_loadedpointer to first item in loaded range of itemsitemitemarray[0]URI of
first loaded proxy
 
currentpointer to current itemitemitemint cposURI of
current proxy
  • points to first item on fetch
  • updated to next/prev item during traversal
  • points to last modified item when changes are made
last_loadedpointer to last item in loaded range of itemsitemitemarray[size of array]URI of
last loaded proxy
 
count_loadedcount of items in currently loaded rangeX=position of last
- position of first
size of arraysize of hash 
resume_next_tokenpointer to item after last_loadedURI of proxyURI of proxyURI of proxyURI of proxy 
resume_prev_tokenpointer to item just before first_loadedURI of proxyURI of proxyURI of proxyURI of proxy 
loaded_itemspointer to loaded itemsuse first_loadeduse first_loadedarrayhash 

 

List Item Info Structure

Each item's info is stored in a hash.  It holds information about the individual item.  The loaded items as a group are in-memory using one of four possible grouping methods: doubly linked lists, doubly linked lists + order index, array, or hash with proxy URI as the key.

  example value types 
keydescription of valuedoubly
linked
doubly
linked
+ order
arrayhashComments
uriURI of proxy for this itemURI of this proxyURI of this proxyURI of this proxyURI of this proxyThis is the hash key for loaded_items when the list is implemented as a hash.
prevpointer to previous item in listURI of prev proxyURI of prev proxy= p - 1URI of prev proxy 
nextpointer to next item in listURI of next proxyURI of next proxy= p + 1URI of next proxy 
prev_loadedpointer to previous item in loaded range of itemsprev itemprev item= pl - 1URI of prev proxy 
next_loadedpointer to next item in loaded range of itemsnext itemnext item= pl + 1URI of next proxy 
proxyForURI of list item being aggregatedURIURIURIURI 
proxyInURI of list aggregationURI of
aggregation
URI of
aggregation
URI of
aggregation
URI of
aggregation
 
proxyIn_loadedpointer to List Headerheaderheaderheaderheader 
position (p)position in full listXpppPosition in full list is available for array and hash when order index is stored in triplestore.
position_loaded (pl)position in loaded range of itemsXplplX