
Frequently, we talk about "the data model" in VIVO. This is an over-simplification that is sometimes useful and sometimes misleading. In fact, VIVO contains a matrix of data models and sub-models, graphs, datasets and other constructs.

It might be more accurate to talk about the union of these data models as "the knowledge base". However, the terminology of "the data model" is firmly entrenched.

In VIVO release 1.6, we are attempting to simplify this complex collection of models, and to produce a unified access layer. This is a work in progress. Regardless of how clean the design might eventually become, this will remain an area with complex requirements which cannot be satisfied by simplistic solutions.

Divisions in the knowledge base

Depending on what you want to do with the data, it can be useful to sub-divide it by one or more of the following criteria:

Types of statements

An RDF model is often divided into ABox (assertions) and TBox (terminology). In RDF, there is no technical distinction between TBox and ABox data. They are stored separately because they are used for different purposes. The combination of the two is informally called the Full model.

TBox
Data type: "Terminological data". Defines classes, properties, and relationships in your ontology.
Example data:

    foaf:Person a owl:Class ;
        rdfs:subClassOf owl:Thing ;
        rdfs:label "Person"@en .
    ex:preferredName a owl:DatatypeProperty ;
        rdfs:subPropertyOf skos:prefLabel, foaf:name, rdfs:label ;
        rdfs:domain foaf:Person ;
        rdfs:label "preferred name"@en .

ABox
Data type: "Assertion data". Enumerates the individual instances of your classes and describes them.
Example data:

    local:tobyink a foaf:Person ;
        ex:preferredName "Toby Inkster" .

Full
Data type: The TBox and the ABox together, treated as a single model. For example, when you use the RDF tools to remove statements, you want them removed regardless of whether they are found in the TBox or the ABox.
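The removal case can be sketched in a few lines of plain Java. This is only an illustration under simplifying assumptions: statements are plain strings, and the class `FullModelSketch` and its methods are hypothetical names, not part of VIVO, which works with Jena models.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for a "Full" view over a TBox and an ABox. */
class FullModelSketch {
    final Set<String> tbox = new LinkedHashSet<>();
    final Set<String> abox = new LinkedHashSet<>();

    /** The Full view: every statement from either part. */
    Set<String> full() {
        Set<String> all = new LinkedHashSet<>(tbox);
        all.addAll(abox);
        return all;
    }

    /** Remove a statement wherever it is stored; report whether anything changed. */
    boolean remove(String statement) {
        boolean fromTbox = tbox.remove(statement);
        boolean fromAbox = abox.remove(statement);
        return fromTbox || fromAbox;
    }

    public static void main(String[] args) {
        FullModelSketch model = new FullModelSketch();
        model.tbox.add("foaf:Person rdf:type owl:Class");
        model.abox.add("local:tobyink rdf:type foaf:Person");
        // The caller does not need to know which part holds the statement.
        model.remove("foaf:Person rdf:type owl:Class");
        System.out.println(model.full());
    }
}
```

The point of the sketch is that the caller of `remove` never has to decide between TBox and ABox.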


Source of statements

An RDF model can also be divided into Assertions and Inferences. The combination of the two is informally called the Union.

Assertions
Meaning: Statements that you explicitly add to the model, either through setup, ingest, or editing.
Example data:

    local:tobyink rdf:type core:FacultyMember .

Inferences
Meaning: Statements that the semantic reasoner adds to the model, by reasoning about the assertions, or about other inferences.
Example data:

    local:tobyink rdf:type foaf:Person .
    local:tobyink rdf:type foaf:Agent .
    local:tobyink rdf:type owl:Thing .

Union
Meaning: The combination of Assertions and Inferences. For most purposes, this is the desired model. You want to know what statements are available, without regard to whether they were asserted or inferred.
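To make the relationship between the three concrete, here is a minimal sketch of how inferred types could follow from one asserted type. It assumes a simplified, single-parent class chain that is hard-coded here for illustration; in VIVO the hierarchy comes from the TBox and the work is done by the semantic reasoner. The class `InferenceSketch` is hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class InferenceSketch {
    /** Assumed, simplified superclass chain; the real hierarchy lives in the TBox. */
    static final Map<String, String> SUPERCLASS = Map.of(
            "core:FacultyMember", "foaf:Person",
            "foaf:Person", "foaf:Agent",
            "foaf:Agent", "owl:Thing");

    /** Types that follow from an asserted type by walking up the chain. */
    static Set<String> inferredTypes(String assertedType) {
        Set<String> inferred = new LinkedHashSet<>();
        String current = SUPERCLASS.get(assertedType);
        // Stop at the top of the chain, or if a cycle would repeat a class.
        while (current != null && inferred.add(current)) {
            current = SUPERCLASS.get(current);
        }
        return inferred;
    }

    /** The Union: the asserted type plus everything inferred from it. */
    static Set<String> unionTypes(String assertedType) {
        Set<String> union = new LinkedHashSet<>();
        union.add(assertedType);
        union.addAll(inferredTypes(assertedType));
        return union;
    }

    public static void main(String[] args) {
        System.out.println("inferred: " + inferredTypes("core:FacultyMember"));
        System.out.println("union:    " + unionTypes("core:FacultyMember"));
    }
}
```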


"Content" vs. "Configuration"

We sometimes distinguish between the data that VIVO is serving (Content) and the data that VIVO itself uses (Configuration). The Content is available for display, for searching, and for serving as Linked Open Data. The Configuration controls how the content is displayed, who can access the data, and what VIVO itself looks like.

| Model type | Purpose | Examples |
|---|---|---|
| Configuration | Data about the VIVO application itself. | Application parameters, user accounts, display options |
| Content | The payload: the data that VIVO is intended to distribute. | People data, publications data, grant data |


Model scope

The knowledge base exists for as long as VIVO is running. However, subsets or facets of the knowledge base are often used to satisfy a particular HTTP request, or for the length of a VIVO session for a particular user. These subsets are created dynamically from the full knowledge base, used for as long as they are useful, and then discarded.

| Scope | Purpose | Example | Discarded when... |
|---|---|---|---|
| Servlet Context | Created for the life of VIVO. | | Never discarded. |
| Session | Created for a particular logged-in user. | Data that is filtered by what the user is permitted to view. | When the user logs out, or the session times out. |
| Request | Created for a single HTTP request. | Data that is organized by the languages that are preferred by the browser. | When the individual request has been satisfied. |

At present, the Session lifespan is almost never used. However, potential use cases do exist for it.

The Request lifespan is used extensively, since it provides a convenient way to manage database connections and minimize contention for resources.

Purpose vs. scope

It is tempting to think of the models of the Servlet Context as equivalent to the unfiltered models of the Request. They may even represent the very same data. However, they have different scope, which makes them very different in practice.

The unfiltered models in the Request go out of scope when the Request has been satisfied. The resources required by these models have short lifetimes and are very easily managed. The models of the Servlet Context never go out of scope until VIVO is shut down. It is difficult to reclaim resources such as database connections or processor memory from these models.
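The difference in resource handling can be illustrated with a small sketch. It assumes a hypothetical `ScopedModel` wrapper that holds one connection until it is closed, and a simple counter standing in for a connection pool; none of these names come from VIVO.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ScopeSketch {
    /** Counts connections currently held; stands in for a real connection pool. */
    static final AtomicInteger openConnections = new AtomicInteger();

    /** Hypothetical model wrapper that holds one connection until it is closed. */
    static final class ScopedModel implements AutoCloseable {
        ScopedModel() {
            openConnections.incrementAndGet();
        }

        @Override
        public void close() {
            openConnections.decrementAndGet();
        }
    }

    /** Request scope: the connection is released as soon as the request is satisfied. */
    static int handleRequest() {
        try (ScopedModel requestModel = new ScopedModel()) {
            // Number of connections held while this request is being served.
            return openConnections.get();
        }
    }

    public static void main(String[] args) {
        // Context scope: held for the life of the application.
        ScopedModel contextModel = new ScopedModel();
        System.out.println("during request: " + handleRequest());
        System.out.println("after request:  " + openConnections.get());
        contextModel.close(); // in practice this only happens at shutdown
    }
}
```

The request-scoped model gives its connection back automatically, while the context-scoped model keeps holding one until the application stops.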

The Data Models

This is a summary of the data models:

The basic content: Base ABox, Base TBox, Inferred ABox, Inferred TBox. Named graphs from the RDF Service (optionally with sub-graphs).

Views of the content: Base Full, Inferred Full, Union ABox, Union TBox, Union Full. Views of the 4 basic content graphs in different combinations.

The configuration: Application Metadata, User Accounts, Display Model, Display TBox, DisplayDisplay. Named graphs from the application datasource.

Increasing complexity

The structure of the data models has grown as VIVO has developed. New models, new structures, and new means of accessing the data have been added as required by the growing code. The resulting data layer has grown more complex and more error-prone.

In release 1.5, VIVO added the RDFService interface, which has increased the flexibility of data sources, and promises to allow a more unified view of the knowledge base. However, the transition to RDFService is not complete, and so this adds another layer of complexity to the data issues.

Beyond the models

The RDF Service

The DAO Layer

OntModel Selectors

Model makers and Model sources

The ModelAccess class



Show how it represents all of these distinctions. Describe the scope searching and masking, with respect to set and get.

Initializing the Models

When VIVO starts up, OntModel objects are created to represent the various data models. The configuration models are created from the datasource connection, usually to a MySQL database. The content models are created using the new RDFService layer. By default this also uses the datasource connection, but it can be configured to use any SPARQL endpoint for its data.

Some of the smaller models are "memory-mapped" for faster access. This means that they are loaded entirely into memory at startup. Any changes made to the memory image will be replicated in the original model.
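A minimal sketch of this write-through behavior, assuming statements are plain strings and the original model is just another set. The class `MemoryMappedSketch` is a hypothetical illustration and not the mechanism VIVO actually uses.

```java
import java.util.HashSet;
import java.util.Set;

class MemoryMappedSketch {
    private final Set<String> original; // stands in for the persistent model
    private final Set<String> memory;   // in-memory image, loaded once at startup

    MemoryMappedSketch(Set<String> original) {
        this.original = original;
        this.memory = new HashSet<>(original);
    }

    /** Reads are answered from memory only. */
    boolean contains(String statement) {
        return memory.contains(statement);
    }

    /** Changes to the memory image are replicated in the original model. */
    void add(String statement) {
        if (memory.add(statement)) {
            original.add(statement);
        }
    }

    void remove(String statement) {
        if (memory.remove(statement)) {
            original.remove(statement);
        }
    }

    public static void main(String[] args) {
        Set<String> persistent = new HashSet<>(Set.of("display:Home rdf:type display:Page"));
        MemoryMappedSketch model = new MemoryMappedSketch(persistent);
        model.add("display:About rdf:type display:Page");
        System.out.println(persistent.size()); // the original model saw the change too
    }
}
```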

The data in each model persists in the application datasource (usually a MySQL database), or in the RDFService. Also, data from disk files may be loaded into the models. This may occur the first time VIVO starts up, whenever the model is found to be empty, or every time VIVO starts up, depending on the particular model.

The "first time"

For purposes of initialization, VIVO is considered to be starting for the first time if the Application metadata model contains no statements, or if the RDFService detects that its SDB-based datastore has not been initialized.
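Expressed as code, the rule is a simple disjunction. The method and parameter names below are hypothetical; how VIVO actually obtains the two inputs is not shown here.

```java
class FirstStartupSketch {
    /**
     * @param applicationMetadataStatements number of statements in the Application metadata model
     * @param contentStoreInitialized       whether the RDFService reports an initialized datastore
     */
    static boolean isFirstStartup(long applicationMetadataStatements, boolean contentStoreInitialized) {
        return applicationMetadataStatements == 0 || !contentStoreInitialized;
    }

    public static void main(String[] args) {
        System.out.println(isFirstStartup(0, true));
        System.out.println(isFirstStartup(1200, true));
    }
}
```

Either condition alone is enough to trigger first-time initialization.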

Initializing Configuration models

Application metadata

Function: Describes the configuration of VIVO at this site. Many of the configuration options are obsolete.


Source: the application Datasource (MySQL database) (memory-mapped)

If this is the first startup, read the files in /WEB-INF/init-data (without subdirectories)

Also if this is the first startup, read the files in /WEB-INF/ontologies/user/applicationMetadata

User Accounts

Contains login credentials and assigned roles for VIVO users.


Source: the application Datasource (MySQL database) (memory-mapped)

If this model is empty, read the files in /WEB-INF/ontologies/auth (without subdirectories). Ordinarily there are no such files. This feature is useful when running acceptance tests.

The Display model

This is the ABox for the display model, and contains the RDF statements that define managed pages, custom short views, and other items.


Source: the application Datasource (MySQL database) (memory-mapped)

If this model is empty, read the files in /WEB-INF/ontologies/app/ (without subdirectories)

Every time, read the files in /WEB-INF/ontologies/app/loadedAtStartup

Display TBox

The TBox for the display model.


Source: the application Datasource (MySQL database) (memory-mapped)

Every time, read /WEB-INF/ontologies/app/menuload/displayTBOX.n3 (note that existing statements are not cleared, except through the GUI)

DisplayDisplay


Source: the application Datasource (MySQL database) (memory-mapped)

Every time, read /WEB-INF/ontologies/app/menuload/displayDisplay.n3 (note that existing statements are not cleared, except through the GUI)
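The loading rules for the configuration models fall into three cases: load on first startup, load if the model is empty, and load every time. A hedged sketch of that decision follows; the enum `When` and the method `shouldLoad` are hypothetical names chosen for illustration.

```java
class LoadPolicySketch {
    /** When a directory of RDF files is read into its model. */
    enum When { FIRST_STARTUP, MODEL_EMPTY, EVERY_STARTUP }

    static boolean shouldLoad(When when, boolean firstStartup, boolean modelEmpty) {
        switch (when) {
            case FIRST_STARTUP:
                return firstStartup;
            case MODEL_EMPTY:
                return modelEmpty;
            case EVERY_STARTUP:
                return true;
            default:
                throw new IllegalArgumentException("Unknown policy: " + when);
        }
    }

    public static void main(String[] args) {
        // A later startup, with a Display model that already contains statements:
        boolean firstStartup = false;
        boolean modelEmpty = false;
        // /WEB-INF/ontologies/app/ is read only if the model is empty.
        System.out.println(shouldLoad(When.MODEL_EMPTY, firstStartup, modelEmpty));
        // /WEB-INF/ontologies/app/loadedAtStartup is read every time.
        System.out.println(shouldLoad(When.EVERY_STARTUP, firstStartup, modelEmpty));
    }
}
```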

Initializing Content models

base ABox


Source: named graph from the RDFService

If this is the first startup, read the files in /WEB-INF/ontologies/user/abox (without subdirectories)

Every time, read the files in /WEB-INF/filegraph/abox, and create named models in the RDFService. Add them as sub-models to the base ABox. If these files are changed or deleted, update the RDFService accordingly.

base TBox


Source: named graph from the RDFService (memory-mapped)

If this is the first startup, read the files in /WEB-INF/ontologies/user/tbox (without subdirectories)

Every time, read the files in /WEB-INF/filegraph/tbox, and create named models in the RDFService. Add them as sub-models to the base TBox. If these files are changed or deleted, update the RDFService accordingly.
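The filegraph handling for the base ABox and base TBox amounts to keeping a set of named graphs in step with a directory of files. The following sketch shows one way that synchronization could work, assuming each file and each named graph is represented by a name and its content as a string. `FilegraphSyncSketch` and `sync` are hypothetical; the actual graph naming and change detection in VIVO are not described here.

```java
import java.util.HashMap;
import java.util.Map;

class FilegraphSyncSketch {
    /**
     * Bring the named graphs in line with the files on disk.
     *
     * @param filesOnDisk     file name to file content
     * @param graphsInService graph name to graph content; modified in place
     */
    static void sync(Map<String, String> filesOnDisk, Map<String, String> graphsInService) {
        // Drop graphs whose file has been deleted.
        graphsInService.keySet().retainAll(filesOnDisk.keySet());
        // Add graphs for new files and replace graphs whose file has changed.
        filesOnDisk.forEach((name, content) -> {
            if (!content.equals(graphsInService.get(name))) {
                graphsInService.put(name, content);
            }
        });
    }

    public static void main(String[] args) {
        Map<String, String> service = new HashMap<>();
        service.put("old.n3", "stale statements");
        service.put("kept.n3", "version 1");
        sync(Map.of("kept.n3", "version 2", "new.n3", "fresh statements"), service);
        System.out.println(service.keySet());
    }
}
```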

base Full

Source: a combination of base ABox and base TBox

inference ABox


Source: named graph from the RDFService

inference TBox


Source: named graph from the RDFService (memory-mapped)

inference Full

Source: a combination of inference ABox and inference TBox

union ABox

Source: a combination of base ABox and inference ABox

union TBox

Source: a combination of base TBox and inference TBox

union Full

Source: a combination of union ABox and union TBox
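The five combined views can be derived from the four basic graphs as sketched below. As a simplification, this sketch treats statements as strings and computes each view as a fresh copy, whereas a real view would normally reflect later changes to its parts. The class `ContentViewsSketch` is a hypothetical illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ContentViewsSketch {
    // The four basic content graphs.
    final Set<String> baseABox = new LinkedHashSet<>();
    final Set<String> baseTBox = new LinkedHashSet<>();
    final Set<String> inferenceABox = new LinkedHashSet<>();
    final Set<String> inferenceTBox = new LinkedHashSet<>();

    static Set<String> combine(Set<String> first, Set<String> second) {
        Set<String> result = new LinkedHashSet<>(first);
        result.addAll(second);
        return result;
    }

    // The five views, each a combination of two other models.
    Set<String> baseFull()      { return combine(baseABox, baseTBox); }
    Set<String> inferenceFull() { return combine(inferenceABox, inferenceTBox); }
    Set<String> unionABox()     { return combine(baseABox, inferenceABox); }
    Set<String> unionTBox()     { return combine(baseTBox, inferenceTBox); }
    Set<String> unionFull()     { return combine(unionABox(), unionTBox()); }

    public static void main(String[] args) {
        ContentViewsSketch content = new ContentViewsSketch();
        content.baseABox.add("local:tobyink rdf:type core:FacultyMember");
        content.inferenceABox.add("local:tobyink rdf:type foaf:Person");
        content.baseTBox.add("foaf:Person rdf:type owl:Class");
        System.out.println(content.unionABox());
        System.out.println(content.unionFull().size());
    }
}
```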


Transition from previous methods


What are we transitioning from? Check out VIVO-82.



prior to ModelAccess

using ModelAccess

User Accounts Model

ctx.setAttribute("userAccountsOntModel", model)

ctx.setAttribute("displayOntModel", model)

ModelContext.setDisplayModel(model, ctx)

req.setAttribute("displayOntModel", model)

ctx.setAttribute("jenaOntModel", model)

ModelAccess.on(ctx).setOntModel(ModelID.UNION_FULL, model)

req.setAttribute("jenaOntModel", model)

ModelAccess.on(req).setOntModel(ModelID.UNION_FULL, model)

Base Full Model

ModelContext.setBaseOntModel(model, ctx)

Inference Full Model

prior to ModelAccess

using ModelAccess

ModelContext.setOntModelSelector(model, ctx)

no mutator methods

Future directions?

What are we transitioning toward? From VIVO-82