Comment: JIRA Issue macro params updated with additional server info

...

  • Cons: 
    • several of the key features for 1.6 are in their first implementation state and not fully mature
      • e.g., multiple language support will only support entering content in multiple languages for rdfs:labels, not for data properties or custom forms
      • the web service for adding and updating RDF via SPARQL update will likely need to be extended to support queries (currently reads are accomplished through linked data requests, by changing permissions on VIVO's embedded SPARQL query page, or by a separate Fuseki SPARQL endpoint, ideally running on a replicated copy of the VIVO database)
        • Jira: VIVO-101 (DuraSpace JIRA)
    • other features we consider strong candidates for a 2.0 release are not in 1.6 at all
      • updating the Jena libraries, which will require removing dependence on the Jena RDB triple store technology, still currently used for user accounts and other internal application data
      • external concept linking to Library of Congress Subject Headings and the National Agriculture Library Thesaurus
      • being able to select people from another VIVO
      • addressing organization identifiers
      • other post-1.6 issues

  • Road map consideration – the 2.0 release would serve as an excellent driver for road map discussions defining goals, feature priorities, and resource requirements, with 2.0 a near enough milestone that the division of tasks between 2.0 and post-2.0 could be effectively addressed

...

(tick) 

Jira: VIVO-87 (DuraSpace JIRA)

As more data is added to VIVO, some profiles get very large, most commonly when a person accumulates hundreds of publications that must each be included in rendering the person's profile page. While we continue to look at ways to improve query speeds, if you need your VIVO to display all pages with more or less equivalent, sub-second rendering times, some form of page caching at the Apache level using a tool such as Squid is necessary. Apache is very fast at displaying static HTML pages, and Squid saves a copy of every page rendered in a cache from which the page can be rendered by Apache rather than generated once again by VIVO. The good news:

...

  • Brian Caruso has proposed adding a unit test for Solr that would build an index from a standard set of VIVO RDF, start Solr, and run standard searches. This would help prevent breaking existing functionality when addressing issues that have come up such as support for diacritics, stop words, and capital letters in the middle of names
    • (question) A unit test has been developed for another related project at Cornell and we hope to be able to port this to VIVO, but perhaps not for 1.6
    • Jira: VIVO-102 (DuraSpace JIRA)
  • (warning) (not for 1.6) Developing repeatable tests of loading one or more large datasets into VIVO. The challenge here is that performance is highly installation dependent.  The most urgent problem at Cornell has been the intermittent loss of communication between the VIVO web server and the VIVO database server, which results in some threads of activity simply hanging and never returning.  As with many errors that are hard to reproduce, we have developed workarounds that divide large jobs into chunks of data that experience has shown can be removed or added without causing hiccups.
    • Joe M. has submitted a paper to the Conference on a data ingest method using a standard set of data.  This could conceivably be extended to serve as a set of tests, but is presently more geared toward helping people new to VIVO understand data ingest than testing VIVO under load
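The chunking workaround described above can be sketched as follows. This is a minimal illustration only, not the scripts used at Cornell: the names chunked, load_in_chunks, and load_batch are hypothetical, and a workable chunk size would have to be found by experience at each installation.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, chunk.
        yield batch


def load_in_chunks(statements: Iterable[T], size: int,
                   load_batch: Callable[[List[T]], None]) -> int:
    """Pass each chunk to `load_batch` (a hypothetical loader callback).

    Returns the number of chunks handed to the loader.
    """
    count = 0
    for batch in chunked(statements, size):
        load_batch(batch)
        count += 1
    return count
```

A repeatable load test could reuse the same helper with a fixed dataset and chunk size, so that a hang can be traced to a specific chunk.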

...

(thumbs up) 

Jira: VIVO-112 (DuraSpace JIRA)

Anchor
SitePageMgmt
SitePageMgmt
Site and Page Management

  • Make the About page HTML content editable through admin interface – this relates to display model changes
    • (thumbs up) Jira: VIVO-105 (DuraSpace JIRA)
    • Note that the home page cannot be made editable through the admin interface as it is too complex
  • (tick) Offering improved options for content on the home page, including a set of SPARQL queries to highlight research areas, international focus, or most recent publications
    • Jira: VIVO-106 (DuraSpace JIRA)
  • (tick) Offering additional individual page template options
    • Jira: VIVO-107 (DuraSpace JIRA)
    • Document the screen capture configuration as part of this
      • could put the service-specific aspects in a sub-template that gets imported, and could by default not attempt to capture and cache images at all
      • there are free services out there, but they may not be there in 6 months
  • (tick) Offering the ability to embed SPARQL query results in individual pages on a per-class basis – for example, to show all research areas represented in an academic department
    • Jira: VIVO-108 (DuraSpace JIRA)
    • Stephen would like to have a way to list the classes that would be allowed to use that feature – to use it more broadly than for academic departments, for example
      • A concept page does not yet have the reverse view, which would let you see from a concept the departments involved and not just the people – Stephen will be building that
      • Jim – the data getter is specified in an RDF file that is loaded when the application is started up, saying that a class has a data getter in it
        • Stephen – is there a way to make that a list?
    • Jim – have wanted to make this part of the application configuration ontology
    • Stephen – is developing visualizations around shared research interests

...

  • add the switch to configuration

    • (thumbs up) Jira: VIVO-110 (DuraSpace JIRA)
  • offer it as a standard predicate to add values for, modifying the interface to allow adding and removing those statements and to display the statements

    • (thumbs up) Jira: VIVO-111 (DuraSpace JIRA)

More general support for sameAs

...

  • (thumbs up) Jira: VIVO-101 (DuraSpace JIRA)
  • Implementing a web service interface (with authentication) to the VIVO RDF API, to allow the Harvester and other tools to add/remove data from VIVO and trigger appropriate search indexing and recomputing of inferences.
    • This would also enable round-trip editing of VIVO content from Drupal or another tool external to VIVO via the SPARQL update capability of the RDF API
    • Authentication will be involved
      • Could manage in our own authentication and authorization system and tell Apache that the servlet requires an HTTPS connection
      • This approach would allow testing in a known environment without having to set up SSL certificates
    • It would help the user experience if it's possible to bundle together an atomic change set (at least all those changes for one graph), so additions and retractions would not show up piecemeal
      • Note that since inferences are done in a separate thread there may still be some lag
  • (warning) Put and delete of data via LOD requests – this has been suggested but we're not sure a specification even exists for an LOD "put" request – please add references here if you're aware of discussion or documentation.
    • when we were designing the RDF API, we recognized that if anybody is allowed to execute an arbitrary SPARQL update or delete, you can't just listen to the changes, so we limited what we supported through the RDF API to adds or deletes, using just a subset of the overall language
    • the idea of being able to pipe an arbitrary update or delete through our API would take some work but is theoretically possible
  • Stephen is willing to test for the Harvester
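As an illustration of the atomic change set idea above, the sketch below bundles the retractions and additions for one graph into a single SPARQL 1.1 Update string that uses only DELETE DATA and INSERT DATA, matching the adds-or-deletes subset described in the notes. The endpoint URL and the form parameter names (email, password, update) are assumptions made for this example and are not taken from the planned VIVO API. Whether the two operations are applied as one transaction depends on the server.

```python
from typing import Iterable
from urllib.parse import urlencode
from urllib.request import Request


def build_change_set(graph_uri: str,
                     additions: Iterable[str],
                     retractions: Iterable[str]) -> str:
    """Combine retractions and additions for one graph into one update string.

    Each statement is a complete triple in N-Triples syntax, ending in " .".
    """
    parts = []
    removed = list(retractions)
    added = list(additions)
    if removed:
        parts.append("DELETE DATA { GRAPH <%s> {\n%s\n} }"
                     % (graph_uri, "\n".join(removed)))
    if added:
        parts.append("INSERT DATA { GRAPH <%s> {\n%s\n} }"
                     % (graph_uri, "\n".join(added)))
    # SPARQL 1.1 Update separates operations in one request with ";".
    return " ;\n".join(parts)


def build_request(endpoint: str, update: str,
                  email: str, password: str) -> Request:
    """Prepare (but do not send) an authenticated POST request.

    Assumption: credentials travel as form fields over HTTPS; the field
    names used here are hypothetical.
    """
    body = urlencode({"email": email,
                      "password": password,
                      "update": update}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(endpoint, data=body, headers=headers, method="POST")
```

A client such as the Harvester would pass the prepared request to urllib.request.urlopen. Sending one request per graph keeps the additions and retractions for that graph from showing up piecemeal.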

...

(question) 

Jira: VIVO-23 (DuraSpace JIRA)

This would support reading and writing data like class groups

...

  • Provide a way to re-index by graph or for a list of URIs, to allow partial re-indexing following data ingest as opposed to requiring a complete re-index
    • Jira: VIVO-98 (DuraSpace JIRA)
  • The same desire applies for re-inferencing, which is typically more time consuming
    • Jira: VIVO-99 (DuraSpace JIRA)
    • Jira: VIVO-100 (DuraSpace JIRA)
    • However, re-inferencing is potentially more complicated because our simple reasoner depends on knowing the delta – what has been removed as well as what has been added – and this may be more complex than search re-indexing a specific set of URIs
  • Implementation of additional facets on a per-classgroup basis – appropriate facets beyond rdf:type, varying based on the nature of the properties typically present in search results of a given type such as people, organizations, publications, research resources, or events.
    • Huda Khan has been implementing the ability to configure additional search facets for the Datastar project; some improvements may make it into 1.6
  • An improved configuration tool for specifying parameters to VIVO's search indexing and query parsing
    • Question – are any of these run-time parameters or are they all parameters that must be baked in at build time, requiring re-generation of the index?
    • Relates to another suggestion for a concerted effort to explore what search improvements Apache Solr can support and recommendations on which to consider implementing in what order
    • Changes are not expected for 1.6 – more requirements are needed before this work can be prioritized or scoped.
  • Improved default boosting parameters for people, organizations, and other common priority items
    • Here the question immediately becomes "improved according to what criteria"
    • This is a prime area for a special interest group of librarians or other content experts willing to document current settings and recommend improvements, including documenting use cases and developing sample data that could be part of the Solr unit tests listed above under "Installation and Testing"
  • Improving the efficiency and hence speed of search indexing in general – we have no indications at the moment that search indexing is a bottleneck.  It can take several hours to completely reindex a major VIVO such as Florida or Cornell, but the ability to specify a single named graph or list of URIs to index would address most of the complaints around the time required for search indexing after adding new data via the Harvester, which does not trigger VIVO's search indexing or re-inferencing listeners
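To make the difference between partial re-indexing and re-inferencing concrete, the sketch below computes the delta between two snapshots of a graph and derives a list of URIs to re-index. It is a simplified illustration under stated assumptions: Triple, compute_delta, and uris_to_reindex are hypothetical names, triples are plain Python tuples rather than Jena statements, and VIVO's actual indexer may need to touch a wider set of related individuals.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    obj_is_uri: bool = False  # False means the object is a literal


def compute_delta(before: Iterable[Triple],
                  after: Iterable[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (added, removed): the delta a reasoner would need to see."""
    before_set, after_set = set(before), set(after)
    return after_set - before_set, before_set - after_set


def uris_to_reindex(added: Set[Triple], removed: Set[Triple]) -> List[str]:
    """List every URI that appears as subject or URI object in the delta."""
    touched: Set[str] = set()
    for triple in added | removed:
        touched.add(triple.subject)
        if triple.obj_is_uri:
            touched.add(triple.obj)
    return sorted(touched)
```

Search re-indexing only needs the resulting list of URIs, whereas the simple reasoner would need both the added and the removed sets, which is why re-inferencing is the harder case.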

...