Date

Call-in Information

Time: 08:00 am, Eastern Time (New York, GMT-04:00)

To join the online meeting:

  • https://lyrasis.zoom.us/j/82670709536?pwd=MzF3NDladE1DKzEvUml4SGQ5eUFsQT09

    Meeting ID: 826 7070 9536
    Passcode: 008047
    One tap mobile
    +16699006833,,82670709536#,,,,*008047# US (San Jose)
    +19292056099,,82670709536#,,,,*008047# US (New York)

    Dial by your location
            +1 669 900 6833 US (San Jose)
            +1 929 205 6099 US (New York)
            +1 253 215 8782 US (Tacoma)
            +1 301 715 8592 US (Washington DC)
            +1 312 626 6799 US (Chicago)
            +1 346 248 7799 US (Houston)
            877 853 5257 US Toll-free
            888 475 4499 US Toll-free
    Meeting ID: 826 7070 9536
    Passcode: 008047
    Find your local number: https://lyrasis.zoom.us/u/kbEatBA0od

Slack

Attendees

  1. Dragan Ivanovic 
  2. Michel Héon
  3. Jose Ortiz
  4. Abhishek Raval

Notes

Michel Héon and Dragan Ivanovic discussed different aspects of crosswalks. Michel Héon is concerned that only read-mode integration with VIVO is feasible for the first phase of the project, meaning migration of data from DSpace to VIVO and periodic updates of records. There is a problem with Person entities: in DSpace, authors are stored as plain strings, so the same person may appear under different name variants (e.g. Michel Heon, Heon Michel, M. Heon, etc.). Harmonizing these variants requires a person identifier. One solution is to assign identifiers manually, but that requires a lot of effort for big DSpace databases. The simplest implementation of this manual work is adding a personIdentifier field in DSpace, and this will be used in the first phase. A slightly more complex solution is adding a vocabulary of name variants. This vocabulary might be maintained manually or by an AI tool developed to match name variants to one representative. The third solution is to accept inaccuracy and use a stringToIdentifier approach, in which case name variants are ignored completely.
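The three options above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the `NAME_VARIANTS` table and the `person-...` identifier format are hypothetical, and only the ideas (an explicit personIdentifier, a vocabulary of name variants, and a stringToIdentifier fallback) come from the discussion.

```python
import hashlib
import unicodedata
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(cleaned.lower().split())


def string_to_identifier(name: str) -> str:
    """Option 3: derive the identifier from the author string itself.

    Different variants of the same name produce different identifiers,
    which is why this option is inaccurate.
    """
    digest = hashlib.sha1(normalize(name).encode("utf-8")).hexdigest()
    return "person-" + digest[:12]


# Option 2: a hypothetical, manually maintained vocabulary of name variants.
# Keys are normalized variants, values are the representative identifier.
NAME_VARIANTS = {
    "michel heon": "person-mheon",
    "heon michel": "person-mheon",
    "m heon": "person-mheon",
}


def resolve_person(name: str, person_identifier: Optional[str] = None) -> str:
    """Pick an identifier: explicit field first, then vocabulary, then fallback."""
    if person_identifier:  # Option 1: personIdentifier entered in DSpace
        return person_identifier
    known = NAME_VARIANTS.get(normalize(name))
    if known is not None:
        return known
    return string_to_identifier(name)


if __name__ == "__main__":
    for author in ["Michel Héon", "Heon Michel", "M. Heon", "Jane Doe"]:
        print(author, "->", resolve_person(author))
```

In this sketch the explicit identifier always wins, so records curated manually in the first phase would not be affected if a vocabulary were added later.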

Dragan Ivanovic reminded the group that the main point of the first phase of the project is to attract the attention of the community and, hopefully, to interest them in funding the next phase. Crowdsourced funding might be organized similarly to the DSpace Development Fund (https://duraspace.org/introducing-the-dspace-development-fund-ddf/). For that reason, we need a really effective demo of what has been done. One scenario might be the integration with VIVO of a small, clean and accurate DSpace repository (10-20 items prepared for this purpose). Another scenario might be the integration of https://demo7.dspace.org/ with VIVO. Abhishek presented how those data might be migrated to a local instance of DSpace for experimenting. However, another solution might be attaching the harvester directly to this endpoint (without using a local DSpace instance) - https://api7.dspace.org/server/oai/request?verb=Identify.
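As a rough illustration of what attaching a harvester to that endpoint involves, the following Python sketch builds OAI-PMH request URLs and parses a ListRecords response in the oai_dc format. It is not Jose's harvester: the function names are hypothetical, the embedded sample response (identifier, title, creators, token) is invented for illustration, and it is an assumption that the demo server exposes the oai_dc metadata prefix. Only the base URL and the Identify verb come from the notes; ListRecords, metadataPrefix and resumptionToken are standard OAI-PMH.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://api7.dspace.org/server/oai/request"
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_oai_url(verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    return BASE_URL + "?" + urlencode({"verb": verb, **params})


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=NS)
    return records, (token or None)  # an empty token means the list is complete


# Invented sample response, used so the sketch runs without network access.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:123456789/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample item</dc:title>
          <dc:creator>Heon, Michel</dc:creator>
          <dc:creator>M. Heon</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>sample-token</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(build_oai_url("Identify"))
    print(build_oai_url("ListRecords", metadataPrefix="oai_dc"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvest would fetch each URL over HTTP and repeat the ListRecords request with the returned resumptionToken until no token is left. Note that the creators arrive as plain strings, which is the Person problem discussed above.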

Michel Héon believes he might have a demo with more data ready for the next meeting (Thursday 5th of May). For that purpose, Jose's harvester should be merged into the main branch. Dragan Ivanovic will send a message to Jose asking him to merge his contribution into the main branch by Monday 2nd of May.

Dragan Ivanovic asked Michel Héon when the right moment might be to present the project to the VIVO tech IG. Michel Héon thinks it might be Tuesday 10th of May. It should be an effective presentation, aimed at collecting feedback from the group to improve our approach and implementation.

See:

  1. Architectural proposition
  2. About DSpace rdfizer
  3. About an Item in DSpace
  4. DSpace-VIVO GitHub Repo
  5. Crosswalk - https://docs.google.com/document/d/1waKCEaeEVRO6S1XUX8QmcACH-hZlZFLMZf0masDOx4o/edit?usp=sharing

Task List

  • Dragan Ivanovic will send a message to Jose asking him to merge his contribution into the main branch by Monday 2nd of May.
  • Jose will complete the harvester implementation and merge his PR to the main branch
  • Jose will check whether his harvester is able to harvest DSpace items from the open demo server (https://api7.dspace.org/server/oai/request?verb=Identify).
  • Michel Héon will work further on transforming and loading the data to the VIVO instance. 

Previous tasks 

  • Dragan Ivanovic to review PR
  • Jose to improve dspace docker related scripts
  • Michel Héon will try to work with data coming from a DSpace instance, meaning using not only transformation and loading but also the extraction (harvesting) implemented by Jose.
  • Jose to work on harvesting other elements of model (Community, Collection, etc.)
  • Jose to write a script for running DSpace Docker
  • Michel Héon to work on ingesting DSpace items into the VIVO graph; at the beginning this will be tested through SPARQL queries, and later added to the VIVO UI.
  • Michel Héon Implementing DSpace-item in VIVO UI
  • Michel Héon Design architecture for mapping a DSpace-person to VIVO-person
  • Jose to define an example of Communities hierarchy and to share that with Michel
  • Michel Héon Adapt the DSpace-Item representation in the dspace-vivo exchange data schema (DVExDS) according to José's requests
  • Jose to continue working on getting allDSpaceItems request for DSpace 7, 6 and 5.
  • Abhishek to check whether all options for harvesting are possible for implementation from the perspective of model and APIs of DSpace
  • Dragan Ivanovic to define options for harvesting data from DSpace in a wiki page 
  • All to think about the name for middleware internal data structure defined for the needs of massive migration of data from DSpace to VIVO
  • Michel Héon I propose the name 'dspace-vivo exchange data schema'.
  • Jose to continue working on his implementation in accordance with defined options for harvesting data
  • Jose to rename packages in his implementation,
  • Michel Héon to do the same for his contributions
  • The work is completed and the dev-heon branch has been merged by hand into the github repo
  • Michel Héon to think about stronger integration of migrated data in the VIVO graph, and how those data might be linked for synchronization in the case of modification
  • Dragan Ivanovic to consult with Abhishek about moving the meeting by one hour, and if Abhishek agrees, Dragan Ivanovic will send a new ics file.
  • Michel Héon to prepare and share slides for discussion about architecture
  • Michel Héon and Jose to set up environment, i.e. to install DSpace 7, DSpace 6 and VIVO 1.12.x, Abhishek to help them with DSpace if necessary


