2017-10-13 DSpace Entities WG Meeting Notes

Created by Tim Donohue, last modified by Paulo Graça on Oct 16, 2017

Attendees

Agenda / Notes

We discovered immediately that Google Hangouts has a 10-person limit, so some people may have been locked out of the discussion. Apologies to anyone who got kicked off; we'll meet on Zoom for the next meeting (see below).
About the Working Group / How it was formed (Google Slides)
- About RCAAP : Presentation by Paulo L
- Motivation : Presentation by Paulo G
- Work done (by RCAAP): Presentation by Paulo G
  - Scenario #1: DSpace-CRIS pilot (decided they were not ready to move to it yet)
  - Scenario #2: Built out DSpace 5/6 with custom features (author profiles from Atmire adapted to JSPUI)
  - Scenario #3: Decided they'd rather participate in DSpace development. This is how discussions started and how this WG was established
- Entities Working Group purpose and objectives: Presentation by Jose Carvalho
- Identified a few models to work from
  - Top / Down (starting at DSpace-CRIS)
  - Bottom / Up (starting at community needs)
Discussion
- Tim: Could we mesh the two models (top/down and bottom/up) together? Start at community needs, prioritize those needs, then analyze how DSpace-CRIS can meet them?
  - Andrea B agrees. Same with Mark W
  - Andrea reminds the group of DSpace 2.0. It tried to find the "best solution", but ended up too distant from the current DSpace, which made migrating to it impossible. We don't want to repeat that.
  - DSpace-CRIS built as an addition (different dev pattern) for this reason. Keep it close to DSpace roadmap.
- Andrea: DSpace-CRIS is also a larger community. 70 installations in Italy. 30-40 around the world, with increasing interest. Discovered interest in Germany too (at DSpace German User Group Mtg)
  - Don't want to lose these users to publishers or proprietary CRIS systems
  - Ideally, any new CRIS-like system enhancements should be an evolution of DSpace-CRIS
- Should we start at use cases (and prioritize)? Then analyze how DSpace-CRIS aligns (or doesn't)
  - Some agreement here
- Can you split DSpace-CRIS into software parts? Or is it an entire bundle?
  - Can enable/disable functionality easily
  - But, it is shipped more like a bundle right now.
- Joao: Perhaps it'd be good to start with one major use case and do a deep dive into DSpace-CRIS to see how it handles it?
  - Lieven agrees. Use cases take a while to fully build out. Start in one area first. Get a good understanding of DSpace-CRIS
  - Tim agrees. Would Author or Author Profile pages be a good place to start? They are also a frequently requested feature outside of CRIS
- Others agree that Author / Author Profile pages (aka Researcher Pages) would be a good starting point
- Stephen: One of the intro presentations implied that DSpace-CRIS is "closed"? Is it really? Impression is that it's open, but just a slightly separate community
  - No it isn't closed. But, it is primarily developed and supported by one institution (4Science). So, it's not as widely known/understood
  - But it has the same open source software license as rest of DSpace. Uses the same mailing lists. Presents at the same conferences (Open Repositories, etc)
- Also, is DSpace-CRIS following DSpace roadmap?
  - Yes, 4Science is highly involved in DSpace 7 efforts and DSpace-CRIS will be moving on the same roadmap as DSpace 7
- DSpace-CRIS solves multiple needs. It's an advanced repo that better integrates with other (external) CRIS systems. It's also possible to use as a standalone CRIS system + repo combo.
  - DSpace-CRIS allows you to enable/disable features to either make it a full CRIS, or just an advanced repo that integrates well with an external CRIS system. Different users configure it differently
- Next meeting plans?
  - Meet in two weeks at the same time: Friday, Oct 27 at 15:00 UTC (11am-12pm EDT, 8am-9am PDT)
  - Main agenda item will be a deep dive into how DSpace-CRIS manages Authors and Author Profiles (Researcher Pages).
  - Next time we'll meet via Zoom. This will avoid the attendee limits. (Joao has a Zoom account we can use)

7 Comments

Mark H. Wood
Dashing down my notes. I tend to go deep to try to understand the underlying structure and then come back to reality. I sketched an abstract "entity":
Entity
- internal ID
- *external ID
- *credential
- name
- *contact (physical address, delivery address, email, phone, etc.)
- *profile (person, institution, project, contract....) (presentation hint?) (names required attributes?)
Starred attributes are multi-valued. I noted in passing a vague resemblance to X.500 directory objects. Clearly there are many more attributes to be defined.
Question: can we identify groups of entity types which are so different that they should not be lumped together?
Some Entities have credentials: they can log in. Some Entities have external identifiers such as DOIs or ORCiDs or DUNS numbers or what-have-you, and can be associated with real-world entities. We should be able to represent things like authors who are not local users (no credentials) but who may be uniquely identifiable (ORCiD present).
This may be a little too abstract.
Also a sketch of the problem: we have metadata fields which hold (like all the rest) opaque strings, but we need for these values to become symbols which carry meaning that DSpace can interpret, which can link to other symbols to represent knowledge, and which can be combined to discover new relationships.
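As a rough illustration of the abstract entity sketched in this comment, the following is a minimal Python sketch, not DSpace code. The class name, field names, helper methods, and sample values (including the placeholder ORCID iD) are all assumptions made for illustration. The starred (multi-valued) attributes are modeled as lists, and "can log in" is derived from whether any credential is present.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Hypothetical model of the abstract entity; not part of DSpace."""
    internal_id: str
    name: str
    # Starred attributes in the sketch are multi-valued, so they are lists here.
    external_ids: List[str] = field(default_factory=list)  # e.g. ORCID iDs, DOIs
    credentials: List[str] = field(default_factory=list)
    contacts: List[str] = field(default_factory=list)
    profiles: List[str] = field(default_factory=list)      # e.g. "person", "project"

    def can_log_in(self) -> bool:
        # An entity with at least one credential can log in.
        return bool(self.credentials)

    def is_externally_identifiable(self) -> bool:
        # An entity with at least one external ID can be tied to a real-world entity.
        return bool(self.external_ids)


# An author who is not a local user: no credentials, but an external ID is present.
# The ORCID iD below is a made-up placeholder value.
remote_author = Entity(
    internal_id="e-1",
    name="Example Author",
    external_ids=["orcid:0000-0000-0000-0000"],
    profiles=["person"],
)

# A local user with credentials but no external identifier.
local_user = Entity(
    internal_id="e-2",
    name="Example Depositor",
    credentials=["placeholder-credential"],
)
```

Under these assumptions, `remote_author.can_log_in()` is false while `remote_author.is_externally_identifiable()` is true, which matches the case of an author who is uniquely identifiable but is not a local user.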
- Oct 13, 2017
1. Susanna Mornati (4Science)
  Entities are not just people, they may be anything in the research domain, possessing attributes and links to other entities. The model can be quite complex, see for instance: http://www.eurocris.org/cerif/main-features-cerif.
  So I think the point here is to understand the data model underlying the research domain. Databases and software platforms should just be powerful and flexible enough to represent it.
  Then, for each entity, attributes can be defined, but the data model should remain flexible.
  Some other features should remain an institutional choice. For example, DSpace-CRIS already allows you to log in with your ORCID credentials, but a university might not want anyone who is not currently affiliated to log in to their IR, so this feature has to be configurable (on/off) rather than mandatory.
  Please let me know if this post needs clarification.
  Oct 17, 2017
2. Paulo Graça
  I'm inclined to agree with Susanna Mornati. An EPerson isn't necessarily an Author. In RCAAP's repositories, more than 70% of deposits are mediated: the work is deposited by someone other than any of its authors.
  I think there should be a way for an EPerson to also be an Author. As I see it, at a specific moment (after registration, after approval, when the user tries to start a deposit, etc.) some data in the EPerson should be used to create an Author (if one does not already exist).
  Oct 17, 2017
  1. Mark H. Wood
    Exactly. An ePerson with Author attributes is an Author; one with credentials is a User (can log in); one with both is an Author who can log in. An entity with Publisher attributes is a Publisher, but it may have other roles as well, as shown by other attributes.
    The question in my mind is not "is this flexible enough?" but "is it too flexible, too abstract?"
    
    Oct 17, 2017
    1. Paulo Graça
      
      I think the solution depends on how the relationships between entities will work.
      
      Oct 17, 2017
    2. Susanna Mornati (4Science)
      
      I think an Entity should be defined by attributes belonging to a specific domain. In the Research Domain, as modeled e.g. by CERIF (http://www.eurocris.org/cerif/main-features-cerif), a Person is typically a Researcher, who is a (co)author of a Publication, a (co)investigator of a project, a (co)assignee of an award, etc etc. So, besides appropriate attributes, their role is defined, as Paulo writes, by their relationship to other entities.
      
      Oct 19, 2017
Paulo Graça
The presentation slides: https://docs.google.com/presentation/d/1pAYI4J00iDr4qOIutEQvtaNqfmeYtfR4Qet-0K-CpGg
- Oct 16, 2017

All content on the LYRASIS Wiki is licensed under the CC BY (Attribution) license, unless otherwise noted.