
Date & Time

  • July 11th 15:00 UTC/GMT - 11:00 ET

This call is a Community Forum call: Sharing best practices and challenges in the use of existing DSpace features

Dial-in

We will use the international conference call dial-in. Please follow directions below.

  • U.S.A/Canada toll free: 866-740-1260, participant code: 2257295
  • International toll free: http://www.readytalk.com/intl 
    • Use the above link, enter 2257295 and the country you are calling from, and you will get your country's toll-free dial-in number
    • Once on the call, enter participant code 2257295

Agenda


Community Forum Call: Workflow Strategies for Populating Your IR

Sharing best practices, challenges, and questions. The call will be dedicated to answering participants' questions and discussing workflows.

Call for DCAT topics

We invite ideas for topics for August through December

Upcoming DSpace events:

Aug 22-23 North American DSpace User Group Meeting https://www.library.georgetown.edu/node/19724


Preparing for the call

Bring the questions and comments you would like to discuss to the call, or add them to the comments of this meeting page.

If you can join the call, or are willing to comment on the topics submitted via the meeting page, please add your name, institution, and repository URL to the Call Attendees section below.

Meeting notes

Maureen convened the meeting and announced that it was a community forum, a time to share practices and ask questions.  Felicity volunteered to take meeting notes.

Community Forum Call: Workflow Strategies for Populating Your IR

Sharing best practices, challenges, and questions. The call will be dedicated to answering participants' questions and discussing workflows.

Discussion:

Marianne - University of Kansas - KU ScholarWorks

  • Open access policy since 2009 at the University of Kansas. This only includes journal articles, not books or book chapters.
  • Faculty research in the sciences is often federally funded, so works are often shared in PubMed Central. KU follows publisher policy on whether an article can be shared and in which version. In PubMed, one can search by affiliation and filter by accepted manuscripts, which are the versions most often allowed to be shared. With this feature, they are finding many more accepted manuscripts: Marianne found 900+ accepted manuscripts since 2009. They checked the journals in SHERPA/RoMEO to ensure that this version can be shared in a local IR, then added those manuscripts manually to ScholarWorks. Marianne set up a monthly search update in PubMed Central so that she now gets a list of recently added accepted manuscripts. (Here's the search: (University of Kansas[Affiliation] AND Lawrence[Affiliation]) AND "author manuscript"[Filter]) This results in an emailed list of 3-4 articles per month, which are reviewed to confirm that there is a KU-Lawrence author and then deposited manually.
  • Question: Are authors aware that you are adding these?  Marianne: Generally not. They send an email to departments each year to advise them of actions.  
  • Other sites do not always have the searching/filtering capabilities of PubMed Central or an API that would suit KU's needs. BioMed Central has changed its search mechanism, so it is difficult to find the KU material without paying them. PLOS publishes under a Creative Commons license, so they can use all publications, but it is hard to determine which are from KU-Lawrence authors: a search finds all KU authors, whether from KU-Lawrence or KU Medical Center (KU-MC). They don't deposit KU-MC articles unless there is a KU-Lawrence co-author.
  • Marianne only gets publications for faculty from the KU-Lawrence campus. The open access policy does not cover the KU Medical Center.  
  • APIs are not always useful, especially with commercial vendors, since they aren't always specific enough to pinpoint the articles produced by KU-Lawrence faculty.
  • Since 2012, faculty have been required to list their works in a central database called Professional Records Online (PRO, Digital Measures). Each year the Libraries get an export of the citations added to PRO in the last year, usually about 2500 article citations. Books and book chapters are not included. They run an automated SHERPA/RoMEO check to determine which version they can add, enhance the metadata using Crossref's API, and eliminate duplicates. Articles with publisher policies allowing the published version to be shared are automatically deposited. Departments are notified annually by the Libraries that this is happening.
  • Faculty can upload accepted manuscripts to PRO, but only around 60 per year are uploaded, and these are usually not in a version that can be shared.
  • In 2010, the repository had 4,000 items and now the number is over 20,000.
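
The saved search Marianne described can also be expressed as a request to NCBI's E-utilities `esearch` service. This is a minimal sketch, not KU's actual setup: the notes only mention an emailed search update, and the function name, the `pmc` database choice, and the result limit are assumptions made for illustration. It only builds the request URL and does not call the service.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities "esearch" service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The query quoted in the notes, unchanged.
KU_QUERY = (
    "(University of Kansas[Affiliation] AND Lawrence[Affiliation]) "
    'AND "author manuscript"[Filter]'
)


def build_search_url(term, max_results=100):
    """Return an esearch URL for `term` (hypothetical helper).

    Assumes the search runs against PubMed Central (db=pmc) and that
    JSON output is wanted; adjust both to match local practice.
    """
    params = {
        "db": "pmc",             # assumption: PubMed Central database
        "term": term,            # the search expression, URL-encoded below
        "retmax": max_results,   # maximum number of IDs to return
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url(KU_QUERY, max_results=20))
```

Fetching the resulting URL should return a list of matching record IDs, which staff would still need to review for a KU-Lawrence author before deposit.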

Gail - Virginia Tech

  • Virginia Tech started using Symplectic Elements this past spring. Faculty have the option to deposit articles to the IR, and they upload the articles themselves. The IR received 1,100 deposits in the last year.
  • They do not have an open access policy. A draft policy is before the faculty senate.

Terry - Georgetown

  • Batch processing of ETDs with files from ProQuest.
  • They are loaded directly into the repository.  The program groups them by academic department.  Staff manually initiate ingest into each collection.  They are departmental collections.
  • They have a simple workflow process:
    • They have an ingest collection that cannot be read, but people can deposit items into it.
    • Custom metadata gathering form for that collection.  It has customized license agreements.
    • Two-step review process using 
  • They put academic and digital collections into their IR. They developed a process to add items in a test environment. They create a new collection in production to preserve a handle and put it in preview. They export it as an AIP package and do development in a test environment. Then they upload the features into the production environment and move them out of preview.
  • Marianne - KU has an automated workflow for getting ETDs from ProQuest to ScholarWorks. The ETDs end up in a DSpace workflow and are reviewed and approved for deposit. There is one ScholarWorks collection for theses and another for dissertations.
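
The department grouping step in Georgetown's ETD batch process could look roughly like the sketch below. The record layout (`title`, `department`) and the `Unassigned` fallback are hypothetical; the notes do not describe the format of the ProQuest files or Georgetown's actual tool.

```python
from collections import defaultdict


def group_by_department(etds):
    """Group ETD records by academic department (hypothetical helper).

    Each record is assumed to be a dict with an optional "department"
    key. Records without a department go into an "Unassigned" group so
    that staff can sort them out before ingest.
    """
    groups = defaultdict(list)
    for etd in etds:
        department = (etd.get("department") or "").strip() or "Unassigned"
        groups[department].append(etd)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"title": "Thesis A", "department": "History"},
        {"title": "Thesis B", "department": "Physics"},
        {"title": "Thesis C", "department": "History"},
    ]
    for department, items in group_by_department(sample).items():
        print(department, len(items))
```

Each group would then correspond to one departmental collection, with staff starting the ingest for each collection by hand as described above.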

Question: How are people finding content?

  • Gail - Virginia Tech
    • Manual labor
    • Looking for gray literature to add to the repository.  Checking websites of new collaborations between colleges and institutes to find research products.
    • Plans to involve liaisons who will let each department know of the activities.
    • Faculty also submit items directly. Get a range of material:  digital files, print journals. They also use SHERPA/RoMEO to check policies.

Question: Are people contacting individual faculty members or departments for content?

  • Marianne: They are working with the department head of an engineering unit to add its report series to ScholarWorks. As they come across possible content, they contact departments and request permission to add items to ScholarWorks. They assume the department has the rights unless told otherwise. The decision to share a departmental publication series in ScholarWorks usually involves a discussion at a faculty meeting; since the faculty are usually the authors, this amounts to tacit permission from them, even if the department doesn't have the rights to the publication. Administrators will take content down when requested.
  • There is a separate, more formal, permission form for student work because of concerns about FERPA.   The form is scanned and uploaded to the item record as a License file visible only to administrators.

COAPI (Coalition of Open Access Policy Institutions) discussion

  • Marianne said there is a current discussion about harvesting content (primarily pre-prints that are not peer-reviewed or published) from the arXiv repository for physics scholars. Questions remain about whether pre-prints are versions that should be preserved in institutional repositories.

Question:  Is anyone using the Elsevier/ScienceDirect API for automated ingest?

  • It looks like this is just for metadata, with links to article files on the publisher's website.
  • "Automatic ingestion of metadata and abstracts of all articles and chapters by authors affiliated with your institution." (Elsevier site)
  • It would be great if contracts with Elsevier could be modified so that they would provide repositories with accepted manuscripts for faculty authors at that institution. 

Call for DCAT topics

We invite ideas for topics for August through December.

August: DSpace 7.

Add other ideas to the comments page.


Upcoming DSpace events:

Aug 22-23 North American DSpace User Group Meeting https://www.library.georgetown.edu/node/19724

Terry: Georgetown University is hosting this 1.5-day meeting. It is aimed at repository managers and developers. There are a few spots remaining, and they are accepting topic ideas.

Call Attendees


2 Comments

  1. Here is a link to the ETD upload tools that we have developed at Georgetown: https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/DSpace-Institutional-Repository-Ingest

    If these look useful for your institution, feel free to reach out to me for more information.

  2. I have a suggestion for a DCAT topic for a call sometime this fall:  DSpace File Format Registries and implications for preservation/migration of file formats

    DSpace has a default file format registry, but it's modified by each repository.   I'd like to discuss the following with other repositories:

    • How are repositories determining which file types are "supported," "known," or "unknown"?
    • For those that are "supported," how are the repositories actively migrating file formats for preservation purposes?
    • What are the tools and best practices for this migration?

    MIT has excellent documentation for their format policies on their Policies page: see the Format Support section. That section describes the Unsupported (unknown), Known, and Supported categories in clear, non-jargon terms. What's not clear to me is what tools MIT is using for the Supported file formats to "make usable in the future, using whatever combination of techniques (such as migration, emulation, etc.) is appropriate given the context of need."

    Discussion of how MIT or other repositories are doing this would be helpful.