...

  1. "Sufficient copyright licenses to enable permanent archiving, access, and reuse of publications"
    • General Comments
      • Many repository platforms do have an option to require a "deposit license" which often covers these scenarios. However, the text of the "deposit license" is decided by the institution. There may need to be "recommended copyright license language" provided by a central entity, to help ensure locally created licenses are "SHARE-compliant".
      • Does this need to be machine actionable / verifiable?

General Repository Functions

As described in the "SHARE workflow" paragraphs, a repository would need to support the following functions:

  1. Be able to accept XML versions of manuscripts from Journal publishers
    • "Journal submits XML version of final peer reviewed manuscript to the PI's designated repository"

    • General Comments
      • Who defines this XML format?  It would need to be defined by a central entity.
      • Is there a reason why XML is chosen as the transmission format instead of a protocol like SWORD (with a common packaging format)? 
        • Because raw XML is not easy for people to read, this implies we'd also need a human-readable format (PDF or similar), which is why SWORD with a common packaging format may be useful here.
  2. Make article available to search engines
    • Google, Google Scholar, Yahoo, Bing, etc.
  3. Must be able to link to publisher's website
    • General Comments
      • How is the publisher's website link obtained by the repository? Is there a way to "look it up" via a central service, or would it be a required metadata field?  If it is the latter, what happens if the publisher's website changes its URL?
  4. Support embargo
    • link to publisher's website until embargo period expires
      • See comments above about how we obtain the publisher's website link.
    • make full-text of article available post-embargo
  5. Certify compliance with agencies
    • Automatically notify "both the funding agency and the PI's institutional research office that a deposit has occurred"
    • General Comments 
      • How would repositories know where to send notifications?  What type of notification?
      • Is this a "push" notification (e.g. automated email to agency), or is it more of a "pull" notification (where an agency could query repositories for recent deposits)?
        • If the agency just needs to query the repository for recent deposits, perhaps we could use OAI-PMH. But, at the same time, the funding agency couldn't be expected to query hundreds of repositories for this data. There would need to be a centralized location that could be queried.
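
To make the "pull" option more concrete, here is a minimal sketch of the request an agency (or a central harvester acting for it) might send to one repository over OAI-PMH. The `ListRecords` verb and the `metadataPrefix` and `from` arguments are part of OAI-PMH itself; the function name and the repository URL are hypothetical, and nothing in the SHARE proposal says this is the intended mechanism.

```python
from datetime import date
from urllib.parse import urlencode


def recent_deposits_url(base_url: str, since: date, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request for records stamped on or after `since`."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": since.isoformat(),  # day granularity, YYYY-MM-DD
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only.
print(recent_deposits_url("https://repository.example.edu/oai/request", date(2013, 7, 1)))
```

Note that `from` filters on the record datestamp, which also changes when a record is modified, so a harvester would see edits as well as new deposits. The scaling concern above still applies: someone has to run this request against every participating repository.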

Requisite Conditions

As noted in the proposal, the "following precursors are required immediately to implement SHARE as a solution to the OSTP memorandum.":

  1. Principal Investigator (PI) Identifier (recommended to use either ORCID or ISNI)
    • General Comments:
      • Is capturing this identifier as a simple metadata field "good enough"?
      • Are researchers expected to just enter their own ORCID?  Or do we need some sort of more complex "lookup" for each author entered?
  2. Award Identification Number - assigned by Federal agencies
  3. Copyright License Terms - "requires a standardized and coded expression ... for machine processing"
    • General Comments:
      • How would this be "coded"?  We'd need a centrally defined "standard" representation that all repositories can attempt to implement.
  4. Repository Designation ID Number - "to identify the repository access location"
    • General Comments:
      • Who defines this "number"?  Could this simply be the repository URL, or a persistent identifier which resolves to the repository URL?
  5. Preservation Rights - "required to be coded into the metadata residing with the record"
    • General Comments:
      • How would this be "coded"?  We'd need a centrally defined "standard" representation that all repositories can attempt to implement.
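
On the question of whether a simple metadata field is "good enough" for the PI identifier: even without a lookup service, a repository could at least reject mistyped ORCIDs, because the last character of an ORCID iD is a check digit (ISO 7064 MOD 11-2). The sketch below assumes the hyphenated 16-character form; the function names are hypothetical, and a passing check only shows the value is well formed, not that it belongs to the author.

```python
import re

# Hyphenated form: three groups of four digits, then three digits and a check character.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Return True if the value has the ORCID shape and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


print(is_well_formed_orcid("0000-0002-1825-0097"))
print(is_well_formed_orcid("0000-0002-1825-0098"))
```

Confirming that an identifier actually belongs to a given researcher would still need the more complex "lookup" raised above.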

Phase ONE (12-18 months)

Additional requirements for Phase One, after which "the SHARE system will be available for both deposit and access".

  1. PI Identifier  (Also mentioned in "Requisite Conditions")
    • See comments under "Requisite Conditions" above
  2. Award Number (Also mentioned in "Requisite Conditions")
  3. Publication ID - "unique, persistent identifier to reference the journal article of the publication"
  4. Data Set ID - "resolvable, persistent identifier to location of stored data or data sets that are linked to the published article"
    • General Comments:
      • Where are these data sets expected to reside? Is the repository capturing the dataset and assigning the identifier, or is it assigned by an external system?
  5. Copyright License Conditions (Also mentioned in "Requisite Conditions")
    • includes embargo information
    • See comments under "Requisite Conditions" above
  6. Sponsoring/Funding Agency Name - "Link to agency providing funding so that reports can be automatically returned"
    • General Comments:
      • If this is primarily used for reporting, it's likely we also need to capture an email address or a URL / identifier.  It depends on the decisions around reporting.
  7. Reporting - "Creates a feedback loop to the federal agency and the PI's research office providing tracking of publications resulting from awards funded by the agency"
    • General Comments:
      • What type(s) of reports are expected?  How would these be made available to the agency / research office?
      • Is this a "pull" (agency/research office can visit the repository and view/request necessary reports), or a "push" (reports are automatically sent from the repository to the agency / research office by some means)?
        • As far as repositories are concerned, obviously a "pull" is easier. A "push" would require the repository to know where to send such reports (up-to-date email addresses or similar)
  8. Core Usage Statistics - "Reports to authors (and agencies, if desired) include statistical data on usage activity and downloads of their publications."
    • General Comments:
      • What type(s) of statistical reports are expected? Would there need to be some "minimal required statistics" to capture/report? How would the reports be made available to the authors and agencies?
      • Is this a "pull" (authors/agencies can visit the repository and view/request necessary reports), or a "push" (reports are automatically sent from the repository to the author / agency by some means)?
        • As far as repositories are concerned, obviously a "pull" is easier. A "push" would require the repository to know where to send such reports (up-to-date email addresses or similar)
  9. Metadata Exposed to Search Engines
  10. SWORD
  11. OpenURL
  12. Some connections to Digital Preservation Network (DPN)? - "All phases connect with and take advantage of the Digital Preservation Network (DPN)"
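
As a discussion aid, the Phase One fields could be pictured as one record per deposit, with the embargo date deciding whether a reader is sent to the publisher's website or to the repository's full text. This is only a sketch: every field name and sample value is hypothetical, the proposal does not define a schema, and the rule that full text opens on the embargo end date itself is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DepositRecord:
    """Hypothetical container for the Phase One metadata items."""
    pi_identifier: str    # e.g. an ORCID or ISNI
    award_number: str     # assigned by the federal agency
    publication_id: str   # persistent identifier for the article
    funder_name: str
    license_code: str     # stand-in for the "coded" license terms
    publisher_url: str
    full_text_url: str
    embargo_end: Optional[date] = None  # None means no embargo
    dataset_ids: List[str] = field(default_factory=list)


def access_url(record: DepositRecord, today: date) -> str:
    """Link to the publisher during the embargo, to the full text afterwards."""
    if record.embargo_end is not None and today < record.embargo_end:
        return record.publisher_url
    return record.full_text_url


# All values below are made up for illustration.
record = DepositRecord(
    pi_identifier="0000-0002-1825-0097",
    award_number="EXAMPLE-0000000",
    publication_id="doi:10.1234/example",
    funder_name="Example Agency",
    license_code="example-license",
    publisher_url="https://publisher.example.com/article/1",
    full_text_url="https://repository.example.edu/item/1/fulltext",
    embargo_end=date(2014, 1, 1),
)
print(access_url(record, date(2013, 7, 1)))
print(access_url(record, date(2014, 1, 1)))
```

The open questions above (how the publisher link is obtained, how license terms are coded, where data sets live) all show up here as plain string fields, which is exactly where a centrally defined standard would be needed.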

...