This is the detailed design document for the harvesting of National Library of Medicine publications.

Overview

NLMJournalFetch ingests data from the NIH EUtils interface. It is a subclass of NIHFetch, whose documentation describes the command line arguments.

Usage

NLMJournalFetch="java $OPTS -Xms$MIN_MEM -Xmx$MAX_MEM -Dharvester-task=$HARVESTER_TASK -Dprocess-task=NLMJournalFetch -cp bin/harvester$VERSION.jar:bin/dependency/* org.vivoweb.harvester.fetch.NLMJournalFetch"

Methods

serializeFetchRequest

  1. Create an EFetchJournalsServiceStub.
  2. Create an EFetchResult from the EFetchJournalsServiceStub, using the EFetchRequest that was passed in.
  3. Use an XMLStreamWriter to create an MTOMAwareXMLSerializer.
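
The XMLStreamWriter part of step 3 can be sketched with the JDK alone. This is only an illustration under stated assumptions, not the harvester's code: the class name, method name, and element names are hypothetical placeholders, and the EFetchJournalsServiceStub, EFetchResult, and MTOMAwareXMLSerializer steps are left out because they depend on the generated web service classes.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StreamWriterSketch {

    /**
     * Opens an XMLStreamWriter over the given Writer and streams a small
     * placeholder document to it. In the real method the writer would be
     * handed to the serializer; the element names here are hypothetical.
     */
    public static void writePlaceholder(Writer out) throws XMLStreamException {
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("placeholderResult");
        writer.writeStartElement("placeholderField");
        writer.writeCharacters("value");
        writer.writeEndElement(); // closes placeholderField
        writer.writeEndElement(); // closes placeholderResult
        writer.flush();
        writer.close();
    }

    public static void main(String[] args) throws XMLStreamException {
        StringWriter out = new StringWriter();
        writePlaceholder(out);
        System.out.println(out);
    }
}
```

Streaming the result through a writer avoids building the whole response in memory before it is written out.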

sanitizeXML

  1. Take in a String.
  2. Replace invalid characters.
  3. Write out the sanitized XML.
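
The three steps above can be sketched as follows. This is a minimal sketch under assumptions, not the harvester's actual implementation: it treats "invalid" as any code point outside the XML 1.0 Char production and replaces each one with a single space. The class name, the helper isValidXmlChar, and the Writer parameter are hypothetical.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class XmlSanitizerSketch {

    /** True if the code point is permitted by the XML 1.0 Char production. */
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Step 1: take in a String. Step 2: replace each invalid code point
     * with a space (assumed replacement). Step 3: write the result out.
     */
    public static void sanitizeXML(String xml, Writer out) throws IOException {
        StringBuilder clean = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); ) {
            int cp = xml.codePointAt(i);
            clean.appendCodePoint(isValidXmlChar(cp) ? cp : ' ');
            i += Character.charCount(cp); // advance past surrogate pairs correctly
        }
        out.write(clean.toString());
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        // 0x0B (vertical tab) is not a legal XML 1.0 character.
        sanitizeXML("<Title>Bad" + (char) 0x0B + "char</Title>", out);
        System.out.println(out); // prints <Title>Bad char</Title>
    }
}
```

Iterating by code point rather than by char keeps valid supplementary characters intact while still catching unpaired surrogates.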