...

Note: these discussions primarily reflect the approaches and workflow that have been used at Cornell. Other sites use other approaches; please update or annotate as appropriate to point out different requirements and/or solutions.

Introduction

Data ingest is a very broad term: it refers to the first-time loading of data, but it must also encompass processes that correct errors and reflect additions and deletions. Data are rarely static, and a general model for data ingest needs to take into account where the data are managed and where the resources to maintain them can be found. Data quality can be addressed at five different points in the workflow: before the data leave the source system (whatever that may be), as a data file before it is brought into VIVO, during the data ingest processes, after the data have been loaded into VIVO, and finally in a reporting phase back to the source.
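As an illustration of the second of those points (checking a data file before it is brought into VIVO), the following is a minimal sketch only. It assumes the file is a CSV export; the column names (person_id, name, email) and the function name find_problems are hypothetical and are not part of VIVO or of any Cornell tooling.

```python
import csv
import io

# Hypothetical column names; a real export will have its own.
REQUIRED = ("person_id", "name", "email")


def find_problems(csv_text):
    """Return a list of (line_number, message) for rows that fail basic checks."""
    problems = []
    seen_ids = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header occupies line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for column in REQUIRED:
            if not (row.get(column) or "").strip():
                problems.append((line_number, f"missing {column}"))
        person_id = (row.get("person_id") or "").strip()
        if person_id:
            if person_id in seen_ids:
                problems.append((line_number, f"duplicate person_id {person_id}"))
            seen_ids.add(person_id)
    return problems


# Made-up sample data, for illustration only.
SAMPLE = """person_id,name,email
p1,Ada Lovelace,ada@example.edu
p2,,grace@example.edu
p1,Ada Lovelace,ada@example.edu
"""

for line_number, message in find_problems(SAMPLE):
    print(f"line {line_number}: {message}")
```

A list of problems in this form could also feed the reporting phase back to the source, since each entry points at a specific row of the file.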

Alternative approaches

Some VIVO sites do not allow manual editing by users. Instead, VIVO reflects data from one or more other systems of record and serves as a point of integration that syndicates the integrated data to other websites or reporting tools. This can simplify data management once the data are in VIVO, but it still very likely requires data alignment unless all the sources of data are internally consistent and share common unique identifiers.
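When sources do share a common unique identifier, alignment can be as simple as a keyed lookup. The sketch below is one possible way to do that, not a description of any site's actual process. The identifier field (local_id), the record layouts, and the function name align_by_identifier are all assumptions made for illustration.

```python
def align_by_identifier(primary, secondary, key="local_id"):
    """Pair records from two sources on a shared identifier.

    Returns (matched, unmatched): merged records where the identifier was
    found in both sources, and primary records with no counterpart.
    """
    # Index the secondary source by identifier, skipping records without one.
    by_key = {rec[key]: rec for rec in secondary if rec.get(key)}
    matched, unmatched = [], []
    for rec in primary:
        other = by_key.get(rec.get(key))
        if other is None:
            unmatched.append(rec)
        else:
            # On conflicting fields, the primary source wins.
            matched.append({**other, **rec})
    return matched, unmatched


# Made-up records standing in for two systems of record.
hr_records = [
    {"local_id": "a1", "name": "A. Smith"},
    {"local_id": "b2", "name": "B. Jones"},
]
grant_records = [
    {"local_id": "a1", "grant": "G-100"},
]

matched, unmatched = align_by_identifier(hr_records, grant_records)
print(matched)
print(unmatched)
```

Records that end up in the unmatched list are the ones that need manual review or some fuzzier matching on names, which is where most of the alignment effort tends to go when identifiers are missing or inconsistent.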

...