This page collects materials for an environmental scan of the literature on software upgrades and migrations, as well as planned or recommended Fedora 3.x to Fedora 4.x upgrade projects.

Sources

Summary

Repository upgrades and migrations are quite common, and the literature covers several important aspects of this process: motivations for undertaking a migration, the difficulty of migrations, the possible benefits of a migration, and advice for those looking to undertake a migration in the future.

A common motivation for repository migrations is the cost of a commercially licensed product. Gilbert and Mobley faced a higher CONTENTdm license cost after reaching the item limit of their current tier, and Stein and Thompson, drawing on survey data, cited license and maintenance fees as one of the main drivers of repository migrations. Issues with the commercial platform itself, from performance and scale limitations (Neatrour et al., Witkowski et al.) to a lack of flexibility with regard to file and metadata formats (Gilbert and Mobley, Wu et al.), were also key motivators. Finally, better support for digital preservation (Stein and Thompson, Berghaus et al., Fallaw et al.) and linked data (Wu et al., Stein and Thompson) rounded out the top motivators in the literature.

Many factors make migrations difficult, but one problem category dominates the literature: metadata. Van Tuyl et al. cite metadata remediation as the biggest time sink during their migration project, and many others (Bridge2Hyku Team, Gilbert and Mobley, Neatrour et al.) present case studies that involve significant time spent on metadata normalization, deduplication, and remediation. This speaks to a related difficulty often cited in the literature: inconsistent or “messy” source data. Mapping metadata from one repository system to another would be much simpler if legacy systems did not so often have metadata quality problems in the form of custom local fields, duplicate fields, and misspelled entries.
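To illustrate the kind of cleanup these case studies describe, here is a minimal Python sketch that handles the three problems named above: custom local fields, duplicate fields, and misspelled entries. It is not drawn from any of the cited projects. The field names, the FIELD_MAP crosswalk, the SUBJECT_VOCAB list, and the 0.8 similarity cutoff are all hypothetical, and a real migration would need a much richer crosswalk and human review of every automated correction.

```python
from difflib import get_close_matches

# Hypothetical crosswalk from legacy (including local) field names to target fields.
FIELD_MAP = {
    "Title": "title",
    "title_local": "title",
    "Creator": "creator",
    "Subject": "subject",
    "subject_local": "subject",
}

# Hypothetical controlled vocabulary for the "subject" field.
SUBJECT_VOCAB = ["Photographs", "Maps", "Manuscripts"]


def normalize_record(record):
    """Return (normalized record, list of issues flagged for manual review)."""
    normalized = {}
    issues = []
    for field, values in record.items():
        target = FIELD_MAP.get(field)
        if target is None:
            # Custom local field with no known mapping: leave it for a person.
            issues.append(f"unmapped field: {field}")
            continue
        bucket = normalized.setdefault(target, [])
        for value in values:
            value = " ".join(value.split())  # collapse stray whitespace
            if target == "subject" and value not in SUBJECT_VOCAB:
                # Try to repair likely misspellings against the vocabulary.
                match = get_close_matches(value, SUBJECT_VOCAB, n=1, cutoff=0.8)
                if match:
                    issues.append(f"corrected subject: {value!r} -> {match[0]!r}")
                    value = match[0]
                else:
                    issues.append(f"unknown subject: {value!r}")
            # Merging duplicate fields: keep each distinct value once.
            if value and value not in bucket:
                bucket.append(value)
    return normalized, issues
```

Calling normalize_record on a dictionary that maps legacy field names to lists of values returns the merged record together with a list of issues. That split mirrors the mix of scripted and manual work reported in the literature: the script handles the mechanical cases and surfaces the rest for a person to resolve.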

There is a great deal of migration advice to be found in the literature, based primarily on lessons learned from migration projects. Tripp summarizes much of this advice into four categories: planning, metadata normalization, migration, and verification. Each of these categories is represented in the rest of the literature; Nowak et al. undertook a great deal of planning for their migration project, while Simic and Seymore invested a lot of time in large-scale metadata normalization prior to migration. The migration phase itself was often accomplished with a combination of scripts and manual intervention, and the same is true of the verification step.

Common Themes