This meeting is a hybrid teleconference and IRC chat. Anyone is welcome to join... here's the info:
Any pending issues from the last two weeks?
Webrecorder Integration with Fedora:
Current proof-of-concept: http://fedora.webrecorder.net/
Webrecorder writing WARCs, reading from Fedora (no data model, just flat list so far)
Using Fedora’s HTTP range request support
Goals:
Create PCDM data model for web archives
Store WARCs, as well as other web archiving objects created by Webrecorder
4.7 LTS?
Next release: 5.0.0
Status of "in-flight" tickets
Please squash a bug!
Tickets resolved this week:
Tickets created this week:
1. Any pending issues from the last two weeks?
There is a PR for the import/export tool.
2. Webrecorder Integration with Fedora
The Webrecorder project (Ilya and others) has been looking into using Fedora as a backend. Webrecorder is an interactive web archiving tool; anyone can use it to record sites. They'd like to add a preservation backend, and in the future they'd like to have a standardized way to preserve web recordings. This is an area that's currently lacking as far as web archives are concerned: there is currently no tool that provides both preservation and access, and the proposed integration addresses that. The current prototype, linked in the schedule, was a very quick weekend project intended as a proof of concept (but, as Andrew noted, it works beautifully).
The Webrecorder folks are interested in a discussion on the data model for web archives. The PCDM discussion group (pcdm@googlegroups.com) would be the best place to ask these questions; its members are interested in these sorts of discussions, and that would be the best way to move the conversation forward. The Webrecorder folks have started brainstorming about the data modeling in a Google doc: https://docs.google.com/document/d/1RiZnX4g3u1ydwX9odu5Y1s2ajquhIkqYema5Tzp5UOQ/edit?ts=596fe0a4.
Web archives are large objects, so they're interested in learning how well Fedora handles this type of material. Andrew reports that Fedora has been tested (by Esme) with a file of up to a TB and that the tests were successful.
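For files of this size, the HTTP range request support listed in the agenda is what lets a reader fetch a single WARC record instead of the whole file. A minimal sketch of building such a request header and reading back the response header, assuming the record's byte offset and length are already known (for example from an index). The function names are hypothetical and are not taken from Webrecorder or Fedora:

```python
import re


def range_header(offset, length):
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends, so the last byte is offset + length - 1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def parse_content_range(value):
    """Parse a 'bytes first-last/total' Content-Range value from a 206 response.

    Returns (first, last, total); total is None when the server sends '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)
```

The header dictionary can be passed to any HTTP client; a server that honors the range answers with status 206 and a Content-Range header describing the bytes it returned.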
State of S3 backend storage and clustering? Clustering, as far as Andrew is aware, has not been exercised very much (or at all). It's more or less a feature Fedora gets from Modeshape for free. There have been issues at the Modeshape level that have driven their recent work. So there is clustering in Fedora, but it comes with caveats. As for S3, the most recent release has official support for S3 as a backend. Danny Bernstein has done some testing of the performance of S3 as a backend; its performance is more or less in line with a local installation. S3 may not be deployed anywhere in production yet since the support is very new. It is going into the Hyku deployment, though, so it will be exercised more. Several Samvera people report that there is also S3 integration at the Samvera level (though this is different from Fedora's integration).
S3 support would be a higher priority for the Webrecorder group because they currently store everything on S3. Fedora's S3 support is largely undocumented at this point, so having the Webrecorder folks work through it, with some back and forth between the Fedora and Webrecorder folks, might be a good way to generate some documentation.
Related: there is work going on around specifying the formal API of Fedora, which will probably be slightly different from the one Webrecorder is currently using. Just as a note.
3. 4.7.4 release
The 4.7.4 release is out now. Fedora will be targeting a 5.0 release next, which will have some breaking changes as the Fedora specification is finalized. The idea of Fedora having a long term support (LTS) version that the Fedora community would support for a period of years was discussed. This would mean that patches would continue to be applied to 4.7.x. Fedora 5 is only a notional thing at this point and it will be quite some time before people migrate to it. The discussion concluded that having an LTS release is a good idea. What types of fixes can folks expect? Security fixes are definitely in. The group ought to articulate what will and won't be done. Another example is the project's dependencies (Java itself and otherwise): should underlying versions be upgraded over time? Java, definitely. Maven dependencies might be upgraded on a case-by-case basis(?)
4. Performance lessons from PREMIS events (Ben Pennell)
Ben has tested different ways of storing PREMIS events: as objects vs. as RDF logs (serialized RDF in a binary). There are some graphs in the Google Groups message. Storing events as objects, of course, results in more objects in the repository, and Ben did find performance implications for this. At around 50k objects, creating events as resources/objects was getting significantly slower. The conclusion of Ben and his group was that it didn't seem like a good idea to keep events as objects in Fedora. As an alternative, each object in the repository would have an RDF log where its PREMIS events would be stored.
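The RDF-log alternative can be sketched roughly as follows: each event is serialized to a few N-Triples lines and appended to a single per-object log, so the repository holds one binary per object rather than one resource per event. This is only an illustration, not Ben's implementation. The PREMIS namespace and property names, the event URI scheme, and both function names are assumptions:

```python
from datetime import timezone

# Assumed vocabulary (PREMIS OWL ontology v1); replace with whatever the repository uses.
PREMIS = "http://www.loc.gov/premis/rdf/v1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def event_triples(event_uri, object_uri, event_type_uri, when):
    """Serialize one event as N-Triples lines (hypothetical helper).

    `when` must be a timezone-aware datetime; it is normalized to UTC.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        f"<{event_uri}> <{RDF_TYPE}> <{PREMIS}Event> .",
        f"<{event_uri}> <{PREMIS}hasEventType> <{event_type_uri}> .",
        f"<{event_uri}> <{PREMIS}hasEventRelatedObject> <{object_uri}> .",
        f'<{event_uri}> <{PREMIS}hasEventDateTime> "{stamp}"^^<{XSD_DATETIME}> .',
    ]


def append_event(log_text, triples):
    """Return the log text with one event's triples appended."""
    return log_text + "\n".join(triples) + "\n"
```

In this sketch the log is plain text that would be written back to the object's binary after each append. The resource count stays flat as events accumulate, at the cost that individual events are no longer separately addressable resources.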
5. Volunteer for next week's tech meeting (8/24)?
Someone willing to host next week's call? Andrew will be at a Fedora users group meeting in Texas. Aaron volunteered.