We will use the international conference call dial-in. Please follow the directions below.
In preparation for the call, you could do the following:
In the DSpace 6.0 release the performance-enhancing efforts were not entirely successful. However, these problems should be fixed in DSpace 6.1, making DSpace 6 in general terms more performant than DSpace 5.
To test this statement, it would be good to set up two identical server environments, one running DSpace 5 and the other DSpace 6. If both repositories are populated with exactly the same content, we can make an objective comparison of the performance of DSpace 5 and 6.
In DSpace 6.0 JSPUI, a repository with many communities and collections can run into a performance issue. In such a repository, the collection list takes a long time to load during the collection selection step of the item submission process. This issue is currently under investigation.
During the call some other, related issues were reported. For example, one participant saw performance decrease when upgrading a repository with many communities and collections to newer DSpace versions. This attendee also noticed performance issues when indexing repositories with many items.
The fact that these issues were not detected during the testing phase of DSpace 6.0 reflects a more general issue with DSpace performance testing. This testing is currently done on the DuraSpace Demo repository (demo.dspace.org). However, this repository is usually populated with only a limited number of communities, collections, and items. At this point we are not testing DSpace's performance on large repositories. It would be good to set up such a testing environment for future releases.
One popular proprietary tool for server monitoring is New Relic. It can detect significant changes in the use of resources and send alerts when this happens. It also lets you know at which time an issue occurs. New Relic is also capable of pinpointing lines of code which may have caused the performance issue.
A low-tech way of doing a basic test of your repository's performance is to use the in-browser developer tools that are included in many modern browsers. In most cases you can access these tools by right-clicking in your browser and selecting an option such as 'inspect' or 'developer tools', which should open a pane at the bottom of your browser screen. This pane will likely have a network tab, in which you can monitor the loading times of pages in DSpace while you are testing features. This will provide you with hard numbers you can use to compare your performance over time.
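The same kind of hard numbers can also be collected from a script, so that runs are repeatable. The following is a minimal sketch using only the Python standard library. It times the download of a single URL, so unlike the browser's network tab it does not include stylesheets, scripts, or images. The function names are illustrative and not part of DSpace.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=30):
    """Return the seconds taken to download one URL completely."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # include the transfer time of the body, not just the headers
    return time.perf_counter() - start


def summarize(timings):
    """Reduce a list of timings (in seconds) to a few comparable numbers."""
    ordered = sorted(timings)
    return {
        "runs": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


def measure(url, runs=5, fetch=time_request):
    """Time the same page several times and summarize the results."""
    return summarize([fetch(url) for _ in range(runs)])
```

Calling `measure` with the address of a page in your own repository, for example the community list, before and after a configuration change gives two summaries that can be compared directly.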
There are several configurations which may impact your repository's performance.
One Tomcat configuration setting you can use to increase performance is the crawler session manager, which can restrict the number of sessions created for a crawler user agent. If bot traffic causes performance issues, limiting the maximum number of sessions for those bots may help.
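To illustrate why this helps, here is a small Python sketch of the idea behind session reuse for crawlers. It is a model for explanation only, not Tomcat's implementation: the user agent pattern, the class name, and the choice to key sessions on the client IP address are assumptions made for this example. Crawlers usually do not send back session cookies, so without such a measure every request they make creates a new session.

```python
import re
import uuid

# Assumed pattern for illustration; tune it to the bots seen in your own logs.
CRAWLER_PATTERN = re.compile(r".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*")


class SessionRegistry:
    """Models a server handing out sessions to clients that send no cookies."""

    def __init__(self):
        self.sessions = set()
        self.crawler_session_by_ip = {}

    def session_for(self, user_agent, client_ip):
        """Return a session id, reusing one session per crawler IP address."""
        is_crawler = CRAWLER_PATTERN.fullmatch(user_agent or "") is not None
        if is_crawler and client_ip in self.crawler_session_by_ip:
            return self.crawler_session_by_ip[client_ip]
        # Any other cookieless request gets a fresh session.
        session_id = uuid.uuid4().hex
        self.sessions.add(session_id)
        if is_crawler:
            self.crawler_session_by_ip[client_ip] = session_id
        return session_id
```

In this model a crawler that sends a thousand requests from one address occupies a single session instead of a thousand, which is the memory saving the setting aims for.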
The standard PostgreSQL settings are not ideal for repositories with heavy traffic. For these repositories it is better to increase the maximum number of database connections.
It was also unclear during the call why the default PostgreSQL settings allow for an unlimited number of idle connections.
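As a rough aid for choosing a connection limit, the sketch below adds up the connections a set of web applications could open at the same time. It assumes that each deployed web application keeps its own connection pool; the application names, pool sizes, and the reserve for command-line jobs are hypothetical numbers to be replaced with your own.

```python
def required_connections(pool_sizes, reserved=10):
    """Estimate the database connection limit needed for the given pools.

    pool_sizes maps an application name to the maximum size of its pool.
    reserved is headroom for command-line jobs and administrative sessions.
    """
    return sum(pool_sizes.values()) + reserved


# Hypothetical deployment: four web applications, each with a pool of 30.
pools = {"xmlui": 30, "oai": 30, "rest": 30, "sword": 30}
print(required_connections(pools))  # prints 130
```

With these example numbers the estimate exceeds PostgreSQL's default `max_connections` of 100, which is the kind of situation in which raising the limit or shrinking the pools is worth considering.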
Solr is memory intensive, and runs alongside DSpace in the Tomcat application server. This means it has to share the available memory with DSpace.
As Solr records all DSpace usage events (item page views, bitstream downloads, search queries), its memory usage is related to the usage of the repository. Heavily used repositories may therefore require more memory for Solr.
One way of limiting the memory usage of Solr is not to write any robot traffic to the Solr core.
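The filtering idea can be sketched in a few lines of Python. This is an illustration of the principle, not DSpace's own robot detection: the pattern, the event layout, and the decision to treat a missing user agent as a robot are assumptions made for this example.

```python
import re

# Assumed keywords for illustration; real robot lists are far longer.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def is_robot(user_agent):
    """Treat a missing user agent, or one matching the pattern, as a robot."""
    return not user_agent or BOT_PATTERN.search(user_agent) is not None


def events_to_record(events):
    """Keep only the usage events that do not come from robots."""
    return [event for event in events if not is_robot(event.get("user_agent"))]
```

Every event dropped this way is one document fewer in the statistics core, at the cost of losing the ability to report on robot traffic later.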
One tool which can be used for load testing is loadimpact.com; the free tier should already suffice for most repositories. Be cautious when using this tool, as increasing the load on your DSpace may eventually lead to a failure.
Another tool used by a call attendee is Apache JMeter (http://jmeter.apache.org/). This tool is free and has the capability of capturing browser settings.
Fixes to the codebase can be contributed just like any other code fix. However, there seems to be a need to centralize more information regarding environment-specific optimizations: