The performance of the bridge application depends on how well it can utilize the computing capacity provided by the system on which it is deployed.

A major factor is how well the CPU cores of the machine (or VM) are utilized. If the bridge application is configured to use too few threads, fewer transfers run concurrently than the machine could support. If it is configured to use too many threads, the CPUs spend a significant amount of time context switching so that each thread gets time to execute, and overall throughput suffers. The charts in this document help to illustrate these trends, based on the processing of checksum values.

So what is the right number of threads?

This depends on the number of available CPU cores. The goal of this exercise is to configure the bridge application such that the normalized load average of the machine is just above 1. A "normalized" load average is the load average reported by "top" or "htop" divided by the number of CPU cores available. Having a normalized load average just above 1 means that all the cores are being utilized, but there is not a significant amount of context switching occurring.
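As a rough illustration of the calculation described above, the following Python sketch divides a load average by the core count. The function names are hypothetical and are not part of the bridge application. It assumes a Unix-like host, since os.getloadavg() is not available on Windows, and note that os.cpu_count() reports the logical CPUs of the machine, which may differ from the cores actually allotted to a container.

```python
import os


def normalized_load(load_average: float, cpu_cores: int) -> float:
    """Divide a raw load average (as shown by top/htop) by the CPU core count."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load_average / cpu_cores


def current_normalized_load() -> float:
    """Return the 5-minute normalized load average of this machine (Unix only)."""
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages.
    _one_min, five_min, _fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return normalized_load(five_min, cores)
```

For example, a reported load average of 26.4 on a 24-core VM gives a normalized value of 1.1, which is in the "just above 1" range that this page aims for.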

How do I configure the bridge application to meet this goal?

There are two primary points of configuration that matter. The first is the bridge application's context.xml file. In the bridge application WAR file, this file is found under WEB-INF/config/. In this file are the definitions for thread pools used by the application.

The following settings were used with the bridge application running on a VM with 24 CPU cores:

  <bean id="jobTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="maxPoolSize" value="7"/>
    <property name="corePoolSize" value="7"/>
    <property name="queueCapacity" value="1000"/>
  </bean>

  <bean id="itemTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="maxPoolSize" value="200"/>
    <property name="corePoolSize" value="20"/>
    <property name="queueCapacity" value="10"/>
  </bean>

The jobTaskExecutor is the thread pool for snapshots, so its size determines how many jobs can be processed concurrently. We have found that making this a fixed-size pool works best (meaning the maxPoolSize and corePoolSize settings match). Setting the queueCapacity high allows a large number of snapshots to wait in the queue while others are processed. Making this value too low means that snapshots over the limit will be rejected, which is not what you want.

You will be changing the settings of the jobTaskExecutor based on the CPU cores available.

The itemTaskExecutor is the thread pool for individual files within a snapshot. Allowing this to be a smaller core size with a lot of room to grow works well. There is generally no need to change these settings.
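To make the interaction of corePoolSize, maxPoolSize, and queueCapacity more concrete, here is a small Python model of the admission order used by the java.util.concurrent.ThreadPoolExecutor that backs Spring's ThreadPoolTaskExecutor: core threads are started first, then tasks are queued, and threads beyond the core size are created only once the queue is full. The admit function is a hypothetical illustration, not code from the bridge application, and it assumes the default behavior of rejecting a task when both the queue and the pool are full.

```python
def admit(running_threads: int, queued_tasks: int,
          core_pool_size: int, max_pool_size: int, queue_capacity: int) -> str:
    """Model what happens to a newly submitted task."""
    if running_threads < core_pool_size:
        return "start core thread"   # pool is still below its core size
    if queued_tasks < queue_capacity:
        return "queue"               # core threads busy, queue has room
    if running_threads < max_pool_size:
        return "start extra thread"  # queue is full, pool may still grow
    return "reject"                  # queue full and pool at maximum size


# jobTaskExecutor (core 7, max 7, queue 1000): an 8th snapshot waits in the queue.
print(admit(7, 0, 7, 7, 1000))
# itemTaskExecutor (core 20, max 200, queue 10): the pool grows once 10 tasks are queued.
print(admit(20, 10, 20, 200, 10))
```

Under this model, the fixed-size jobTaskExecutor only rejects a snapshot when 1000 are already waiting, while the small queue of the itemTaskExecutor is what lets it grow from 20 toward 200 threads under load.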

The second primary setting is a Java system property passed to the bridge application's JVM on startup. Again, this setting was used on a VM with 24 CPU cores:

-Dduracloud.bridge.threads-per-job=5

This setting defines the number of threads that can be allocated for transferring files for each snapshot that is in progress.

What setting values should I use?

Together, the size of the jobTaskExecutor pool and the threads-per-job setting define your CPU utilization. Exactly how this translates to normalized load average can only be determined by experimentation.

The production DuraCloud Vault bridge application during a period of high activity in Jan-Mar 2018 landed on these settings, which provided the preferred load average:

CPU Cores: 24
jobTaskExecutor size (maxPoolSize and corePoolSize): 7
threads-per-job: 5

The total number of potential threads with these settings is (7 * 5) = 35.

These settings were used with 24 CPU cores on the bridge VM. This suggests that a ratio of potential threads to CPU cores of (35 / 24) = approximately 1.46 is about right.


Attempting to achieve the same ratio on a machine with 8 CPU cores might suggest settings like:

CPU Cores: 8
jobTaskExecutor size (maxPoolSize and corePoolSize): 4
threads-per-job: 3

Total potential threads: (4 * 3) = 12. Ratio of potential threads to CPU cores: (12 / 8) = 1.5.
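The same arithmetic can be applied to other machine sizes. The Python sketch below lists combinations of jobTaskExecutor size and threads-per-job whose ratio of potential threads to CPU cores falls near the 35 / 24 ratio observed in production. The function names, the tolerance, and the upper bound on each setting are assumptions chosen for illustration. The results are only starting points, since the actual normalized load average still has to be confirmed by experimentation.

```python
def thread_ratio(job_pool_size: int, threads_per_job: int, cpu_cores: int) -> float:
    """Ratio of total potential threads to CPU cores."""
    return (job_pool_size * threads_per_job) / cpu_cores


def candidate_settings(cpu_cores: int, target_ratio: float = 35 / 24,
                       tolerance: float = 0.1, max_value: int = 16):
    """List (job_pool_size, threads_per_job, ratio) tuples near the target ratio."""
    results = []
    for jobs in range(1, max_value + 1):
        for per_job in range(1, max_value + 1):
            ratio = thread_ratio(jobs, per_job, cpu_cores)
            if abs(ratio - target_ratio) <= tolerance:
                results.append((jobs, per_job, ratio))
    # Closest match to the target ratio first.
    results.sort(key=lambda r: abs(r[2] - target_ratio))
    return results


for jobs, per_job, ratio in candidate_settings(8):
    print(f"jobTaskExecutor size={jobs}, threads-per-job={per_job}, ratio={ratio:.2f}")
```

For 8 cores, the candidates include the 4 and 3 combination shown above. Which candidate to prefer depends on whether you expect many small snapshots (a larger job pool) or a few large ones (more threads per job).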
