Alexandru Ersenie (Moderator)

Number of JVM Instances / Server - relation to parallel garbage collection

Hi everybody,
I am currently researching how many JVM instances we can operate on a single application server. The decision should of course be based on resource availability and requirements, so let's take a look at the problem.
Suppose we had the following hardware and JVM configuration:
- 2 sockets, six-core CPUs (12 cores in total, 6 cores per CPU)
- 128 GB RAM
- 8 GB maximum heap size per JVM
- 4 GB maximum Eden size per JVM
- Garbage collection strategy: CMS with ParNew (parallel young-generation collection) enabled

1. It is clear that we have enough memory for about 10 JVM instances (let's leave something for the system too). That would bring us to 80 GB of memory reserved for heap: check.
2. CMS also enables parallel young-generation collection (ParNew) by default. The JVM lets us decide how many threads (cores) we would like to allocate to young-generation collection. The general rule says:

parallel GC threads per JVM = total number of CPUs / number of JVMs / 2

A result lower than two would make no sense, since only one CPU would then be used for "parallel" collection, so the minimum number of parallel GC threads is 2.
Let's do the calculation:

2 = 12 CPUs / x / 2 <=> x = 3. This would mean a maximum of 3 JVMs per application server with a minimum of 2 ParallelGCThreads each, supposing that at some point in time all JVMs are under the same load and are all performing garbage collection.

Any higher number would mean contention over CPU resources in the case that all JVMs are doing parallel garbage collection at the same time.
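The arithmetic above can be sketched in a few lines of Java. This is only an illustration of the rule of thumb as stated in this post; the class and method names are made up for the example, and the result is only as good as the assumption that all JVMs collect at the same time.

```java
// Sketch of the rule of thumb: gcThreads = totalCores / jvmCount / 2.
// Class and method names are hypothetical, chosen for this example only.
class GcThreadRule {
    // GC threads each JVM would get if all JVMs collect at the same time.
    static int gcThreadsPerJvm(int totalCores, int jvmCount) {
        return totalCores / jvmCount / 2; // integer division rounds down
    }

    // Largest JVM count that still leaves at least minGcThreads per JVM.
    static int maxJvms(int totalCores, int minGcThreads) {
        return totalCores / (2 * minGcThreads);
    }

    public static void main(String[] args) {
        System.out.println("12 cores, min 2 threads: " + maxJvms(12, 2) + " JVMs");
        System.out.println("12 cores, 3 JVMs: " + gcThreadsPerJvm(12, 3) + " GC threads each");
        System.out.println("12 cores, 10 JVMs: " + gcThreadsPerJvm(12, 10) + " GC threads each");
    }
}
```

With 10 JVMs (the number the memory alone would allow) the rule yields zero threads per JVM, which is why CPU rather than RAM is the limiting resource here.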

Of course, reducing the number of GC threads will show up as longer garbage collection times. I analyzed two garbage collection logs from a production environment, each covering a prolonged period of time, one with 4 ParallelGCThreads and one with 3, and noticed the following:
- average duration of scavenge GC increased by about 55 % when reducing the number of threads from 4 to 3
- percentage of time spent in scavenge GC increased by 34 % when reducing the number of threads from 4 to 3
- average interval increased by 15 % when reducing the number of threads from 4 to 3

These results were collected with only one JVM running in production mode, changing nothing but the number of GC threads. Since I am interested in keeping scavenge garbage collection times under half a second, I am afraid that further reducing the number of garbage collection threads would only degrade GC performance.

Of course, one can say that the chances of all JVMs running GC at the same time are quite low, but with a rate of one GC every 40 seconds or so, am I willing to take the risk?

I was thinking of recommending an increase in the number of CPUs to 24. That would allow:
2 = 24 CPUs / x / 2 <=> x = 6 JVM instances

With a shared L2 Cache per cpu, that might just fly...

I'd be happy to hear your opinions and experiences about this one...

Alex

Alexandru Ersenie (Moderator)

Hi Fabian. Thanks for giving this problem some thought.
I think we are, on some points, talking about different things. But let's take them one by one.

With a configured Eden of 4 GB and 4 garbage collection threads (let us call them cores) doing the work, the average young-generation collection (called scavenge from now on, for ease of conversation) takes about 300 ms. I have configured a maximum tenuring threshold of 4 and increased the target survivor ratio to 90, so I guess it has a little more overhead for checking the age of objects that were already scanned but are still live. The CPU is a six-core CPU running at 2600 MHz. You say 500 ms is a lot. Is it still a lot given this information? Unfortunately, I do not have any benchmarks I can rely on.
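For anyone who wants to compare averages without parsing GC logs, here is a minimal sketch that reads the per-collector counters from inside a running JVM through the standard java.lang.management API. The class name and the averageMillis helper are made up for this example. With CMS the young collector is typically reported as "ParNew", but treat the collector names as something to check on your own JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: print the average collection time per collector of the current JVM.
// GcAverages and averageMillis are hypothetical names for this example.
class GcAverages {
    // Average time per collection in ms, or 0 if no (or undefined) collections.
    static double averageMillis(long totalMillis, long count) {
        return count <= 0 ? 0.0 : (double) totalMillis / count;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            long time = gc.getCollectionTime();   // approximate accumulated ms, -1 if undefined
            System.out.printf("%s: %d collections, avg %.1f ms%n",
                    gc.getName(), count, averageMillis(time, count));
        }
    }
}
```

The counters are cumulative since JVM start, so for a before/after comparison (for example 4 versus 3 GC threads) you would sample them twice and work with the differences.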

Regarding the system load: this behavior occurs at an average CPU load of 10 - 20 %. A simple calculation suggests that if 4 of my 12 cores are busy doing GC, the other 8 would wait for those 300 ms anyway, since ParNew still stops the world and only uses multiple threads for the copying phase.

Does that mean that if three out of four JVMs were doing young GC at the same time, the fourth JVM would be suspended? (I did not try that, just guessing.)

Increasing the load on the system would surely shorten the interval between garbage collections, so I guess a load of 40 - 50 % would bring me to a garbage collection every 20 - 30 seconds...

Increasing this further to a load of, let's say, 70 % would bring me to a garbage collection every 10 - 20 seconds. The chances of things going wrong in this case are considerably higher, so I'd try to make sure that no JVM is affected by a lack of CPU resources.
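To put a rough number on "the chances of things going wrong", here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each JVM spends a fraction p = pause / interval of its time in a scavenge and that the JVMs collect independently of each other, which real JVMs under the same load may well not do. The class and method names are made up. The figure it prints is the fraction of time during which at least two JVMs are collecting at once, not the chance that an overlap happens at some point during a day.

```java
// Sketch: fraction of time at least k of n JVMs are inside a young GC pause,
// ASSUMING independent JVMs that each pause for a fraction p of the time.
// GcOverlap, atLeast and binomial are hypothetical names for this example.
class GcOverlap {
    // Binomial tail: sum over i = k..n of C(n, i) * p^i * (1 - p)^(n - i).
    static double atLeast(int n, int k, double p) {
        double total = 0.0;
        for (int i = k; i <= n; i++) {
            total += binomial(n, i) * Math.pow(p, i) * Math.pow(1.0 - p, n - i);
        }
        return total;
    }

    // Binomial coefficient C(n, k), computed iteratively as a double.
    static double binomial(int n, int k) {
        double c = 1.0;
        for (int j = 1; j <= k; j++) {
            c = c * (n - k + j) / j;
        }
        return c;
    }

    public static void main(String[] args) {
        double pauseSeconds = 0.3; // average scavenge time quoted in this thread
        int jvms = 3;              // JVM count from the 12-core calculation
        for (double intervalSeconds : new double[] {40.0, 20.0, 10.0}) {
            double p = pauseSeconds / intervalSeconds;
            System.out.printf("one GC every %.0f s: P(at least 2 of %d in GC) = %.6f%n",
                    intervalSeconds, jvms, atLeast(jvms, 2, p));
        }
    }
}
```

Because the result grows roughly with the square of p, halving the GC interval makes a two-JVM overlap about four times as likely under this model.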

I do not think the number of currently busy threads matters while a young GC is being performed, since they will be suspended for a short while anyway, right?

When you say parallel app threads, you mean the threads of the HTTP thread pool, right?

Thanks a lot.

Alex