Programming codex

CPU Maxes Out and Utilization Never Goes Down

Have you ever encountered a situation where your application’s CPU maxes out and never comes down, even after traffic volume drops? Did you have to recycle the JVM to remediate the problem? And even after you recycle the JVM, does the CPU start to spike again after some time?

This type of problem surfaces because of one of the following reasons:

Repeated Full GC

Non-terminating Loops

Non-synchronized access to java.util.HashMap

Let’s see how to diagnose these scenarios and address them.

Scenario 1: Repeated Full GC

Full GC is an important phase of the garbage collection process. During this phase the entire JVM is frozen and every single object in memory is evaluated for garbage collection, so it is naturally a CPU-intensive operation. If the application happens to have a memory leak, Full GC will start to run repeatedly without reclaiming any meaningful amount of memory. When Full GC runs back to back like this, CPU utilization spikes and never comes down.

Tactical Solution: To resolve the problem completely, the memory leak in the application has to be fixed, and resolving memory leaks can take some time. (Of course, you can engage experts to resolve it quickly.) Until then, the following tactical solution can be implemented to keep the application functioning in production. Set up a script that checks the application’s garbage collection log file every 2 minutes. If the script notices more than 3 Full GC runs in a 10-minute window, that particular JVM should be taken out of production traffic. Capture a thread dump and a heap dump, then recycle the JVM. After recycling, the JVM should be placed back into active traffic.
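The monitoring rule described above can be sketched in Java. This is only a minimal illustration under stated assumptions: the class name FullGcWatcher and its methods are hypothetical, the timestamps are assumed to have already been extracted from the GC log in ascending order, and the "Full GC" / "Pause Full" markers are assumptions about the log format that should be checked against your own logs.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical helper for illustration; not part of any library.
final class FullGcWatcher {
    private final Duration window;
    private final int maxFullGcs;

    FullGcWatcher(Duration window, int maxFullGcs) {
        this.window = window;
        this.maxFullGcs = maxFullGcs;
    }

    // Assumption: a Full GC entry contains one of these markers.
    // Adjust the markers to match the GC log format of your JVM.
    static boolean isFullGcLine(String line) {
        return line.contains("Full GC") || line.contains("Pause Full");
    }

    // Returns true if more than maxFullGcs events fall inside any span
    // no longer than the window. Timestamps must be in ascending order.
    boolean shouldDecommission(List<Instant> fullGcTimes) {
        Deque<Instant> recent = new ArrayDeque<>();
        for (Instant t : fullGcTimes) {
            recent.addLast(t);
            // Drop events older than the window relative to the newest one.
            while (Duration.between(recent.peekFirst(), t).compareTo(window) > 0) {
                recent.removeFirst();
            }
            if (recent.size() > maxFullGcs) {
                return true;
            }
        }
        return false;
    }
}
```

With new FullGcWatcher(Duration.ofMinutes(10), 3), four Full GC events spread over nine minutes would trigger decommissioning, while four events spaced four minutes apart would not. Taking the JVM out of traffic, capturing the dumps, and recycling are left to whatever tooling your environment provides.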

Strategic Solution: Use the heap dump and thread dump to identify the root cause of the problem, then fix it.

Scenario 2: Non-terminating Loops

Sometimes, due to a bug in your code or in a 3rd party library that you use, loop constructs (while, for, do-while) may run forever. Consider a while loop that keeps running until some ‘condition’ is satisfied.

Due to certain data conditions or bugs in the code, ‘condition’ may never get satisfied. In such a scenario, the thread would spin infinitely in the while loop. This would cause the CPU to spike up. Unless the JVM is restarted, the CPU would stay maxed out.
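A loop of the kind described above might look like the following sketch. The class RetryLoop, the flag name condition, and the bounded variant are all hypothetical and exist only for illustration. The iteration cap is one possible defensive measure, not something the scenario itself prescribes.

```java
// Hypothetical example for illustration only.
final class RetryLoop {
    // Flag that some other part of the program is expected to set.
    static volatile boolean condition = false;

    // Spins until 'condition' becomes true. If nothing ever sets the
    // flag, this method never returns and the thread burns a CPU core.
    static void waitForCondition() {
        while (!condition) {
            // busy work; nothing in here changes 'condition'
        }
    }

    // Same loop with an iteration cap, so a condition that is never
    // satisfied makes the method give up instead of spinning forever.
    static boolean waitForConditionBounded(long maxIterations) {
        long iterations = 0;
        while (!condition) {
            if (++iterations >= maxIterations) {
                return false; // gave up; caller decides what to do next
            }
        }
        return true;
    }
}
```

In a thread dump, a thread stuck in waitForCondition() would show up as runnable inside that method every time a dump is taken.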

Solution: When you observe the CPU maxing out and utilization not coming down, take 2 thread dumps about 10 seconds apart, right while the problem is happening. Note every thread that is in the “runnable” state in the first thread dump, then compare the state of those same threads in the second thread dump. If those threads are still in the runnable state within the same method in the second dump, that indicates the part of the code in which the thread(s) are looping infinitely. Once you know which part of the code is looping infinitely, it should be straightforward to address the problem.
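The comparison described above can also be approximated from inside the JVM. The sketch below is an assumption-laden illustration, not a replacement for real thread dumps: StuckThreadFinder and its method names are hypothetical, threads are matched by name (which is assumed to be unique), and only the top stack frame is compared. It relies on the standard Thread.getAllStackTraces() and Thread.getState() calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of any library.
final class StuckThreadFinder {
    // Snapshot: thread name -> "Class.method" of the top frame,
    // restricted to threads that are currently RUNNABLE.
    static Map<String, String> runnableTopFrames() {
        Map<String, String> frames = new HashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = e.getValue();
            if (e.getKey().getState() == Thread.State.RUNNABLE && stack.length > 0) {
                frames.put(e.getKey().getName(),
                        stack[0].getClassName() + "." + stack[0].getMethodName());
            }
        }
        return frames;
    }

    // Threads that were runnable in the same method in both snapshots.
    static List<String> stillInSameMethod(Map<String, String> first,
                                          Map<String, String> second) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, String> e : first.entrySet()) {
            if (e.getValue().equals(second.get(e.getKey()))) {
                suspects.add(e.getKey() + " in " + e.getValue());
            }
        }
        return suspects;
    }
}
```

A typical use would be to call runnableTopFrames(), sleep for about 10 seconds, call it again, and pass both snapshots to stillInSameMethod(). Expect some false positives: the thread taking the snapshots appears in both, and threads waiting in native I/O are also reported as RUNNABLE, so the result is a list of suspects to inspect rather than a verdict.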

Scenario 3: Non-synchronized access to java.util.HashMap

When multiple threads call HashMap’s get() and put() APIs concurrently without synchronization, threads can go into an infinite loop. This problem doesn’t happen every time, but it does happen occasionally.

Solution: When you observe the CPU maxing out and utilization not coming down, take a thread dump right while the problem is happening and see which threads are in the “runnable” state. If such a thread happens to be working in HashMap’s get() or put() API, that indicates the HashMap is causing the CPU spike. You can then replace that HashMap with ConcurrentHashMap.
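The replacement suggested above could look like the following minimal sketch. The SharedCache class and its methods are hypothetical; the only substantive change it illustrates is constructing a java.util.concurrent.ConcurrentHashMap where a plain HashMap was previously shared between threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared structure for illustration only.
final class SharedCache {
    // Previously: new HashMap<>() accessed by several threads at once.
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();

    void put(String key, int value) { cache.put(key, value); }
    Integer get(String key) { return cache.get(key); }
    int size() { return cache.size(); }

    // Several threads write distinct keys concurrently; returns the
    // final number of entries once every writer has finished.
    static int fillConcurrently(int threads, int putsPerThread) {
        SharedCache shared = new SharedCache();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < putsPerThread; i++) {
                    shared.put(id + "-" + i, i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.size();
    }
}
```

Because the field is declared as Map, callers usually need no changes. One difference to keep in mind is that ConcurrentHashMap rejects null keys and null values, which HashMap allows.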


Source by Stephen James
