Friday, July 31, 2015

Attempting to understand java garbage collect statistics

If you have been develop large java application, at times troubleshooting application can go as deep as looking into garbage collector when application is running. Unfortunately the statistics are just too much to begin to investigate into or trying to understand it. At least for me, it is pretty mundane and I seek your help too if you came across this article and please leave comment.

There are very few documentation describe how are these statistics should be interpreted. There is this from oracle blog which is dated year 2006, pretty outdated to be relevant but nonetheless, it analyze line by line. More recent article from alexy ragozin and poonam bajaj are worth to take a look too.

The gc statistics should be able to regenerate using these parameter to the java command line.  -XX:+PrintGCDetails -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=1 , and the following are snippets extracted from a production machine. Let's take a look at them line by line.

 Before GC:  
 Statistics for BinaryTreeDictionary:  
 ------------------------------------  
 Total Free Space: 230400  
 Max  Chunk Size: 230400  
 Number of Blocks: 1  
 Av. Block Size: 230400  
 Tree   Height: 1  
 586945.492: [ParNew  
 Desired survivor size 41943040 bytes, new threshold 1 (max 1)  
 - age  1:  10038008 bytes,  10038008 total  
 : 660426K->10292K(737280K), 0.0353470 secs] 9424156K->8774094K(12500992K)After GC:  
 Statistics for BinaryTreeDictionary:  
 ------------------------------------  
 Total Free Space: 127053189  
 Max  Chunk Size: 21404293  
 Number of Blocks: 125654  
 Av. Block Size: 1011  
 Tree   Height: 36  
   
   
   
 After GC:  
 Statistics for BinaryTreeDictionary:  
 ------------------------------------  
 Total Free Space: 230400  
 Max  Chunk Size: 230400  
 Number of Blocks: 1  
 Av. Block Size: 230400  
 Tree   Height: 1  
 , 0.0359540 secs] [Times: user=0.26 sys=0.00, real=0.03 secs]   
 Heap after GC invocations=550778 (full 2090):  
  par new generation  total 737280K, used 10292K [0x00000004fae00000, 0x000000052ce00000, 0x000000052ce00000)  
  eden space 655360K,  0% used [0x00000004fae00000, 0x00000004fae00000, 0x0000000522e00000)  
  from space 81920K, 12% used [0x0000000522e00000, 0x000000052380d360, 0x0000000527e00000)  
  to  space 81920K,  0% used [0x0000000527e00000, 0x0000000527e00000, 0x000000052ce00000)  
  concurrent mark-sweep generation total 11763712K, used 8763801K [0x000000052ce00000, 0x00000007fae00000, 0x00000007fae00000)  
  concurrent-mark-sweep perm gen total 40952K, used 24563K [0x00000007fae00000, 0x00000007fd5fe000, 0x0000000800000000)  
 }  
 Total time for which application threads were stopped: 0.0675660 seconds  
 {Heap before GC invocations=550778 (full 2090):  
  par new generation  total 737280K, used 11677K [0x00000004fae00000, 0x000000052ce00000, 0x000000052ce00000)  
  eden space 655360K,  0% used [0x00000004fae00000, 0x00000004faf5a220, 0x0000000522e00000)  
  from space 81920K, 12% used [0x0000000522e00000, 0x000000052380d360, 0x0000000527e00000)  
  to  space 81920K,  0% used [0x0000000527e00000, 0x0000000527e00000, 0x000000052ce00000)  
  concurrent mark-sweep generation total 11763712K, used 8763801K [0x000000052ce00000, 0x00000007fae00000, 0x00000007fae00000)  
  concurrent-mark-sweep perm gen total 40952K, used 24563K [0x00000007fae00000, 0x00000007fd5fe000, 0x0000000800000000)  

We can summarize the statistics above with the following points.

* the statistics generated above is from java hotspot and the source code can be foudn here https://github.com/openjdk-mirror/jdk7u-hotspot/blob/master/src/share/vm/gc_implementation/concurrentMarkSweep/binaryTreeDictionary.cpp#L1098-L1112

* there are two statistics, before gc and after gc and this is not full gc.

* before gc, we notice the max chunk size is equal to the total free space, so we assume there is no usage.

* before gc, we also noticed that the total free space has 127053189 and max chunk size is 21404293

* after gc, cpu usage is spent on user 0.26 and real 0.03.

* after gc, from region usage 12% of the heap.

* after gc, concurrent mark sweep generation total of 11,763,712k whilst concurrent mark sweep permanent generation total is 40,954k and used only 24,563K

* total time this application stop were 0.0675660 seconds.

So we can guess that this gc snippet is good. it is not a full gc and usage does not increase to 100%. There is no failure/error appear anywhere. The total time stop is trivial too, less than a second.

That's it and if you think this analysis is wrong and/or can be improve upon, please leave your message below. I would like to learn more too.

No comments:

Post a Comment