Thursday, July 5, 2012

JAVA Garbage Collection and JVM Arguments


This post is about various JVM arguments available to tune Garbage Collection. Its not an exhaustive list but contains almost all important parameters you will ever need to tune your application. Once we are done with the arguments, we will see how to analyze the gc logs.


 I will take what I want !!

  We can choose which garbage collector we want for our application. Following are the available
   options.
  1. -XX:+UseSerialGC                         -Use serial GC
  2. -XX:+UseParallelGC                      -Use Parallel GC
  3. -XX:+UseParallelOldGC                -Use Parallel Compacting GC
  4. -XX:+UseConcMarkSweepGC      -Use Concurrent Mark Sweep(CMS) 

  Whats Going On??

  Choosing the garbage collector is one thing but question is what its doing behind??Let's hear it from   
  the collector himself. We can enable printing of details of garbage collection every time it happens
  using following parameters.

  1. -XX:+PrintGC                                   -Print basic details every time garbage collection happens
  2. -XX:+PrintGCDetails                        -Print more detailed information for each gc.
  3. -XX:+PrintGCTimeStamps              -Prints a timestamp at start of each gc.
  4. -Xloggc: <filepath>                             -Print the details in given log file.

   Distribute(Be Generous but don't waste)

   Okay!!! So we have selected our collector, we have enabled the logs but we haven't done anything
   about the memory yet, heart of the GC. Lets tune it. 
  1. -Xmsn : You can specify the initial heap size to be allocated. n is number of bytes eg -Xms512m  (m for MB)
  2. -Xmxn: The maximum size of heap in bytes eg. -Xmx1024m
  3. -XX:MinHeapFreeRatio : Minimum percentage of free space of total heap size. Suppose minimum is fixed to 20%, when percentage of free space in any generation reaches 20% then size of generation is expanded to maintain the ratio.  eg -XX:MinHeapFreeRatio=40
  4. -XX:MaxHeapFreeRatio : Maximum percentage of free space of total size. If in any generation free space exceeds this value then heap is shrunk to maintain it.
  5. -XX:NewSize : Default initial size of new young generation eg. -XX:NewSize=128m
  6. -XX:NewRatio : Ratio between young and old generation eg -XX:NewRatio=3
  7. -XX:SurvivorRatio :  Ratio between survivor space and eden space -XX:newSurvivorRatio=7
  8. -XX:MaxPermSize : Maximum size of permanent generation.

Other than these parameters there are some which are specific to type of collectors, we are not talking about them right now. Now next thing is how and when we setup these parameters. Answer is before starting your application you need to set these parameters.

On Unix machines you can use:
        export JAVA_OPTS="<argumrnt> <value>, .... "
        eg :  export JAVA_OPTS="-Xms1024m -Xmx2048m -XX:NewSize=256m"

Usually every java application, servers have start script, there also you can put the JAVA_OPTS values.

!!!!!!!!!!!!!!!!!!!!!!!!!!!ENOUGH THEORY, LETS DO SOME PRACTICAL !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


If you have any java application, you can use it. I am using apache geronimo app server which is hosting a sample daytrader application(developed by IBM, later donated to apache. Used for benchmarking of geronimo app server). I am generating load through apache jmeter. I used following 
JAVA_OPTS to start my app server. 

JAVA_OPTS ="-Xms128m -Xmx256m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc=
/home/administrator/gclog.log -XX:MaxPermSize=128m"

I generated a small load on server, this was logged in the gclog file.

19.542: [GC [PSYoungGen: 36486K->5213K(53632K)] 44251K->15877K(141056K), 0.0113620 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 
20.802: [GC [PSYoungGen: 53597K->3797K(65728K)] 64261K->17705K(153152K), 0.0182770 secs] [Times: user=0.04 sys=0.00, real=0.02 secs] 
22.633: [GC [PSYoungGen: 61589K->1652K(79360K)] 75497K->19223K(166784K), 0.0090320 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 
26.096: [GC [PSYoungGen: 73268K->3566K(75200K)] 90839K->22617K(162624K), 0.0160600 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 

The format of log is :
<Time_stamp_since_start_of_server> : [GC [<Collector> : <starting_occupancy_young_gen> -> <ending_occupancy_young_gen>(max_young_generation_size)]<starting_heap_occupancy> -> <ending_heap_occupancy> <time taken>

NOTE : The above log was generated for JDK1.6 and the format changes with every version.

So taking the first line from log we can say that Using Parallel scavenge young generation garbage collection happened after 19.542 seconds from starting of the server, initially 36 MB was being used after gc only 5 Mb is being used in young generation, while initially 44MB was being used in old generation but after GC only 15MB is used and the GC took 0.01 sec to complete.


So now we know how to select collector, what all params are there and how can we see what actually is 
going on behind the scene. Stay tuned for next post where we will look at different profiling tools available and try to learn what should be the values of parameters and how can we find the optimal values for them.


No comments:

Post a Comment