Friday, May 31, 2013

Understanding String Table Size in HotSpot

In JDK-6962930[2], it requested that string table size be configurable.  The resolved date of that bug was on 04/25/2011 and it's available in JDK 7.  In another JDK bug[3], it has requested the default size (i.e. 1009) of string table be increased.

In this article, we will examine the following topics:
  • What string table is
  • How to find the number of interned strings in your applications
  • The tradeoff between memory footprint and lookup cost

String Table


In Java, string interning[1] is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool, which is the string table in HotSpot.

The size of the string table (i.e., a chained hash table) is configurable in JDK 7.  When the overflow chains become long, performance can degrade.  The current default size of string table is 1009 (or 1009 buckets), which is too small for applications that stress the string table.  Note that the string table itself is allocated in native memory but the strings are java objects.

Increasing the size improves performance (i..e, reducing look-up cost) but increases the StringTable size by 16 bytes on 64-bit systems, 8 bytes on 32-bit systems for every additional entry.  For example, changing the default size to 60013 increases the String Table size by 460K on 32 bit systems.

Finding Number of Interned Strings in the Applications


In HotSpot, it provides a product level option named PrintStringTableStatistics which can be used to print hash table statistics[4].  For example, using one of our applications (hereafter will be referred as JavaApp), it prints out the following information:

StringTable statistics:
Number of buckets  : 60013
Average bucket size  : 5
Variance of bucket size : 5
Std. dev. of bucket size: 2
Maximum bucket size  : 17

You can find the above output from your manged server's log file in the WebLogic domain.  Note that we have set the following option:
  • -XX:StringTableSize=60013
So, there are 60013 buckets in the hash table (or string table).

In JDK, there is also a tool named jmap which can be used to find out number of interned strings in your application.  For example, we have found the following information using:

$ jdk-hs/bin/jmap -heap 18974
Attaching to process ID 18974, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.0-b43

using thread-local object allocation.
Parallel GC with 18 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 2147483648 (2048.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 402653184 (384.0MB)
   MaxPermSize      = 402653184 (384.0MB)
   G1HeapRegionSize = 0 (0.0MB)

Heap Usage:
PS Young Generation

<deleted for brevity>

270145 interned Strings occupying 40429904 bytes.


Therefore, we know there are around 260K interned Strings in the table.

Tradeoff Between Memory Footprint and Lookup Cost


Based on curiosity, we have tried to set the string table size to be 277331 (a prime number) to see how JavaApp performs.  Here are our findings:

  • Average Response Time: +0.75%
  • 90% Response Time: +0.56%

However, the memory footprint has increased:
  • Total Memory Footprint: -1.03%

Finally, here is the hash table statistics based on the new size (i.e., 277331):

StringTable statistics:
Number of buckets       :  277331
Average bucket size     :       1
Variance of bucket size :       1
Std. dev. of bucket size:       1
Maximum bucket size     :       8


The conclusion is that increasing string table size from 60013 to 277331 helps JavaApp's performance a little bit at the expense of larger memory footprint.  In this case, the benefit is minimal, keeping string table size to be 60013 is good enough.

References

  1. String Interning (Wikipedia)
  2. JDK 6962930 : make the string table size configurable
  3. JDK 8009928: Increase default value for StringTableSize
  4. Java GC tuning for strings
  5. All other performance tuning articles on XML and More
  6. G1 GC Glossary of Terms


No comments: