
For both on-heap and off-heap allocations. On-heap: in the context of three major garbage collectors: CMS, Parallel Old and G1.

What I know (or think I know) so far:

  • All (on-heap) object allocations are rounded up to an 8-byte boundary (or a larger power of 2, configured by -XX:ObjectAlignmentInBytes).
  • G1
    • For on-heap allocations smaller than the region size (1 to 32 MB, likely around heap size / 2048) there is no internal fragmentation: there is no need for it, because the allocator never "fills holes".
    • For allocations larger than the region size, the allocation is rounded up to a multiple of the region size. I.e. an allocation of the region size + 1 byte is very unlucky: it wastes almost 50% of the memory.
  • For CMS, the only relevant information I found is

    Naturally old space PLABs mimic structure of indexed free list space. Each thread preallocates certain number of chunk of each size below 257 heap words (large chunk allocated from global space).

    From http://blog.ragozin.info/2011/11/java-gc-hotspots-cms-promotion-buffers.html. As far as I understand, the "global space" referred to there is the main old space.
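To make the arithmetic in the first two points concrete, here is a minimal sketch. The class and method names are hypothetical, and the model is exactly the one stated above (round up to the object alignment; round large allocations up to a whole number of regions). It is not taken from HotSpot's sources, and the 1 MB region size is just an assumed example value.

```java
public class InternalFragmentation {
    // Round size up to the next multiple of alignment.
    // Assumes alignment is a power of two, as with -XX:ObjectAlignmentInBytes.
    public static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    // Round size up to a whole number of regions (the model assumed above
    // for allocations larger than one region).
    public static long roundUpToRegions(long size, long regionSize) {
        return ((size + regionSize - 1) / regionSize) * regionSize;
    }

    // Fraction of the reserved space that the request does not use.
    public static double wastedFraction(long requested, long reserved) {
        return (double) (reserved - requested) / reserved;
    }

    public static void main(String[] args) {
        long objectAlignment = 8;          // default alignment
        long regionSize = 1L << 20;        // assumed 1 MB region

        // A 17-byte object occupies 24 bytes after alignment.
        System.out.println(alignUp(17, objectAlignment)); // prints 24

        // Region size + 1 byte needs two whole regions in this model.
        long requested = regionSize + 1;
        long reserved = roundUpToRegions(requested, regionSize);
        System.out.println(reserved / regionSize);        // prints 2

        // Just under one half of the reserved space is unused.
        System.out.println(wastedFraction(requested, reserved));
    }
}
```

With the default 8-byte alignment the per-object waste is at most 7 bytes, whereas in the region model the worst case approaches half of the reserved space.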

Questions:

  • Are the above statements correct?
  • What are the fragmentation properties of the main old space in CMS? What about allocations of more than "257 heap words"?
  • How is the old space managed with Parallel Old GC?
  • Does the HotSpot JVM use the system memory allocator for off-heap allocations, or does it manage that memory with its own allocator?

UPD. A discussion thread: https://groups.google.com/forum/#!topic/mechanical-sympathy/A-RImwuiFZE

  • What do you want to do? Remember this stuff changes from implementation to implementation, and from update to update. If you're trying to optimize, I think an up-to-date article might be your best bet. 2011 was a while ago. Commented Jun 23, 2015 at 17:12
  • Google is your friend (use Search Tools -> Within One Year): March 2015 JVM GC Tuning Guide Commented Jun 23, 2015 at 17:15
  • @markspace I've read this before posting. This guide says nothing about internal fragmentation. Commented Jun 23, 2015 at 17:38
  • Regarding your 4th point, looking at the source, it is fairly easy to find out that, on the current hotspot, they are using a plain malloc to do allocations. The entry point is sun.misc.Unsafe.allocateMemory Commented Jun 29, 2015 at 17:42
  • I'm honestly puzzled: the question does not have a simple answer, but please consider that, just like @markspace said, you should tread carefully when working with low-level topics like fragmentation, arrangement of generations, etc. The implementation of these can vary wildly from one major version to another, or slightly from one update to another, which makes your implementation a bit of a shot in the dark. If, on the other hand, your question is knowledge for the sake of knowledge, I'd like to hear experts' opinions on some of your points as well! Commented Jun 30, 2015 at 9:30

1 Answer

  • As far as I understand, the statements above are correct, although the bit on CMS is missing a lot of context to interpret it.
  • CMS is prone to fragmentation (in its old space, where CMS runs), which is one of its major flaws. If it fragments too much, it may occasionally have to stop the world and do a full mark and (sliding) compaction to remove the fragmentation, which leads to a large pause in the application. It is this flaw that is often cited as the reason G1 was developed. Some systems (e.g. HBase) purposely do most of their allocations in fixed-size blocks in order to prevent or significantly reduce CMS fragmentation and so avoid long stop-the-world pauses.
  • ParallelOldGC (or 'Old GC' in general) does not fragment. Objects are tenured to the old heap and when it runs out of space, a full mark and compact cycle is run. It can do this full GC faster than any of the other collectors, but with a typical run time of 1 second per 2 GB of heap, this can be too long for large heaps or latency-sensitive applications.
  • HotSpot has used various strategies for off-heap allocation depending on the purpose. Allocating native byte buffers differs from its own allocation for compiled code or profiling data. I cannot answer with authority here on any details, but I can only assume that much of this does not use the system allocator, else HotSpot would not perform as well as it does. Furthermore, there are parameters one can tune that control some of this space, e.g. -XX:ReservedCodeCacheSize, which suggests such a region of memory is managed through indirection and not directly via the system allocator. In short, I would be rather surprised if the system allocator was directly used for any fine-grained allocation at all in HotSpot.
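As an illustration of the fixed-size block idea mentioned for HBase, here is a minimal sketch. It is not HBase's actual code; the class ChunkAllocator, its Slice type and all method names are hypothetical. Small requests are carved out of large arrays of one fixed size, so the only objects the garbage collector has to place for this data are arrays of that single size.

```java
public class ChunkAllocator {
    /** A view into a chunk: backing array, start offset and length. */
    public static final class Slice {
        public final byte[] chunk;
        public final int offset;
        public final int length;

        Slice(byte[] chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }
    }

    private final int chunkSize;
    private byte[] current;      // chunk currently being filled
    private int nextFree;        // first unused byte in the current chunk
    private int chunksAllocated; // number of chunks requested from the heap

    public ChunkAllocator(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Hands out `size` bytes from the current chunk, starting a new chunk if needed. */
    public Slice allocate(int size) {
        if (size < 0 || size > chunkSize) {
            throw new IllegalArgumentException("size must be in [0, chunkSize]");
        }
        if (current == null || nextFree + size > chunkSize) {
            // The tail of the old chunk stays unused; every heap allocation
            // made here has exactly the same size.
            current = new byte[chunkSize];
            nextFree = 0;
            chunksAllocated++;
        }
        Slice slice = new Slice(current, nextFree, size);
        nextFree += size;
        return slice;
    }

    public int chunksAllocated() {
        return chunksAllocated;
    }
}
```

Note the trade-off in this sketch: the unused tail of each chunk is wasted inside the chunk, in exchange for every hole left in the old space by a dead chunk being exactly the size the next chunk needs.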

1 Comment

You mostly address external fragmentation. My question is about internal fragmentation.
