14

In my code, which often runs on servers I do not control the configuration of, I have collection of users and each user has a byte[] array.

Sometimes these byte[] arrays are unique to the user. Often, though, there will be large numbers users with the exact same byte[] array.

I am trying to reduce the RAM consumption of my server.

I've tried turning my byte[] arrays into strings and interning them, but then I often run into PERM-GEN out-of-memory errors. I also see a significant performance degradation with the encoding/decoding when I want to access the byte[] array for a user, and I see much increased worse-case memory usage - presuambly strings are much larger than arrays.

How can I have a Set<SoftReference<byte[]>> lookup when Java arrays are not hashable and SoftReferences do not wrap the hash of the object the point at either. A Map<byte[],SoftReference<byte[]>> is obviously also defeating itself because the key is itself and prevents collection; and Set is internally implemented in terms of Mapanyway.

So how can I intern byte[] arrays?

4
  • One thing that comes to mind is the Flyweight pattern. Also have a look at stackoverflow.com/questions/1058149/… Commented Jun 12, 2013 at 12:24
  • I think you'll have to wrap your byte arrays, effectively boxing them e.g. with new ByteArray(byte [] theBytes) and never make additional long-lived references to theBytes outside of the ByteArray itself. Then softreferences to the ByteArrays will work correctly. You might look at WeakHashMap<ByteArray, User> for your application as well. Commented Jun 12, 2013 at 12:40
  • How large are these byte arrays? How does the user get hold of one? Commented Jun 12, 2013 at 12:48
  • @fge they arrive over the network, and are 'room' keys - variable length, sometimes short, sometimes 1K or more - on another server system. Sometimes lots of users are in the same 'room', other times they have their own and so on. Commented Jun 12, 2013 at 12:50

3 Answers 3

5

If you effectively have many identical arrays, use an HashSet<ByteBuffer> as a cache. You can get the ByteBuffer array with method array() and the ByteBuffer class has hashCode and equals methods. Of course it is better if your arrays are immutable.

EDIT2 The comment from @Will is exact, to be able to get back the array, use a WeakHashMap<ByteBuffer,WeakReference<ByteBuffer>> and do something like that :

public byte[] internalize(byte[] bytes) {
 ByteBuffer wrapped = ByteBuffer.wrap(bytes);
 if(cache.containsKey(wrapped)) {
  wrapped = cache.get(wrapped).get();
 }
 else {
  cache.put(wrapped, new WeakReference<ByteBuffer>(wrapped);
 }
 return wrapped.array();
}
Sign up to request clarification or add additional context in comments.

8 Comments

How do I get them collected from the cache when the system needs memory and there are no users at that time referencing them?
A WeakHashMap<ByteBuffer, Void> might communicate the intent even more clearly. You could store nulls as the values.
Although if you use a WeakHashMap<ByteBuffer, Boolean>, then you can wrap it using Collections#newSetFromMap to get a Set<ByteBuffer>, which leaves your code cleaner.
@Oak: My gripe with Map<E, Boolean> is that it lets an element be in one of three states: not in the map, mapped to true, and mapped to false. We only need two states, so there is redundancy. That doesn't present a practical problem, but it is inelegant. Void only has one possible value (null), so there are only two possible states.
@TomAnderson how can you get the key value itself from a Map in O(1)? We need to retrieve what we are caching?
|
2

I've tried turning my byte[] arrays into strings and interning them, but then I often run into PERM-GEN out-of-memory errors.

I agree that you need something like String.intern(), but the standard implementation is native, so not much joy.

You could have a Map<Integer,Collection<SoftReference<byte[]>>>, using the hash code of the byte arrays as the Map key. Your intern method could then look up the set of existing byte arrays with the same has code as the given byte arrays. With a good hash code that should give a small set of arrays to examine for a match.


Edit: To clarify:

Something like this:

 class ByteArrayCache
 {
      private final Map<Integer,Collection<SoftReference<byte[]>> map = new ...;

      public final byte[] intern(byte[] byteArray)
      {
           final int hash = Arrays.hashCode(byteArray);
           final Collection<SoftReference<byte[]>> arrays = map.get(hash);
           if (arrays != null) {
              // Search through arrays for a match, and return the match.
              // If no match found, add byteArray to the collection and return it
           } else {
              // create a new map entry, add byteArray to it, and return byte array
           }
      }
 }

5 Comments

How can I vacuum up orphaned Integer and Set when they are not used? And what's the RAM overhead of so much boxing in the worst case when there isn't a lot of duplication?
If the arrays are large and sparse, first bytes of md5 may be a useful hash. If they are small, roll-you-own will probably suffice.
@tucuxi Arrays.hashCode(byte[]) to the rescue :) But the fundemental problem is you can't subclass byte[] to patch that in
@Will Use a HashSet<WrappedByteArray> instead? Avoids boxing the hash key; but you would need to wrap those byte arrays, to overwrite hashCode().
No need to override hashCode(). It's a Map, not a Set.
1

I would implement a cache based on Guava weak values map. It guarantees that if theres no more strong references to byte array the entry will be automatically removed.

class Cache {
    private final ConcurrentMap<Key, byte[]> map = new MapMaker().weakValues().makeMap();

    private static class Key {
        byte[] a;
        int hash;

        Key(byte[] a) {
            this.a = a;
            hash = Arrays.hashCode(a);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof Key) {
                return Arrays.equals(a, ((Key) obj).a);
            }
            return false;
        }
    }

    public byte[] intern(byte[] a) {
        byte[] a1 = map.putIfAbsent(new Key(a), a);
        if (a1 != null) {
            return a1; 
        }
        return a;
    }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.