I am trying to decompress a large file of roughly 1 GB and I cannot use the file-output-stream approach. My final document requires a byte array of the decompressed file in order to create a new file. For now I have been manually growing the array on each read, but this is too slow for large files. Is there any way to make this method more efficient?

    if (primaryDocumentInputStream != null) {
      byte[] tempbuffer = new byte[536870912];
      byte[] mainbuffer = new byte[536870912];
      int lenMainBuffer = 0;
      try {
        int aIntBuffer = aGZIPInputStream.read(tempbuffer);
        while (aIntBuffer > 0) {
          // Grow the result by the amount just read: allocate a new array
          // and copy everything accumulated so far plus the new chunk.
          byte[] copyBuffer = new byte[lenMainBuffer + aIntBuffer];
          System.arraycopy(mainbuffer, 0, copyBuffer, 0, lenMainBuffer);
          System.arraycopy(tempbuffer, 0, copyBuffer, lenMainBuffer, aIntBuffer);
          mainbuffer = copyBuffer;
          aIntBuffer = aGZIPInputStream.read(tempbuffer);
          lenMainBuffer = mainbuffer.length;
        }
        primaryDocumentOutputDocument.setBody(mainbuffer);
        wfc.putPrimaryDocument(primaryDocumentOutputDocument);
      } catch (IOException e) {
        e.printStackTrace();
      }
    }
  • You can keep a list of buffers and do only one allocation and copy at the end, as in the sketch below. Or you can use a larger initial buffer (perhaps sized from the known expanded size). But ultimately the method you are calling, which expects a single big byte array, needs a redesign. Commented Apr 10, 2016 at 17:10
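A rough sketch of that buffer-list idea, assuming a generic InputStream source; the class and method names here are illustrative, not from the question:

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.ArrayList;
    import java.util.List;

    class ChunkedReader {
        // Collects one chunk per read, then concatenates them with a
        // single allocation and copy at the end.
        static byte[] readAll(InputStream in) throws IOException {
            List<byte[]> chunks = new ArrayList<>();
            int total = 0;
            byte[] buf = new byte[64 * 1024];
            int read;
            while ((read = in.read(buf)) > 0) {
                byte[] chunk = new byte[read];          // keep only the bytes read
                System.arraycopy(buf, 0, chunk, 0, read);
                chunks.add(chunk);
                total += read;
            }
            byte[] result = new byte[total];            // one final allocation
            int pos = 0;
            for (byte[] c : chunks) {
                System.arraycopy(c, 0, result, pos, c.length);
                pos += c.length;
            }
            return result;
        }
    }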

1 Answer

Write your data into a ByteArrayOutputStream. It wraps an array of bytes and resizes it when needed. When done, calling toByteArray returns the bytes.

One difference between ByteArrayOutputStream and what you have written here is that typical implementations double the size of the backing array when it fills up, which means writing n bytes takes O(n) amortized time. If you instead grow the array by only the amount read on each pass, as in your code, every read copies everything accumulated so far and the total cost becomes O(n^2).
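For example, a minimal sketch of this approach; it wraps the compressed source in a GZIPInputStream itself, and the class and variable names are placeholders rather than the names from the question:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.GZIPInputStream;

    class GzipToBytes {
        // Reads the whole decompressed stream into a single byte array.
        static byte[] decompress(InputStream compressed) throws IOException {
            try (GZIPInputStream gzipIn = new GZIPInputStream(compressed);
                 ByteArrayOutputStream out = new ByteArrayOutputStream()) {
                byte[] chunk = new byte[64 * 1024];    // reusable read buffer
                int read;
                // Write only the bytes actually read in this pass.
                while ((read = gzipIn.read(chunk)) > 0) {
                    out.write(chunk, 0, read);
                }
                return out.toByteArray();              // one final copy
            }
        }
    }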


3 Comments

More efficient than what, doing the same thing it does but with code you write yourself? No, it will be about as efficient, unless you make mistakes and end up writing something that's worse.
More efficient than what my code is doing in the question. Thank you for your help. I updated the code this way but I get this error: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space. The updated code: ByteArrayOutputStream data = new ByteArrayOutputStream(); try { int aIntBuffer = gZIPInputStream.read(buffer); while (aIntBuffer > 0) { data.write(buffer); }
That means the JVM does not have enough memory to hold all of the decompressed data. If the file holds 1 GB of data you should increase the heap size to at least 3 or 4 GB. The previous comment was a response to another person who has since deleted their comments, by the way.
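As a usage note, the maximum heap is set with the standard -Xmx JVM option when launching the program; the class name below is just a placeholder:

    java -Xmx4g com.example.Main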
