I have tried several approaches to convert a large CSV file (~300 MB) to a byte[], but each one fails with a Java heap space error, as shown below:
184898 [jobLauncherTaskExecutor-1] DEBUG org.springframework.batch.core.step.tasklet.TaskletStep - Rollback for Error: java.lang.OutOfMemoryError: Java heap space
185000 [jobLauncherTaskExecutor-1] DEBUG org.springframework.transaction.support.TransactionTemplate - Initiating transaction rollback on application exception
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
    at java.lang.StringBuffer.append(StringBuffer.java:237)
    at org.apache.log4j.helpers.PatternParser$LiteralPatternConverter.format(PatternParser.java:419)
    at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
    at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
    at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
    at org.apache.log4j.Category.callAppenders(Category.java:206)
    at org.apache.log4j.Category.forcedLog(Category.java:391)
    at org.apache.log4j.Category.log(Category.java:856)
    at org.slf4j.impl.Log4jLoggerAdapter.log(Log4jLoggerAdapter.java:601)
    at org.apache.commons.logging.impl.SLF4JLocationAwareLog.debug(SLF4JLocationAwareLog.java:133)
    at org.apache.http.impl.conn.Wire.wire(Wire.java:77)
    at org.apache.http.impl.conn.Wire.output(Wire.java:107)
    at org.apache.http.impl.conn.LoggingSessionOutputBuffer.write(LoggingSessionOutputBuffer.java:76)
    at org.apache.http.impl.io.ContentLengthOutputStream.write(ContentLengthOutputStream.java:119)
    at org.apache.http.entity.ByteArrayEntity.writeTo(ByteArrayEntity.java:115)
    at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
    at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
    at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
    at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
    at org.apache.http.impl.conn.AbstractClientConnAdapter.sendRequestEntity(AbstractClientConnAdapter.java:227)
    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:712)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:517)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
So far, I have tried the following versions of code for the file to byte[] conversion:
Version 1: Core Java
File file = new File(fileName);
FileInputStream fin = null;
byte fileContent[] = null;
try {
    fin = new FileInputStream(file);
    fileContent = new byte[(int) file.length()];
    fin.read(fileContent);
} catch (FileNotFoundException e) {
    System.out.println("File not found" + e);
} catch (IOException ioe) {
    System.out.println("Exception while reading file " + ioe);
} finally {
    try {
        if (fin != null) {
            fin.close();
        }
    } catch (IOException ioe) {
        System.out.println("Error while closing stream: " + ioe);
    }
}
return fileContent;
Version 2: Java 7 NIO
Path path = Paths.get(fileName);
byte[] data = null;
try {
    data = Files.readAllBytes(path);
} catch (IOException e) {
    e.printStackTrace();
}
return data;
Version 3: Apache Commons IO
File file = new File(fileName);
FileInputStream fis = null;
byte fileContent[] = null;
try {
    fis = new FileInputStream(file);
    fileContent = IOUtils.toByteArray(fis);
} catch (FileNotFoundException e) {
    System.out.println("File not found" + e);
} catch (IOException ioe) {
    System.out.println("Exception while reading file " + ioe);
} finally {
    try {
        if (fis != null) {
            fis.close();
        }
    } catch (IOException ioe) {
        System.out.println("Error while closing stream: " + ioe);
    }
}
return fileContent;
Version 4: Google Guava
File file = new File(fileName);
FileInputStream fis = null;
byte fileContent[] = null;
try {
    fis = new FileInputStream(file);
    fileContent = ByteStreams.toByteArray(fis);
} catch (FileNotFoundException e) {
    System.out.println("File not found" + e);
} catch (IOException ioe) {
    System.out.println("Exception while reading file " + ioe);
} finally {
    try {
        if (fis != null) {
            fis.close();
        }
    } catch (IOException ioe) {
        System.out.println("Error while closing stream: " + ioe);
    }
}
return fileContent;
Version 5: Apache Commons IO FileUtils
File file = new File(fileName);
byte fileContent[] = null;
try {
    fileContent = org.apache.commons.io.FileUtils.readFileToByteArray(file);
} catch (FileNotFoundException e) {
    System.out.println("File not found" + e);
} catch (IOException ioe) {
    System.out.println("Exception while reading file " + ioe);
}
return fileContent;
I have even set my heap size quite high: about 6 GB (5,617,772 K) for my external Tomcat, as shown by its memory consumption in Task Manager.
For the first three versions, heap usage suddenly climbs to more than 5 GB on reaching the byte[] generation code, and then the job fails. Google Guava seemed very promising: memory consumption stayed at about 3.5 GB for roughly 10 minutes after reaching the byte[] generation code, but then it too suddenly jumped to more than 5 GB and failed.
I am unable to figure out a solution. Can somebody help me solve this problem? Any help would be greatly appreciated.
org.apache.http.wire to do that), but what you really should be doing is switch to a streaming implementation of your HTTP request (use a ContentProducer and a custom EntityTemplate instead of a ByteArrayEntity).
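As a rough illustration of the streaming idea, here is a minimal sketch that copies a file to an OutputStream through a small fixed-size buffer, so the whole CSV never has to sit in memory at once. The class name FileStreamer, the method copyTo and the 8 KB buffer size are hypothetical choices for this example, not something from the question. It uses only the JDK; the wiring into HttpClient 4.x is shown as comments and assumes ContentProducer.writeTo(OutputStream) and the EntityTemplate(ContentProducer) constructor.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not from the original post): streams a file to an
// OutputStream in small chunks instead of building one large byte[].
final class FileStreamer {

    // Assumed buffer size; only this many bytes of the file are held at a time.
    private static final int BUFFER_SIZE = 8192;

    private FileStreamer() {
    }

    // Copies the file at 'source' to 'out' and returns the number of bytes written.
    // The caller owns 'out' and is responsible for closing it.
    static long copyTo(Path source, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int read;
            // read() may return fewer bytes than requested, so loop until EOF (-1).
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        out.flush();
        return total;
    }
}

// Possible wiring with Apache HttpClient 4.x (assumption, needs Java 8 for the lambda;
// 'httpPost' and 'path' are placeholders):
//
//   ContentProducer producer = out -> FileStreamer.copyTo(path, out);
//   httpPost.setEntity(new EntityTemplate(producer));
```

With this shape, memory use is bounded by the buffer size rather than the file size. Note that wire logging at DEBUG level would still log every byte sent, just in smaller pieces, so it is probably worth raising the log level for that logger as well.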