I have a Spring Boot API, and one of its endpoints lets users upload videos. My controller takes the file as a MultipartFile, and I store it in a temp folder accessible to Tomcat. Once it is stored on disk, I push the video to an S3 bucket.

To me, this seems less than optimal: if 100 or 1,000 users uploaded at once, writing every file to disk first seems really non-performant.

As a little background, I'm storing it on disk so that I can retry if there is an issue pushing to S3.

The code below might show what I'm doing better than the description above:

public Video addVideo(@RequestParam("title") String title,
    @RequestParam("Description") String description,
    @RequestParam(value = "file", required = true) MultipartFile file) {
    this.amazonS3ClientService.uploadFileToS3Bucket(file, title, description);
    // (building and returning the Video is omitted here)
}

Method for storing the video file:

String fileNameWithExtenstion = awsS3FileName + "." + FilenameUtils.getExtension(multipartFile.getOriginalFilename());

// Create the file on the server (temporarily)
File file = new File(tomcatTempDir + fileNameWithExtenstion);
FileOutputStream fos = new FileOutputStream(file);
fos.write(multipartFile.getBytes());
fos.close();

PutObjectRequest putObjectRequest = new PutObjectRequest(this.awsS3Bucket, awsS3BucketFolder + UnigueId + "/" + fileNameWithExtenstion, file);

if (enablePublicReadAccess) {
    putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
}

// Upload a file as a new object with ContentType and title specified.
amazonS3.putObject(putObjectRequest);

// Remove the file created on the server
file.delete();

So my question is: is there a better way in Tomcat to:

A) Take in a file via a controller
B) Push to S3

2 Answers

There is no other way to do it with multipart. The problem with multipart is that, to properly segment the parts of the request, parts sometimes need to be skipped or read repeatedly. That is impossible to do in memory without memory usage exploding. Therefore, Commons FileUpload caches parts on disk once a certain threshold is reached.

Multipart requests are the worst choice for this. I highly recommend using either PUT or POST with the content type application/octet-stream. You can take the bare request input stream and pass it to HttpClient to stream to your backend server. I did this 5 years ago and it works for gigabytes. I have posted the solution on the Apache HttpClient mailing list.
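The core of that streaming approach is a copy loop that never holds more than one buffer of the body in memory. A minimal sketch using only JDK classes follows. `StreamRelay` and `relay` are hypothetical names, not part of any library. In a servlet application, `in` would presumably be the request's input stream and `out` the stream that feeds the backend connection; here both are replaced by in-memory stand-ins so the sketch runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
final class StreamRelay {
    private StreamRelay() {}

    /** Copies in to out through one fixed-size buffer and returns the bytes copied. */
    static long relay(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // Memory use stays at bufferSize no matter how large the body is.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeBody = new byte[1_000_000];                            // stand-in for the request body
        ByteArrayOutputStream fakeBackend = new ByteArrayOutputStream(); // stand-in for the backend stream
        long copied = relay(new ByteArrayInputStream(fakeBody), fakeBackend, 8192);
        System.out.println("copied " + copied + " bytes");
    }
}
```

The trade-off is that a plain stream can be read only once, so a failed transfer cannot simply be replayed from the same stream.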

There is one way this could still work with multipart, under specific conditions:

  • All parts arrive in the physical order in which you want to read them
  • Your writes to the backend are fast enough to keep up with the reads from the front end

Consume the root part, then move on to the next physical one, processing the request body lazily. JAX-WS RI (Metro) handles multipart requests for XOP/MTOM very nicely. Learn from it, because you won't be able to do it any better.

5 Comments

Would you be able to find a link to the mailing list post? It sounds interesting.
@nanomader lists.apache.org/thread.html/… It requires a custom/wrapped input stream.
You can still do multipart if that's what you want to do (e.g. if you need to upload both the document and also other form-data at the same time), but you will have to prevent Tomcat from handling the parsing for you. Instead, you'll have to parse the request yourself by reading from the request's InputStream directly, streaming as necessary. The problem is that you can't guarantee that the metadata (e.g. filename) arrives before the payload (file bytes), so you might have to write a lot of "exception" handling code (in the sense of an exceptional workflow, not a thrown Exception).
@Michael-O has a suggestion here that is specific to a certain technology (e.g. JAX-WS RI), while my comment is just a general explanation of how to do it. But as he suggests: why re-invent the wheel if you don't have to?
@ChristopherSchultz Correct. Learn from JAX-WS RI and use the MIME parser dependencies without any relation to SOAP.

Perhaps you can try streaming the input stream from your MultipartFile directly to S3.

Consider the following uploadFileToS3Bucket method:

public PutObjectResult uploadFileToS3Bucket(InputStream input, long size, String title, String description) {
  // Set the content length up front so the AWS SDK does not have to buffer the stream to compute it
  // See: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html#PutObjectRequest-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
  ObjectMetadata objectMetadata = new ObjectMetadata();
  objectMetadata.setContentLength(size); // Rely on the size reported by Spring. input.available() only estimates the bytes readable without blocking, so it is not a safe substitute
  // compute the object name as appropriate
  String key = "...";
  PutObjectRequest putObjectRequest = new PutObjectRequest(
    this.awsS3Bucket, key, input, objectMetadata
  );

  // The rest of your code 
  if (enablePublicReadAccess) {
    putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
  }

  // Upload a file as a new object with ContentType and title specified.
  return amazonS3.putObject(putObjectRequest);
}

Of course, you need to pass the service the input stream obtained from the MultipartFile object associated with the client request:

public Video addVideo(
  @RequestParam("title") String title,
  @RequestParam("Description") String description,
  @RequestParam(value = "file", required = true) MultipartFile file) throws IOException {

  try (InputStream input = file.getInputStream()) {
    this.amazonS3ClientService.uploadFileToS3Bucket(input, file.getSize(), title, description);
  }
  // Build and return the Video as in your existing code (omitted here)
}

You can probably also use the getBytes method of MultipartFile and create a ByteArrayInputStream to perform the operation, although this holds the whole file in memory.

In addVideo:

byte[] bytes = file.getBytes();

In uploadFileToS3Bucket:

ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setContentLength(bytes.length);
PutObjectRequest putObjectRequest = new PutObjectRequest(
  this.awsS3Bucket, key, new ByteArrayInputStream(bytes), objectMetadata
);

I would prefer the first solution, but try to determine which option offers you the best performance.
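Since the question keeps the temp file mainly to be able to retry, here is one possible way to keep a retry while streaming: reopen the stream for each attempt instead of reusing it. This is a rough sketch using only JDK classes. `RetryingUpload`, `StreamSource`, `Uploader` and `uploadWithRetry` are hypothetical names. It assumes the source can hand out a fresh stream on every call (for a MultipartFile that would be something like `file::getInputStream`, which you should verify for your container) and that failures surface as IOException. The AWS SDK reports failures through unchecked exceptions, so a real wrapper around putObject would have to catch or translate those.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only.
final class RetryingUpload {
    private RetryingUpload() {}

    /** Opens a fresh stream on every call (assumption: the source supports that). */
    @FunctionalInterface
    interface StreamSource {
        InputStream open() throws IOException;
    }

    /** Stand-in for the real upload call, e.g. a wrapper around the S3 client. */
    @FunctionalInterface
    interface Uploader {
        void upload(InputStream input, long size) throws IOException;
    }

    /**
     * Tries the upload up to maxAttempts times, reopening the stream each time.
     * Returns the attempt number that succeeded, or rethrows the last failure.
     */
    static int uploadWithRetry(StreamSource source, long size, Uploader uploader, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (InputStream input = source.open()) {
                uploader.upload(input, size);
                return attempt;
            } catch (IOException e) {
                last = e; // remember the failure and try again with a new stream
            }
        }
        throw last;
    }
}
```

A call could look like `RetryingUpload.uploadWithRetry(file::getInputStream, file.getSize(), (in, size) -> service.upload(in, size), 3)`, where `service.upload` stands for whatever wraps your S3 call. The sketch retries immediately; a delay between attempts is left out for brevity.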
