15

I am trying to upload large files to an S3 bucket using the Node.js aws-sdk.

The v2 upload method automatically performs a multipart upload for large files.

I want to use the new V3 aws-sdk. What is the way to upload large files in the new version? The method PutObjectCommand doesn't seem to be doing it.

I've seen there are commands such as CreateMultipartUploadCommand, but I can't seem to find a full working example using them.

Thanks in advance.

3 Answers

18

As of 2021, I would suggest using the lib-storage package, which abstracts a lot of the implementation details.

Sample code:

import { Upload } from "@aws-sdk/lib-storage";
import { S3Client, S3 } from "@aws-sdk/client-s3";

// Bucket, Key and Body are placeholders for your bucket name, object key and data
const target = { Bucket, Key, Body };
try {
  const parallelUploads3 = new Upload({
    client: new S3Client({}), // an S3 instance (new S3({})) works as well
    tags: [/* ... */], // optional tags
    queueSize: 4, // optional concurrency configuration
    partSize: 1024 * 1024 * 5, // optional size of each part, in bytes (5 MB is the minimum)
    leavePartsOnError: false, // optional manually handle dropped parts
    params: target,
  });

  parallelUploads3.on("httpUploadProgress", (progress) => {
    console.log(progress);
  });

  await parallelUploads3.done();
} catch (e) {
  console.log(e);
}

Source: https://github.com/aws/aws-sdk-js-v3/blob/main/lib/lib-storage/README.md
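One detail the sample leaves implicit is how to pick partSize for very large files. S3 multipart uploads allow at most 10,000 parts and require every part except the last to be at least 5 MiB, so a 5 MiB part size only covers files up to roughly 48.8 GiB. Here is a minimal sketch of choosing a part size under those limits. choosePartSize is a hypothetical helper of mine, not part of the SDK.

```typescript
// S3 multipart limits (service limits, not SDK settings).
const MIN_PART_SIZE = 5 * 1024 * 1024; // 5 MiB minimum for every part except the last
const MAX_PARTS = 10000; // maximum number of parts in one multipart upload

// Hypothetical helper: smallest part size (in bytes) that keeps the upload
// within MAX_PARTS while respecting the 5 MiB minimum.
function choosePartSize(fileSize: number): number {
  if (!Number.isInteger(fileSize) || fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  return Math.max(MIN_PART_SIZE, Math.ceil(fileSize / MAX_PARTS));
}

const oneHundredGiB = 100 * 1024 * 1024 * 1024;
const chosenPartSize = choosePartSize(oneHundredGiB);
console.log(chosenPartSize, Math.ceil(oneHundredGiB / chosenPartSize));
```

The returned value would be passed as partSize to the Upload constructor shown above.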


6 Comments

For some reason, my app's memory usage only increases with this approach. I couldn't find a way to use v3 that performs well.
I've had a good experience using this approach so far. Thanks for the answer.
@RhadamezGindriHercilio have to agree with you here, I'm seeing memory increase over time when bulk uploading. Did you find an alternative solution by chance?
@JordanLewallen, actually yes, I ended up with another solution that does the same thing much better. I can show you.
@RhadamezGindriHercilio would LOVE to see a gist or link for a solution. Been struggling to fix this. Thank you!!
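The thread does not say what the alternative solution was. One thing that may be worth trying when bulk uploading (an assumption on my part, not a confirmed fix for the memory growth described above) is capping how many uploads run at the same time, since every in-flight upload can hold its own part buffers. A minimal sketch, where runWithConcurrency is a hypothetical helper and not an SDK function:

```typescript
// Hypothetical helper: runs the given async tasks with at most `limit` in
// flight at any time and resolves with their results in input order.
async function runWithConcurrency<T>(
  tasks: ReadonlyArray<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: T[] = new Array<T>(tasks.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unstarted task until none remain.
  const worker = async (): Promise<void> => {
    while (nextIndex < tasks.length) {
      const index = nextIndex++;
      const task = tasks[index];
      if (task !== undefined) {
        results[index] = await task();
      }
    }
  };

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each task would wrap one upload, for example () => new Upload({ client, params }).done(), so that only limit uploads are active at once. If any task rejects, the returned promise rejects with that error.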
6

Here's what I came up with, to upload a Buffer as a multipart upload, using aws-sdk v3 for nodejs and TypeScript.

Error handling still needs some work (you might want to abort/retry in case of an error), but it should be a good starting point... I have tested this with XML files up to 15MB, and so far so good. No guarantees, though! ;)

import {
  CompleteMultipartUploadCommand,
  CompleteMultipartUploadCommandInput,
  CreateMultipartUploadCommand,
  CreateMultipartUploadCommandInput,
  S3Client,
  UploadPartCommand,
  UploadPartCommandInput
} from '@aws-sdk/client-s3'

const client = new S3Client({ region: 'us-west-2' })

export const uploadMultiPartObject = async (file: Buffer, createParams: CreateMultipartUploadCommandInput): Promise<void> => {
  try {
    const createUploadResponse = await client.send(
      new CreateMultipartUploadCommand(createParams)
    )
    const { Bucket, Key } = createParams
    const { UploadId } = createUploadResponse
    console.log('Upload initiated. Upload ID: ', UploadId)

    // 5MB is the minimum part size
    // Last part can be any size (no min.)
    // Single part is treated as last part (no min.)
    const partSize = (1024 * 1024) * 5 // 5MB
    const fileSize = file.length
    const numParts = Math.ceil(fileSize / partSize)

    const uploadedParts = []

    for (let i = 1; i <= numParts; i++) {
      // Parts are contiguous, non-overlapping byte ranges of the buffer
      const startOfPart = (i - 1) * partSize
      const endOfPart = Math.min(startOfPart + partSize, fileSize)

      const uploadParams: UploadPartCommandInput = {
        // subarray's end index is exclusive, so no +1 adjustment is needed
        Body: file.subarray(startOfPart, endOfPart),
        Bucket,
        Key,
        UploadId,
        PartNumber: i
      }
      const uploadPartResponse = await client.send(new UploadPartCommand(uploadParams))
      console.log(`Part #${i} uploaded. ETag: `, uploadPartResponse.ETag)

      // For each part upload, you must record the part number and the ETag value.
      // You must include these values in the subsequent request to complete the multipart upload.
      // https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html
      uploadedParts.push({ PartNumber: i, ETag: uploadPartResponse.ETag })
    }

    const completeParams: CompleteMultipartUploadCommandInput = {
      Bucket,
      Key,
      UploadId,
      MultipartUpload: {
        Parts: uploadedParts
      }
    }
    console.log('Completing upload...')
    const completeData = await client.send(new CompleteMultipartUploadCommand(completeParams))
    console.log('Upload complete: ', completeData.Key, '\n---')
  } catch(e) {
    throw e
  }
}
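On the error handling mentioned above: parts that were uploaded before a failure stay in the bucket (and are billed) until the multipart upload is either completed or aborted. Below is a minimal sketch of the abort-on-failure pattern. withAbortOnFailure is a hypothetical helper of mine, written without SDK imports so that the pattern stands on its own.

```typescript
// Hypothetical helper: runs `work`, and if it throws, calls `abort` before
// rethrowing the original error so the caller still sees why the upload failed.
async function withAbortOnFailure<T>(
  work: () => Promise<T>,
  abort: () => Promise<unknown>
): Promise<T> {
  try {
    return await work();
  } catch (error) {
    try {
      await abort();
    } catch (abortError) {
      // The abort itself can fail; log it and keep the original error.
      console.error("Abort failed:", abortError);
    }
    throw error;
  }
}
```

In uploadMultiPartObject, work would be the part loop plus the completion call, and abort would send an AbortMultipartUploadCommand (from @aws-sdk/client-s3) with the same Bucket, Key and UploadId. Retrying individual parts is a separate concern and is not covered here.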

4 Comments

Thanks for your reply. Unfortunately, I'm getting the same MalformedXML error. Also, when I run your code I still get the same ETag for each part
Actually, all ETags are identical apart from the first and last one.
I get an undefined ETag, which causes the "The XML you provided was not well-formatted..." error on CompleteMultipartUploadCommand. Other than that, it seems to be working nicely.
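For the undefined ETag case reported in these comments, a check before sending CompleteMultipartUploadCommand at least makes the failure obvious. A minimal sketch follows. prepareCompletedParts and the UploadedPart shape are hypothetical names of mine, and the check does not explain why an ETag is missing. If the code runs in a browser, one thing worth checking is whether the bucket's CORS configuration exposes the ETag header, but that is a guess about the commenters' setup, not something this thread confirms.

```typescript
// Minimal local shape of an entry in MultipartUpload.Parts.
interface UploadedPart {
  PartNumber: number;
  ETag?: string;
}

// Hypothetical helper: fails early if any part lacks an ETag and returns the
// parts sorted by ascending PartNumber, the order CompleteMultipartUpload expects.
function prepareCompletedParts(
  parts: ReadonlyArray<UploadedPart>
): Array<Required<UploadedPart>> {
  const missing = parts
    .filter((part) => !part.ETag)
    .map((part) => part.PartNumber);
  if (missing.length > 0) {
    throw new Error(`Missing ETag for part(s): ${missing.join(", ")}`);
  }
  return parts
    .map((part) => ({ PartNumber: part.PartNumber, ETag: part.ETag as string }))
    .sort((a, b) => a.PartNumber - b.PartNumber);
}
```

The result would replace uploadedParts in the Parts field of the completion parameters.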
0

Here is fully working code using AWS SDK v3:

import { Upload } from "@aws-sdk/lib-storage";
import { S3Client } from "@aws-sdk/client-s3";
import { createReadStream } from 'fs';

const inputStream = createReadStream('clamav_db.zip');
const Bucket = process.env.DB_BUCKET
const Key = process.env.FILE_NAME
const Body = inputStream

const target = { Bucket, Key, Body};
try {
  const parallelUploads3 = new Upload({
    client: new S3Client({
      region: process.env.AWS_REGION,
      credentials: { accessKeyId: process.env.AWS_ACCESS_KEY, secretAccessKey: process.env.AWS_SECRET_KEY }
    }),
    queueSize: 4, // optional concurrency configuration
    partSize: 5242880, // optional size of each part
    leavePartsOnError: false, // optional manually handle dropped parts
    params: target,
  });

  parallelUploads3.on("httpUploadProgress", (progress) => {
    console.log(progress);
  });

  await parallelUploads3.done();
} catch (e) {
  console.log(e);
}
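The handler above logs the raw progress object. Here is a small sketch of turning it into a readable line, assuming the event carries loaded and total byte counts and that total can be undefined when the body is a stream of unknown length. formatProgress and the UploadProgress shape are hypothetical names of mine, not SDK exports.

```typescript
// Minimal local shape of the fields used here from the progress event.
interface UploadProgress {
  loaded?: number;
  total?: number;
}

// Hypothetical helper: turns a progress event into a short log line.
function formatProgress(progress: UploadProgress): string {
  const loadedBytes = progress.loaded ?? 0;
  if (progress.total === undefined || progress.total <= 0) {
    // Without a known total, only the byte count can be reported.
    return `${loadedBytes} bytes uploaded`;
  }
  const percent = Math.min(100, (loadedBytes / progress.total) * 100);
  return `${percent.toFixed(1)}% (${loadedBytes}/${progress.total} bytes)`;
}

console.log(formatProgress({ loaded: 5242880, total: 20971520 }));
```

It could be used in the handler as console.log(formatProgress(progress)).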
