
I have an S3 bucket containing quite a few S3 objects that multiple EC2 instances can pull from (when scaling horizontally). Each EC2 instance will pull an object one at a time, process it, and move it to another bucket.

Currently, to make sure the same object isn't processed by multiple EC2 instances, my Java app renames it by adding a "locked" extension to its S3 object key. The problem is that "renaming" is actually a "move" (a copy followed by a delete), so a large file can take up to several minutes to complete its "rename", rendering the locking process ineffective.

Does anyone have a best practice for accomplishing what I'm trying to do?

I considered using SQS, but that "solution" has its own set of problems: order is not guaranteed, messages may be delivered more than once, and more than one EC2 instance can receive the same message.

I'm wondering if setting a "locked" header would be a quicker "locking" process.

  • I did not post an answer because I do not have one for the question about "locking" an S3 file. However, perhaps an alternative would be to avoid concurrency entirely and have each process generate its own file, all of which are later concatenated. I suspect the time you lose concatenating is offset by the time saved avoiding concurrency bottlenecks. If these processes will run forever, that could require a system similar to archiving once a file size is reached. Commented Jun 12, 2017 at 6:02
  • Hi @Todd, what was the final solution you chose for this problem? Commented Nov 15, 2022 at 7:38
  • Ams1, it has been a long time since worrying about this so my memory is foggy, but IIRC, I resorted to using an SQS queue to distribute the work. Commented Nov 16, 2022 at 14:54

5 Answers


Some, but not all, of the original answer, below, contains information that is no longer entirely applicable to Amazon S3, as of December, 2020.

Effective immediately, all S3 GET, PUT, and LIST operations, as well as operations that change object tags, ACLs, or metadata, are now strongly consistent. What you write is what you will read, and the results of a LIST will be an accurate reflection of what’s in the bucket.

https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/

However, that enhancement doesn't resolve this concern. The potential for an important race condition remains, though it is reduced.

Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel

The enhancements to S3 that eliminated eventual consistency do not eliminate the problem of concurrent writers -- so you still need a locking mechanism. Also, as noted in the original question, objects in S3 cannot actually be renamed atomically; they can only be copied atomically to a new object with a different object key, after which the old object is deleted, so both can exist for a nonzero length of time.

Of course, since the original answer was posted, SQS released FIFO queues which guarantee exactly-once delivery of messages to a properly written application.


order not guaranteed, possibility of messages delivered more than once, and more than one EC2 getting the same message

The odds of actually getting the same message more than once are low. It's merely "possible," not very likely. If occasionally processing a file more than once is essentially only an annoyance, then SQS seems like an entirely reasonable option.
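If duplicate deliveries are acceptable, one common way to make them harmless is an idempotency check: before doing any work, see whether the output object already exists. A minimal sketch, where the `processed/` prefix and helper names are hypothetical, not from the question:

```python
# Hypothetical sketch: tolerate duplicate SQS deliveries by checking the
# destination bucket before doing any work. The "processed/" prefix and
# helper names are illustrative, not from the original question.

def destination_key(source_key):
    """Map a source object key to its key in the destination bucket."""
    return "processed/" + source_key

def already_processed(s3_client, dest_bucket, source_key):
    """Return True if the output object already exists, i.e. this message
    is a duplicate delivery and the worker can safely skip it."""
    try:
        s3_client.head_object(Bucket=dest_bucket, Key=destination_key(source_key))
        return True
    except s3_client.exceptions.ClientError:
        return False
```

The `head_object` call is cheap compared to reprocessing a large file, which is why this pattern often makes at-least-once delivery good enough.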

Otherwise, you'll need an external mechanism.

Setting a "locked" header on the object has a problem of its own -- when you overwrite an object with a copy of itself (that's what happens when you change the metadata -- a new copy of the object is created, with the same key) then you are subject to the slings and arrows of eventual consistency.

Q: What data consistency model does Amazon S3 employ?

Amazon S3 buckets in all Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.

https://aws.amazon.com/s3/faqs/

Updating metadata is an "overwrite PUT." Your new header may not immediately be visible, and if two or more workers set their own unique header (e.g. x-amz-meta-locked: i-12345678) it's entirely possible for a scenario like the following to play out (W1, W2 = Worker #1 and #2):

W1: HEAD object (no lock header seen)
W2: HEAD object (no lock header seen)
W1: set header
W2: set header
W1: HEAD object (sees its own lock header)
W2: HEAD object (sees its own lock header)

The same or a similar failure can occur with several different permutations of timing.

Objects can't be effectively locked in an eventual consistency environment like this.


7 Comments

  • TLDR: you need a database to do database things.
  • Would Amazon Elastic File work better? Or is it just a different front end with an S3 backend (or at least the same limitations)?
  • Elastic File System should work much better. It is not a front end to S3; it is immediately consistent, and it supports actual NFS locking... but it is currently still in preview and only available in us-west-2 (Oregon).
  • S3 now has immediate consistency. In that case both W1 and W2 would see W2's header, in which case W2 could technically continue and W1 would have to skip/retry.
  • @NeverEndingQueue also, thank you for prompting me to review and revise the answer.

Object tags can assist here, since changing a tag doesn't create a new copy of the object. A tag is a key/value pair associated with an object; i.e., you would use object-level tagging.
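A tag-based claim might look like the following sketch with boto3. The tag key and worker ids are illustrative, and note the get-then-put sequence can still race between two workers, which is why it re-reads to confirm:

```python
# Sketch of a tag-based lock via GetObjectTagging / PutObjectTagging, which
# update tags in place without rewriting the object. Tag key and worker ids
# are illustrative; the get-then-put sequence below can still race.

LOCK_TAG = "locked-by"

def lock_tagset(worker_id):
    """Build the Tagging payload that marks an object as claimed."""
    return {"TagSet": [{"Key": LOCK_TAG, "Value": worker_id}]}

def holder(tagset):
    """Return the current lock holder from a TagSet, or None if unlocked."""
    for tag in tagset:
        if tag["Key"] == LOCK_TAG:
            return tag["Value"]
    return None

def try_lock(bucket, key, worker_id):
    import boto3  # imported here so the pure helpers above need no AWS deps
    s3 = boto3.client("s3")
    current = s3.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    if holder(current) is not None:
        return False  # someone else already claimed the object
    s3.put_object_tagging(Bucket=bucket, Key=key, Tagging=lock_tagset(worker_id))
    # Re-read to confirm we were not overwritten by a concurrent worker
    # (last writer wins, so only the surviving holder proceeds).
    confirmed = s3.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    return holder(confirmed) == worker_id
```

The confirming re-read reduces, but does not eliminate, the window in which two workers each believe they hold the lock, so this remains advisory.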

1 Comment

  • Thanks Tejas, that is a brilliant solution!

The S3 consistency model has since changed: S3 now supports strong read-after-write consistency for both new objects and overwrites of existing objects. https://aws.amazon.com/s3/faqs/

After a successful write of a new object or an overwrite of an existing object, any subsequent read request immediately receives the latest version of the object.



Have you considered using a FIFO queue for your use case? Instead of best-effort ordering, a FIFO queue maintains the order of messages from when they are sent to the queue to when they are polled. You can also ensure your objects are processed only once, since deduplication allows for exactly-once processing.
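Producing to a FIFO queue with a deterministic deduplication id might look like this sketch (the queue URL and group id are illustrative assumptions):

```python
# Sketch: enqueue S3 object keys on an SQS FIFO queue. A deterministic
# MessageDeduplicationId makes repeated sends of the same key within the
# 5-minute deduplication window collapse into one message. The queue URL
# and group id are illustrative.
import hashlib

def dedup_id(object_key):
    """Deterministic deduplication id derived from the object key."""
    return hashlib.sha256(object_key.encode("utf-8")).hexdigest()

def enqueue(queue_url, object_key):
    import boto3  # imported here so dedup_id stays dependency-free
    sqs = boto3.client("sqs")
    sqs.send_message(
        QueueUrl=queue_url,  # FIFO queue names must end in ".fifo"
        MessageBody=object_key,
        MessageGroupId="s3-objects",  # a single group preserves strict order
        MessageDeduplicationId=dedup_id(object_key),
    )
```

Using one MessageGroupId serializes delivery; sharding keys across multiple group ids would allow parallel consumers at the cost of global ordering.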



I think I've figured out a method to create an advisory "lock file" in S3. Though I'd love feedback if I'm missing a race condition somewhere.

This method requires that versioning be enabled on the S3 bucket.

When you want to create a "lock":

 1. Do a PutObject request to the key you want to act as a lock. I usually add .lock to the end of my keys to make sure nothing else would assign that key.
 2. When the PutObject request succeeds, store the VersionId and LastModified from the response for the newly created object (version).
 3. Immediately run a ListObjectVersions request with the key as the Prefix to get all existing versions of that object.
 4. Check whether any version has an older LastModified datetime than the one you created. If older versions exist, you can't get the lock (the file is already locked by another process). If not, you've got the lock.
 5. To release the lock, simply delete your version using DeleteObject.

The only race condition I can think of is due to the LastModified datetimes. They have a resolution of 1 ms, so theoretically two versions could have the same LastModified datetime. I would be impressed if this could happen, or if it actually caused an issue in an implementation.

This method can also allow for differentiating "shared" vs "exclusive" locks. During the PutObject request, use a consistent bytes body for "shared" vs "exclusive" locks so that the resulting md5 (ETag) is also consistent for those two lock states. In python I use b'0' for shared, and b'1' for exclusive. Whether there's much utility in having these two lock states is another question ;)

My only other caveat is that I've only tested this on Backblaze (because it's much cheaper).

I have actually implemented this in Python, but just recently. Consequently, I haven't created a docs page for it. If people are interested I can post a link and a short tutorial.

