Some files being uploaded are getting corrupted and end up with a file size of 0 bytes. How do I query my S3 bucket and filter by a specific size? I'm trying to find the objects whose size is 0 bytes.

At the moment I have two queries.

The first one lists all the files in the bucket recursively, but without sorting.

aws s3 ls s3://testbucketname --recursive --summarize --human-readable

The second one sorts, but only when given a prefix; in my case the prefix is the folder name. My current bucket structure is as follows: {accountId}/{filename}

aws s3api list-objects-v2 --max-items 10 --bucket testbucketname --prefix "30265"  --query "sort_by(Contents,&Size)"

30265 is the accountId/folder name. When the prefix isn't provided, the sort doesn't quite work.
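
A likely reason the sort looks wrong without a prefix: --max-items 10 limits the listing to the first 10 keys (S3 returns keys in lexicographic order), and --query then sorts only those 10. A minimal sketch of one way around this is to drop --max-items and slice inside the query, after the sort. The function name smallest_objects is made up for illustration, and --output json is passed explicitly because with text output the CLI applies the query to each page separately.

```shell
# Hypothetical helper (the name is illustrative, not part of the AWS CLI).
# Lists the N smallest objects in a bucket as [Key, Size] pairs.
# $1 = bucket name, $2 = how many objects to show (defaults to 10).
smallest_objects() {
  # The [:N] slice runs after sort_by, so the whole listing is sorted
  # first and only then trimmed. JSON output keeps the query global
  # rather than per page.
  aws s3api list-objects-v2 \
    --bucket "$1" \
    --query "sort_by(Contents, &Size)[:${2:-10}].[Key, Size]" \
    --output json
}
```

For example, smallest_objects testbucketname 10 would list the ten smallest objects across all account folders. The whole bucket still has to be listed, so this can be slow on large buckets, and sort_by raises an error if the bucket is empty.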

Any help would be greatly appreciated.

This query works well for filtering on the name, which is a string:

aws s3api list-objects --bucket testbucketname --query "Contents[?contains(Key, '.jpg')]"

Unfortunately, I couldn't use contains for Size, and there isn't an equals function.

  • I would recommend using S3 inventory and query it with Athena. Commented May 9, 2022 at 14:20
  • You could also potentially use Lambda to alert you in real time to objects being uploaded with zero size (which is inherently an indication that your client sent zero bytes). Commented May 9, 2022 at 14:42
  • Do you just mean something like aws s3api list-objects-v2 --bucket example-bucket --query 'Contents[?Size==`0`].Key' (Use " instead of ' on Windows)? Commented May 9, 2022 at 14:59
  • @AnonCoward exactly that, TY! Did you want to post an answer? Commented May 9, 2022 at 15:26
  • Related: stackoverflow.com/questions/72172684/… Commented May 9, 2022 at 15:33

1 Answer

You can use --query to filter the listed objects client-side, keeping only those that are zero bytes in size:

aws s3api list-objects-v2 --bucket example-bucket --query 'Contents[?Size==`0`]'

Or, if you just want to see the list of keys without the other metadata, you can further narrow the result:

aws s3api list-objects-v2 --bucket example-bucket --query 'Contents[?Size==`0`].Key'

(For both of these, replace the outer ' with " when running on Windows.)

Further, if the goal is to remove these objects, you can use jq and a subshell to construct a request that deletes the targeted objects:

aws s3api delete-objects --bucket example-bucket --delete \
"$(aws s3api list-objects-v2 --bucket example-bucket --output json --query 'Contents[?Size==`0`].Key' |\
 jq '{"Objects": map({"Key":.})}')"

There isn't a direct way to do this same sort of construct with Windows's command interpreter.
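
The delete command above sends every key in a single request, and delete-objects accepts at most 1,000 keys per call. It also fails when the listing has no Contents at all, because the query then returns null. A rough sketch that handles both cases by splitting the keys into batches with jq follows; the function name delete_zero_byte_objects is hypothetical, and it assumes jq 1.5 or later is installed.

```shell
# Hypothetical helper (the name is illustrative, not part of the AWS CLI).
# Deletes all zero-byte objects in the bucket given as $1, in batches
# of at most 1000 keys per delete-objects request.
delete_zero_byte_objects() {
  aws s3api list-objects-v2 --bucket "$1" \
      --query 'Contents[?Size==`0`].Key' --output json |
    # Treat a null result as an empty list, then emit one compact JSON
    # payload per batch of up to 1000 keys.
    jq -c '(. // []) as $keys
           | range(0; $keys | length; 1000) as $i
           | {Objects: ($keys[$i:$i+1000] | map({Key: .}))}' |
    while IFS= read -r batch; do
      aws s3api delete-objects --bucket "$1" --delete "$batch"
    done
}
```

Running the list-objects-v2 part on its own first is a cheap way to check which keys would be removed before calling delete_zero_byte_objects example-bucket.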
