Error in reading .gz file in python using gzip


Asked 1 year, 8 months ago

Modified 1 year, 8 months ago

Viewed 71 times


I am reading a file from an AWS S3 bucket using boto3. The code throws the error "Not a gzipped file (b'\xef\xbb')" even though the filename ends with .gz and the content type is application/x-gzip. The failure happens when decompressing the input file. How can I handle this?

import gzip

job.file_bucket = 'upload-bucket'
job.file_path = 'filepath/filename.gz'
s3client = AwsS3Service.get_client(True)
file_object = s3client.Object(job.file_bucket, job.file_path)

# Fetch the object once; every get() call issues a separate S3 request
response = file_object.get()
job.file_object = response['Body']
content_type = response['ContentType']
job.cFilePath = f"{job.file_bucket}/{job.file_path}"

# Treat the object as gzip based on its content type or its .gz suffix
if content_type == 'application/x-gzip' or (content_type == 'binary/octet-stream' and job.cFilePath.endswith('.gz')):
    with gzip.open(job.file_object, 'rb') as file:
        xml_data = file.read()

edited Mar 13, 2024 at 11:39

John Rotenstein


asked Mar 13, 2024 at 7:24

Pranav Kokate


I suspect file_object.get()['Body'] is not returning what you expect. There is also the question of whether the file is named correctly and is in fact a gzip file. Which library are you using behind AwsS3Service? Instead of using gzip, you could try writing the file to disk and opening it with something else.

– matt, commented Mar 13, 2024 at 8:18
Hi @matt, thanks for replying. I am using boto3 to get the file from the S3 bucket and gzip to decompress the .gz file. The filename ends with .gz, and when I check the file's content type it also shows application/x-gzip.

– Pranav Kokate, commented Mar 13, 2024 at 8:39
I would try this simple download example, boto3.amazonaws.com/v1/documentation/api/latest/guide/… and check to make sure the file is OK and a gzip file after you've saved it.

– matt, commented Mar 13, 2024 at 8:45

gzip is a pretty simple interface. You seem to be providing it something that isn't a .gz file. I would try saving it and checking if it works. Then if that works, you're doing something else wrong.

– matt, commented Mar 13, 2024 at 10:27
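
The error text itself gives a hint about what gzip was handed: b'\xef\xbb' are the first two bytes of a UTF-8 byte-order mark (EF BB BF), whereas a gzip stream starts with 1F 8B. So the object may be plain UTF-8 text that was uploaded with a .gz name and a gzip content type. A minimal sketch of checking the leading bytes before decompressing, assuming the whole body fits in memory; read_maybe_gzipped and the sample data are hypothetical names, not part of the original code:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of every gzip stream
UTF8_BOM = b"\xef\xbb\xbf"    # byte-order mark some tools put at the start of UTF-8 text


def read_maybe_gzipped(raw: bytes) -> bytes:
    """Decompress raw if it is gzip data; otherwise return it without a UTF-8 BOM."""
    if raw.startswith(GZIP_MAGIC):
        return gzip.decompress(raw)
    if raw.startswith(UTF8_BOM):
        # Plain UTF-8 text that was only named *.gz
        return raw[len(UTF8_BOM):]
    return raw


# Stand-ins for the bytes that the S3 body's read() would return (made-up data)
plain = UTF8_BOM + b"<root/>"
packed = gzip.compress(b"<root/>")

print(read_maybe_gzipped(plain))
print(read_maybe_gzipped(packed))
```

With the code from the question, raw would come from job.file_object.read(). If the first bytes turn out to be the BOM, the upload step is probably the place to fix, since the file was likely never compressed.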

gzip.open takes a filename, not a file object. If you want to decompress a file object, use the GzipFile class.

– Anon Coward, commented Mar 13, 2024 at 14:47
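
A minimal sketch of the GzipFile approach, using an in-memory io.BytesIO as a stand-in for the S3 body (the sample data is made up). For what it's worth, the Python gzip documentation describes the filename argument of gzip.open as either a filename or an existing file object, so the two forms should behave much the same here; GzipFile with fileobj just makes the intent explicit:

```python
import gzip
import io

# In-memory stand-in for the streaming body returned by S3 (made-up data)
body = io.BytesIO(gzip.compress(b"<root>hello</root>"))

# GzipFile wraps an existing binary file object through the fileobj argument
with gzip.GzipFile(fileobj=body, mode="rb") as gz:
    xml_data = gz.read()

print(xml_data)
```

If the real streaming body does not behave like a regular file object, one option is to read it fully and wrap the bytes in io.BytesIO first. Either way, this only helps when the data is actually gzip-compressed.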
