I am using the code below to read a file from an AWS S3 bucket with boto3. It throws the error "Not a gzipped file (b'\xef\xbb')" even though the filename ends with .gz and the content type is application/x-gzip. The error is raised when uncompressing the input file. How can I handle this?
import gzip
job.file_bucket = 'upload-bucket'
job.file_path = 'filepath/filename.gz'
s3client = AwsS3Service.get_client(True)
file_object = s3client.Object(job.file_bucket, job.file_path)
job.file_object = file_object.get()['Body']
content_type = file_object.get()["ContentType"]
job.cFilePath = f"{job.file_bucket}/{job.file_path}"
if content_type == 'application/x-gzip' or (content_type == 'binary/octet-stream' and job.cFilePath.endswith('.gz')):
    with gzip.open(job.file_object, 'rb') as file:
        xml_data = file.read()
file_object.get()['Body'] is not getting what you want. Of course, there is also the question of whether the file is named correctly and is in fact a gzip file. Which library are you using to access the AwsS3Service? Instead of using gzip, you could try writing the file to disk and opening it with something else.

gzip.open is normally given a filename (in Python 3 it also accepts a file object). If you want to decompress a file object, use the GzipFile class.
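
One way to act on both comments is to check the first bytes of the body before decompressing. The bytes b'\xef\xbb' in the error look like the start of a UTF-8 byte order mark (EF BB BF) rather than the gzip magic number (1F 8B), which suggests the object may be plain text despite its name and content type. The following is a minimal sketch under that assumption; `read_maybe_gzipped` is a hypothetical helper name, not part of boto3 or gzip, and it assumes the whole body fits in memory.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of every gzip stream
UTF8_BOM = b"\xef\xbb\xbf"    # UTF-8 byte order mark


def read_maybe_gzipped(raw: bytes) -> bytes:
    """Decompress raw if it is gzip data; otherwise return it without a BOM."""
    if raw[:2] == GZIP_MAGIC:
        # GzipFile reads from a file object, so wrap the bytes in BytesIO.
        with gzip.GzipFile(fileobj=io.BytesIO(raw)) as gz:
            return gz.read()
    if raw.startswith(UTF8_BOM):
        # Not gzip at all: plain text that begins with a byte order mark.
        return raw[len(UTF8_BOM):]
    return raw


# Hypothetical usage with the object from the question:
# raw = file_object.get()["Body"].read()
# xml_data = read_maybe_gzipped(raw)
```

Calling get() once and reusing its result also avoids a second request to S3 just to read the content type.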