
Hi, I am unable to upload a file to S3 using boto. It fails with the error message below. Can someone help me? I am new to Python and boto.

from boto.s3 import connect_to_region
from boto.s3.connection import Location
from boto.s3.key import Key
import boto
import gzip
import os

AWS_KEY = ''
AWS_SECRET_KEY = ''
BUCKET_NAME = 'mybucketname'

conn = connect_to_region(Location.USWest2,
                         aws_access_key_id=AWS_KEY,
                         aws_secret_access_key=AWS_SECRET_KEY,
                         is_secure=False, debug=2)

bucket = conn.lookup(BUCKET_NAME)
bucket2 = conn.lookup('unzipped-data')
rs = bucket.list()
rs2 = bucket2.list()

compressed_files = []
all_files = []
files_to_download = []
downloaded_files = []
path = "~/tmp/"

# Check if the file has already been decompressed

def filecheck():
    for filename in bucket.list():
        all_files.append(filename.name)

    for n in rs2:
        compressed_files.append(n.name)
    for file_name in all_files:
        if file_name.strip('.gz') in compressed_files:
            pass
        elif '.gz' in file_name and 'indeed' in file_name:
            files_to_download.append(file_name)


# Download necessary files                
def download_files():
    for name in rs:
        if name.name in files_to_download:  
            file_name = name.name.split('/')

            print('Downloading: '+ name.name).strip('\n')
            name.get_contents_to_filename(path+file_name[-1])
            print(' - Completed')

            # Decompressing the file
            print('Decompressing: '+ name.name).strip('\n')
            inF = gzip.open(path+file_name[-1], 'rb')
            outF = open(path+file_name[-1].strip('.gz'), 'wb')
            for line in inF:
                outF.write(line)
            inF.close()
            outF.close()
            print(' - Completed')

            # Uploading file
            print('Uploading: '+name.name).strip('\n')
            full_key_name = name.name.strip('.gz')
            k = Key(bucket2)
            k.key = full_key_name
            k.set_contents_from_filename(path+file_name[-1].strip('.gz'))
            print('Completed') 

            # Clean Up
            d_list = os.listdir(path)
            for d in d_list:
                os.remove(path+d)


# Function Calls             
filecheck()
download_files()

Error message:

Traceback (most recent call last):
  File "C:\Users\Siddartha.Reddy\workspace\boto-test\com\salesify\sid\decompress_s3.py", line 86, in <module>
    download_files()
  File "C:\Users\Siddartha.Reddy\workspace\boto-test\com\salesify\sid\decompress_s3.py", line 75, in download_files
    k.set_contents_from_filename(path+file_name[-1].strip('.gz'))
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 1362, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 1293, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 750, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 951, in _send_file_internal
    query_args=query_args
  File "C:\Python27\lib\site-packages\boto\s3\connection.py", line 664, in make_request
    retry_handler=retry_handler
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1070, in make_request
    retry_handler=retry_handler)
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1029, in _mexe
    raise ex
socket.error: [Errno 10053] An established connection was aborted by the software in your host machine

I have no problem downloading the files, but the upload fails for some weird reason.

  • Is there any chance you have some sort of firewall or anti-virus program running that is interfering with the upload? Commented Jan 23, 2015 at 21:08
  • @garnaat I had no problem uploading smaller files of around 1-2 GB in size. This error occurred while uploading a 6 GB file, so could it have something to do with the size of the files? Commented Jan 23, 2015 at 21:14

1 Answer


If the problem is the size of the files (> 5 GB), you should use multipart upload:

http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html

Search for multipart_upload in the boto docs: http://boto.readthedocs.org/en/latest/ref/s3.html#module-boto.s3.multipart

Also, see this question for a related issue:

How can I copy files bigger than 5 GB in Amazon S3?

The process is a little non-intuitive (see the sketch after the list). You need to:

  • run initiate_multipart_upload(), storing the returned object
  • split the file into chunks (either on disk, or read from memory using cStringIO)
  • feed the parts sequentially into upload_part_from_file()
  • run complete_upload() on the stored object
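
For reference, here is a minimal sketch of that four-step sequence against boto 2's multipart API. The helper name, the 50 MB part size, and the commented usage line are illustrative assumptions, not code from the question; note that S3 requires every part except the last to be at least 5 MB.

import math
import os

def multipart_upload(bucket, key_name, file_path, part_size=50 * 1024 * 1024):
    # Step 1: initiate the upload and store the returned MultiPartUpload object.
    mp = bucket.initiate_multipart_upload(key_name)
    file_size = os.path.getsize(file_path)
    part_count = int(math.ceil(file_size / float(part_size)))
    try:
        with open(file_path, 'rb') as fp:
            # Steps 2 and 3: feed sequential chunks into upload_part_from_file();
            # part numbers start at 1, and each call reads `size` bytes from the
            # current file position.
            for i in range(part_count):
                remaining = file_size - i * part_size
                mp.upload_part_from_file(fp, part_num=i + 1,
                                         size=min(part_size, remaining))
        # Step 4: complete the upload on the stored object.
        mp.complete_upload()
    except Exception:
        # Abort on failure so S3 does not keep billing for the orphaned parts.
        mp.cancel_upload()
        raise

# Hypothetical usage with the question's bucket and key:
# multipart_upload(bucket2, full_key_name, path + file_name[-1])

With a 6 GB file and 50 MB parts this produces roughly 120 parts, well within S3's 10,000-part limit.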
