
I have a project I'd like to run on AWS Lambda, but it exceeds the 50MB zipped limit. Right now it is at 128MB zipped, and the project folder with the virtual environment sits at 623MB. The biggest consumers of space are:

  • scipy (~187MB)
  • pandas (~108MB)
  • numpy (~74.4MB)
  • lambda_packages (~71.4MB)

Without the virtualenv the project is <2MB. The requirements.txt is:

click==6.7
cycler==0.10.0
ecdsa==0.13
Flask==0.12.2
Flask-Cors==3.0.3
future==0.16.0
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
matplotlib==2.1.2
mpmath==1.0.0
numericalunits==1.19
numpy==1.14.0
pandas==0.22.0
pycryptodome==3.4.7
pyparsing==2.2.0
python-dateutil==2.6.1
python-dotenv==0.7.1
python-jose==2.0.2
pytz==2017.3
scipy==1.0.0
six==1.11.0
sympy==1.1.1
Werkzeug==0.14.1
xlrd==1.1.0

I deploy using Zappa, so my understanding of the underlying infrastructure is limited. My understanding is that a few of the libraries do not get uploaded at all; numpy, for example, is skipped and the version already available in Amazon's environment gets used instead.

I propose the following workflow (without using S3 buckets for slim_handler):

  1. delete all files matching "test_*.py" in all packages
  2. manually tree-shake scipy, since I only use scipy.optimize.minimize, by deleting most of it and re-running my tests (a sketch of steps 1 and 2 follows this list)
  3. minify and obfuscate all the code using pyminifier
  4. zappa deploy
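
A minimal sketch of steps 1 and 2, assuming a virtualenv named venv on Python 3.6; the site-packages path and the list of scipy subpackages that are safe to delete are assumptions (scipy.optimize pulls in scipy.linalg and scipy.sparse internally), so prune conservatively and re-run the tests after every deletion:

    SP=venv/lib/python3.6/site-packages    # assumption: adjust to your venv layout

    # step 1: remove test modules and bundled test directories
    find "$SP" -name "test_*.py" -delete
    find "$SP" -type d -name tests -exec rm -rf {} +

    # step 2: manual tree shaking -- delete scipy subpackages the app never
    # imports (candidate list only), then re-run the test suite
    rm -rf "$SP"/scipy/signal "$SP"/scipy/ndimage "$SP"/scipy/io
    python -m pytest    # or however you run your tests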

Or:

  1. run compileall to get .pyc files (a sketch follows this list)
  2. delete all *.py files and let zappa upload the .pyc files instead
  3. zappa deploy
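
A sketch of that approach, assuming Python 3: plain compileall writes bytecode into __pycache__/, which the interpreter will not import once the .py sources are gone, so the -b flag is needed to write the .pyc files next to the sources in the legacy layout. Note that bytecode only loads under the same minor Python version it was compiled with, so this must match the Lambda runtime:

    # compile in place using the legacy .pyc layout
    python -m compileall -b venv/lib/python3.6/site-packages

    # remove the sources and the now-redundant __pycache__ directories
    find venv/lib/python3.6/site-packages -name "*.py" -delete
    find venv/lib/python3.6/site-packages -type d -name __pycache__ -exec rm -rf {} +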

I've had issues with slim_handler: true: either my connection drops and the upload fails, or some other error occurs and at ~25% of the upload to S3 I get "Could not connect to the endpoint URL". For the purposes of this question, I'd like to get the dependencies down to a manageable size.

Nevertheless, over half a gig of dependencies with the main app being less than 2MB has to be some sort of record.

My questions are:

  1. What is the unzipped limit for AWS? Is it 250MB or 500MB?
  2. Am I on the right track with the above method for reducing package sizes?
  3. Is it possible to go a step further and use .pyz files? (See the sketch after this list for what I mean.)
  4. Are there any standard utilities out there that help with the above?
  5. Is there no tree-shaking library for Python?
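
(For question 3: by .pyz files I mean archives built with the standard-library zipapp module, along these lines, where myproject is a placeholder for a directory containing a __main__.py:

    # bundle a directory into a single .pyz archive
    python -m zipapp myproject -o myproject.pyz

I don't know whether Lambda can load such an archive directly; that's part of the question.)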
  • In the end, I ended up using the slim_handler: true option, and to get around the connectivity issues I bundled the whole thing on one of Amazon's VMs. I have not figured out how to slim down the project other than rewriting my code to drop the above dependencies. Commented Feb 1, 2018 at 22:44

1 Answer

  1. The AWS limit is 250MB of unpacked code (see https://hackernoon.com/exploring-the-aws-lambda-deployment-limits-9a8384b0bec3).
  2. I would suggest going with the second method and compiling everything. I think you should also consider using the Serverless Framework; it does not force you to create a virtualenv, which is very heavy.

I've seen that all your packages can be compressed down to about 83MB (just the packages).

My workaround would be:

  1. use the Serverless Framework (consider moving from Flask directly to API Gateway)
  2. install your packages locally in the same folder using:

    pip install -r requirements.txt -t .
    
  3. try your method of compiling to .pyc files, and remove the .py sources.

  4. Deploy with the Serverless CLI (a consolidated sketch of this workflow follows the list):

    sls deploy
    
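A rough end-to-end sketch of this workflow, run from the service directory; handler.py stands in for whatever module holds your Lambda handler, and the bytecode-only step assumes the .pyc files are built with the same Python version as the Lambda runtime:

    pip install -r requirements.txt -t .    # vendor dependencies into the service dir
    python -m compileall -b .               # bytecode next to the sources (legacy layout)
    find . -name "*.py" -delete             # keep only the .pyc files (handler.py included)
    sls deploy                              # Serverless Framework CLI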

Hope it helps.


