
I am working on a project and need to lay out a new infrastructure proposal for how it will work on AWS.

I need to support the following:

  1. Web administration platform
  2. Mobile app
  3. Reporting and analytics
  4. 3rd-party API access

In terms of load and performance, I need to support the following:

  1. The web application makes 500k+ requests a month to the backend system
  2. The mobile app will make an additional 2+ million a year
  3. The administration platform will peak every couple of months at up to 50 million requests
  4. API access will add an additional 5+ million requests.

A few notes to help shape the thinking: there is little need right NOW to pass real-time data back and forth, but this will certainly come up in the future, so I'm hoping to plan for it now.

My thinking is the following:

  • Option 1: Cognito + API Gateway + Lambda + SQS + EC2 + MySQL + Lambda + Redshift

  • Option 2: Cognito + API Gateway + Kinesis + EC2 + MySQL

  • Option 3: Cognito + API Gateway + Lambda + DynamoDB Streams + EC2 + MySQL

So what I do need to support is a set of PHP endpoints living on an EC2 server that will ingest JSON requests and perform CRUD updates against the MySQL DB. Question: based on the numbers above, am I going to run into problems with MySQL?

Additionally, I was hoping SQS would help with some of the load, and that having a load balancer in front of the EC2 instances would help with processing time, etc.

One thing I've been reading is that SQS is not a push mechanism, so you're going to have to poll it every second/minute to get messages and proceed.

My aim is to achieve:

A scalable platform that can handle anywhere from minimal to peak request volumes without my having to worry about failover, etc., while also allowing flexibility to expand the PHP application.

Any advice would help, and if more clarity is needed, please let me know.

  • Your question is probably going to be considered by this community to be too broad or primarily opinion-based. The SQS component in this question seems based on at least a partial misunderstanding of how SQS really works. With long polling, SQS effectively delivers messages to waiting consumers, as I explained in How to use AWS SQS/SNS as a push notification queue for heavy processing tasks via PHP? The consumers do not need to continually/periodically ask SQS "do you have messages now? ...now? ...what about now?" in the sense you probably envision (see the long-polling sketch after these comments). That would be terrible. Commented Jun 29, 2017 at 0:30
  • Thanks! I am looking for help here because, as you can see, I've mixed up numerous ways to skin this cat. I could just load up API Gateway, run EC2 instances with PHP endpoints, and put a load balancer in front, and that should do it... but I want to ensure that I'm building this right. A mobile app will be involved, and if I ever want to push the POST API calls to a CRM at some point, is that easily done with no rework? Commented Jun 29, 2017 at 1:03
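To make the long-polling point in the first comment concrete, here is a minimal consumer sketch using the AWS SDK for PHP v3. The queue URL and the processMessage() helper are hypothetical; with WaitTimeSeconds set, receiveMessage blocks server-side until messages arrive, so the worker is not hammering the API in a tight loop.

```php
<?php
// Minimal SQS long-polling worker (sketch). Assumes the AWS SDK for PHP v3
// is installed via Composer; queue URL and processMessage() are hypothetical.
require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs = new SqsClient(['region' => 'us-east-1', 'version' => 'latest']);
$queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/example-queue';

while (true) {
    // Long poll: the call waits up to 20 seconds server-side for messages,
    // instead of the client asking "now? ...now?" every second.
    $result = $sqs->receiveMessage([
        'QueueUrl'            => $queueUrl,
        'MaxNumberOfMessages' => 10,
        'WaitTimeSeconds'     => 20,
    ]);

    foreach ($result['Messages'] ?? [] as $message) {
        processMessage(json_decode($message['Body'], true)); // hypothetical handler

        // Delete only after successful processing, or the message reappears
        // once its visibility timeout expires.
        $sqs->deleteMessage([
            'QueueUrl'      => $queueUrl,
            'ReceiptHandle' => $message['ReceiptHandle'],
        ]);
    }
}
```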

3 Answers


Application Considerations

There is little need right NOW to pass real-time data back and forth, but this will certainly come up in the future, so I'm hoping to plan for it now

It's difficult to really comment on this without knowing what kind of real-time data you're expecting to pass back and forth.

So what I do need to support is a set of PHP endpoints living on an EC2 server that will ingest JSON requests and perform CRUD updates against the MySQL DB

There are a couple of ways to go with this; I'll get to offloading in a bit. However, if you're just taking reasonably sized JSON data and dumping it into a DB, I would recommend considering Lambda for that instead. Do note, however, that PHP is currently not a supported Lambda runtime, and though there are technically ways to get it running on Lambda, I wouldn't recommend them.
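Whichever compute ends up hosting it, the endpoint style the question describes is small. Here is a minimal sketch of such a PHP endpoint; the DSN, credentials, and the events table are hypothetical placeholders.

```php
<?php
// Minimal JSON-ingest endpoint (sketch). DSN, credentials, and the
// `events` table are hypothetical placeholders.
header('Content-Type: application/json');

$payload = json_decode(file_get_contents('php://input'), true);
if (json_last_error() !== JSON_ERROR_NONE || !isset($payload['type'], $payload['data'])) {
    http_response_code(400);
    echo json_encode(['error' => 'invalid JSON payload']);
    exit;
}

$pdo = new PDO(
    'mysql:host=db.example.internal;dbname=app;charset=utf8mb4',
    'app_user',
    'app_password',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);

// Prepared statement: keeps request-body values out of the SQL text.
$stmt = $pdo->prepare('INSERT INTO events (type, data, created_at) VALUES (?, ?, NOW())');
$stmt->execute([$payload['type'], json_encode($payload['data'])]);

echo json_encode(['id' => $pdo->lastInsertId()]);
```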

DB Considerations

Question: based on the numbers above, am I going to run into problems with MySQL?

For the data layer there are also a number of ways to go. A high volume of writes would be your biggest concern, depending on how write-heavy you end up being. For RDS, AWS offers you provisioned IOPS; DynamoDB gives you provisioned write capacity.

For a read-heavy setup, RDS offers you read replicas. DynamoDB can get you some pretty nice read performance, at the cost of having to deal with denormalized data. DynamoDB also recently gained DAX, a managed caching layer. For general caching, ElastiCache can be utilized.
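If you do go the read-replica route, note that the application has to split the traffic itself: writes to the primary endpoint, reads to a replica endpoint. A minimal sketch with hypothetical hostnames (RDS gives the primary and each replica their own endpoints):

```php
<?php
// Read/write splitting against RDS (sketch). Hostnames and credentials
// are hypothetical; RDS exposes separate primary and replica endpoints.
$options = [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION];

$primary = new PDO('mysql:host=primary.example.rds.amazonaws.com;dbname=app', 'user', 'pass', $options);
$replica = new PDO('mysql:host=replica.example.rds.amazonaws.com;dbname=app', 'user', 'pass', $options);

// Writes always go to the primary.
$primary->prepare('UPDATE orders SET status = ? WHERE id = ?')
        ->execute(['shipped', 42]);

// Reads that can tolerate replication lag go to the replica.
$stmt = $replica->prepare('SELECT id, status FROM orders WHERE customer_id = ?');
$stmt->execute([7]);
$orders = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

One design caveat: replicas are asynchronous, so any read that must see a write you just made (read-after-write) should still go to the primary.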

Regarding your engine question, it comes down to requirements. I'd recommend reading up on the AWS best practices for RDS to see which engine best meets both your durability and performance needs. Beyond the engine choice, you should of course monitor query performance to catch issues such as frequently accessed non-indexed columns.

Compute Offload: SQS

Additionally, I was hoping SQS would help with some of the load, and that having a load balancer in front of the EC2 instances would help with processing time, etc.

One thing I've been reading is that SQS is not a push mechanism, so you're going to have to poll it every second/minute to get messages and proceed.

CloudWatch offers you alarms which can be set to fire at X number of messages in the queue. An alarm can then spawn a coordinator Lambda that passes work off to other Lambdas, depending on how much work you're putting into the queue. Another option is to have Lambda spawn an EC2 worker instance which simply pulls everything down from the queue and passes it off to Lambdas or other EC2 instances.
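For reference, the "alarm at X messages" pattern keys off SQS's ApproximateNumberOfMessagesVisible metric. A sketch of creating such an alarm with the PHP SDK; the queue name, threshold, and the SNS topic that would trigger the coordinator are hypothetical:

```php
<?php
// Alarm on SQS queue depth (sketch). Queue name, threshold, and the SNS
// topic ARN that fans out to a coordinator Lambda are hypothetical.
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;

$cw = new CloudWatchClient(['region' => 'us-east-1', 'version' => 'latest']);

$cw->putMetricAlarm([
    'AlarmName'          => 'example-queue-backlog',
    'Namespace'          => 'AWS/SQS',
    'MetricName'         => 'ApproximateNumberOfMessagesVisible',
    'Dimensions'         => [['Name' => 'QueueName', 'Value' => 'example-queue']],
    'Statistic'          => 'Average',
    'Period'             => 300,               // 5-minute window
    'EvaluationPeriods'  => 1,
    'Threshold'          => 1000,              // the "X number of messages"
    'ComparisonOperator' => 'GreaterThanThreshold',
    // The topic can trigger the coordinator Lambda via an SNS subscription.
    'AlarmActions'       => ['arn:aws:sns:us-east-1:123456789012:spawn-workers'],
]);
```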

Compute Offload: Batch Processing

Another option is to utilize AWS Batch. The way it works is you have compute environments, job queues, and job definitions.

A compute environment is what holds the containers which will run the jobs. It can use on-demand instances if you don't want to think about allocation too much, or Spot Instances if you want the cost-effective route at the risk of being outbid. Another setting here is the minimum and maximum number of vCPUs you want available; AWS Batch will increase and decrease resources within these limits based on the number and size of the jobs.

From there come job queues. These have priorities attached and are bound to a compute environment; the same compute environment can serve multiple job queues. The idea is that this lets you attach importance to your processes: a process around customer orders is something you'd want done a lot faster than, say, a process that just generates thumbnails.

Finally there is the job definition. This indicates what portion of your compute environment is needed to run through a piece of work; it's how you determine the size of jobs. It's important to remember that if you're too greedy with job capacity allocation, you'll get stuck jobs waiting for resources to be freed. For scaling, you either adjust the compute environment's min/max resources, or consider breaking work out into a separate queue with a dedicated compute environment.

As for getting jobs started, you can either have a Lambda submit a job to Batch based on certain events, or have your application use the SDK to submit the job.
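A sketch of the SDK path; the job queue and job definition names are hypothetical and would need to be registered in AWS Batch ahead of time:

```php
<?php
// Submitting an AWS Batch job from application code (sketch). The job
// queue and job definition names are hypothetical and must already exist.
require 'vendor/autoload.php';

use Aws\Batch\BatchClient;

$batch = new BatchClient(['region' => 'us-east-1', 'version' => 'latest']);

$result = $batch->submitJob([
    'jobName'       => 'thumbnails-' . time(),
    'jobQueue'      => 'low-priority',        // lower priority than, say, order processing
    'jobDefinition' => 'generate-thumbnails',
    'containerOverrides' => [
        'environment' => [
            ['name' => 'SOURCE_BUCKET', 'value' => 'example-uploads'],
        ],
    ],
]);

echo 'Submitted job ' . $result['jobId'] . PHP_EOL;
```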

EC2 Instance Considerations

With regards to the EC2 instances, if you do need to go that route, I'd recommend putting a load balancer in front of them. If you know you're going to have spike traffic, AWS can do what's called pre-warming, which in essence gives you reserve capacity to handle the expected load. A load balancer also lets you attach an SSL certificate that handles HTTPS for all the EC2 instances behind it. Not only that, it can terminate SSL so that your EC2 instances only receive HTTP, making their processing lighter. I'd also recommend looking over the ELB best practices as well.
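One practical consequence of terminating SSL at the load balancer: the instances see plain HTTP, so any "is this request secure?" check in PHP has to use the X-Forwarded-Proto header that ELB adds. A sketch:

```php
<?php
// Behind an SSL-terminating ELB the instance sees plain HTTP, so
// $_SERVER['HTTPS'] is unset; ELB forwards the original scheme instead.
$isSecure = (($_SERVER['HTTP_X_FORWARDED_PROTO'] ?? '') === 'https')
    || (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off');

if (!$isSecure) {
    // Send plain-HTTP clients back through HTTPS.
    header('Location: https://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'], true, 301);
    exit;
}

// ELB also forwards the original client IP, which PHP would otherwise
// see as the load balancer's address.
$clientIp = $_SERVER['HTTP_X_FORWARDED_FOR'] ?? $_SERVER['REMOTE_ADDR'];
```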

You should also consider Auto Scaling groups. These let you add and remove capacity depending on demand. Note, however, that since new instances take time to spin up, an ASG will not instantly absorb an unusual traffic spike. ASGs also run health checks on your instances and will replace any instance that fails them; the ELB likewise does health checks and will pull a bad instance out of rotation.
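Both the ELB and ASG health checks can point at a cheap application-level endpoint. A sketch of a health.php that fails fast when MySQL is unreachable; the DSN and credentials are hypothetical:

```php
<?php
// health.php (sketch): a target for ELB/ASG health checks. Returns 200
// when the app can reach MySQL, 503 otherwise, so bad instances get
// pulled from rotation or replaced. DSN and credentials are hypothetical.
try {
    $pdo = new PDO(
        'mysql:host=db.example.internal;dbname=app',
        'health_user',
        'health_password',
        [PDO::ATTR_TIMEOUT => 2, PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
    );
    $pdo->query('SELECT 1');
    http_response_code(200);
    echo 'OK';
} catch (PDOException $e) {
    http_response_code(503);
    echo 'DB unreachable';
}
```

One design note: tying instance health to the DB means a DB outage pulls the whole fleet out of rotation, so whether to include that check is a judgment call.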

Next, you should evaluate Reserved Instances. If particular EC2 instances are going to be up constantly, they save you money over time versus on-demand. Just be aware that the biggest savings come when you pay up front on 1-3 year commitments. At the same time, a Reserved Instance gives you guaranteed capacity. This is especially important for cases where an AZ fails and demand surges in the region's other AZs: reserved capacity takes priority over on-demand launches when capacity gets tight.

CDN Static Asset Caching

For high volume on static assets, consider CloudFront. The easiest way is to set up an S3 bucket, load your assets into it, enable static website hosting on it, and point CloudFront at it. (The bucket itself can live in any region; it's the ACM certificate for a custom CloudFront domain that has to be issued in us-east-1.) CloudFront will then serve your static assets from edge locations in the regions you specify, and by default it caches them for 24 hours before checking to see if anything has changed.
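When loading assets into the bucket, it also helps to set Cache-Control explicitly rather than leaning on CloudFront's 24-hour default TTL. A sketch with a hypothetical bucket and file paths:

```php
<?php
// Uploading a static asset with an explicit Cache-Control header (sketch).
// Bucket name and file paths are hypothetical.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

$s3->putObject([
    'Bucket'       => 'example-static-assets',
    'Key'          => 'css/app.css',
    'SourceFile'   => __DIR__ . '/build/app.css',
    'ContentType'  => 'text/css',
    // CloudFront and browsers honor this instead of the 24-hour default TTL.
    'CacheControl' => 'public, max-age=86400',
]);
```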

Note that you really should serve CloudFront assets from a subdomain so that it only handles requests for static content. If you have it front the entire domain, you're adding yet another hop for dynamic content traffic to go through.

Monitoring Considerations

To close out: in all cases, set up CloudWatch metrics for the services you utilize. Don't just stand up the infrastructure and assume everything is okay. You always need to be making sure things are running smoothly, and act on bottlenecks as soon as possible. I'd also recommend looking over the full range of AWS services and keeping in mind which parts of your app could be offloaded to them. Knowledge is power.
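Beyond the built-in service metrics, the application can publish its own. A sketch pushing a hypothetical ingest-latency metric from PHP; the namespace and metric name are made up for illustration:

```php
<?php
// Publishing a custom application metric to CloudWatch (sketch). The
// namespace and metric name are hypothetical.
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;

$cw = new CloudWatchClient(['region' => 'us-east-1', 'version' => 'latest']);

$start = microtime(true);
// ... handle the request / write to MySQL ...
$elapsedMs = (microtime(true) - $start) * 1000;

$cw->putMetricData([
    'Namespace'  => 'ExampleApp',
    'MetricData' => [[
        'MetricName' => 'IngestLatency',
        'Value'      => $elapsedMs,
        'Unit'       => 'Milliseconds',
    ]],
]);
```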




Ways To Fix AWS S3 Bucket Problems:


Of late, Twitter has been bombarded with discussion of how to fix AWS's S3 bucket problem. This has countless professionals working in the cloud asking: are there further improvements that can be made to AWS?

Hands down, AWS has brought the cloud computing industry to an all-new height. While it has improved S3 bucket security, we still hear about and witness breaches, which is alarming and has us all thinking that something more is still needed.

This has many professionals working in the cloud fired up to figure out how to fix the issue. With that in mind, I have compiled some practices that could get this right. These include:

  1. Decouple public access from the bucket: If an AWS user cannot make a bucket itself public, the problem disappears. However, there are edge cases where public buckets are genuinely needed, so one proposal is a new AWS service, call it "bucket-exposing endpoints", that creates a separate public endpoint for an existing S3 bucket (like CloudFront, but without the extra CDN components). Such a service would have its own permissions, its own pages in the UI, and a separate namespace.
  2. Merge ACLs into bucket policies: When developers are confused they make mistakes, and in the AWS ecosystem very little explains the inner workings of S3 bucket policies versus ACLs. If AWS users were asked to vote, most would agree that getting ACLs off the bucket would be a huge help, since ACLs behave differently from policies where granting permissions is concerned.
  3. A CLI-only setting for public buckets: In my opinion, if the first point cannot be put into action, the next best alternative is to remove the public-access setting from the UI entirely. That may sound drastic, but it is not without precedent.
  4. Let account owners make the decision: The authority to make S3 buckets public should not rest with any individual developer. Furthermore, Amazon could ship every new account with a global "prevent S3 buckets from being public" setting that end users must explicitly disable, with an MFA device required before it can be disabled (see the sketch after this list).
  5. Require two developers to sign off: S3 permissions are serious business. For accounts with more than one developer, if any one developer attempts to make a bucket public, an email confirmation should go to the admins to validate the action.
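For what it's worth, AWS has since shipped account- and bucket-level settings along the lines of point 4, under the name S3 Block Public Access. A sketch of turning them on for a hypothetical bucket via the PHP SDK:

```php
<?php
// Enabling S3 Block Public Access on a bucket (sketch). This feature
// shipped after this answer was written; the bucket name is hypothetical.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

$s3->putPublicAccessBlock([
    'Bucket' => 'example-bucket',
    'PublicAccessBlockConfiguration' => [
        'BlockPublicAcls'       => true,  // reject new public ACLs
        'IgnorePublicAcls'      => true,  // neutralize existing public ACLs
        'BlockPublicPolicy'     => true,  // reject public bucket policies
        'RestrictPublicBuckets' => true,  // cut off existing public policies
    ],
]);
```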

In the end, I would say that AWS is doing a commendable job creating services with excellent security options. The fact that developers still configure unsecured buckets goes to show that they aren't reading those options; they are more worried about shipping something polished.



If you intend to run this securely, then options 2 and 3 are not feasible.

Option 2 includes Kinesis, which is unencrypted and hence insecure. Option 3 uses DynamoDB Streams, which cannot be locked down to a Virtual Private Cloud endpoint.
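For context, Kinesis added server-side encryption with KMS in mid-2017, which addresses the first objection. A sketch of enabling it on a hypothetical stream via the PHP SDK:

```php
<?php
// Enabling server-side encryption on a Kinesis stream (sketch). Kinesis
// gained KMS-based SSE in July 2017; the stream name is hypothetical.
require 'vendor/autoload.php';

use Aws\Kinesis\KinesisClient;

$kinesis = new KinesisClient(['region' => 'us-east-1', 'version' => 'latest']);

$kinesis->startStreamEncryption([
    'StreamName'     => 'example-stream',
    'EncryptionType' => 'KMS',
    'KeyId'          => 'alias/aws/kinesis', // the AWS-managed Kinesis key
]);
```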

