AWS Load Balancer 502

Question

I have microservices (written in different programming languages) running on an EC2 instance. In production, I see occasional 502 Bad Gateway errors when these services call each other. The logs of the called service show no sign that the API request ever arrived.

For example, service A calls service B, but nothing in service B's logs indicates that a call came from service A.

Could this be an AWS load balancer issue? Any help would be appreciated. Thanks in advance.

Solution tried: We set up HTTP/HTTPS connection agents in each service, but we still get this issue.

Update: In the LB logs, the API request is logged, but the target response code shows "-" while the LB response code shows 502 or 504. Does that mean the LB cannot handle the traffic, or that my application cannot?

Also, what would be a possible solution?
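To narrow down which requests fail before the target answers, the LB access logs can be filtered for exactly the pattern described in the update. The following is only a rough sketch: it assumes the Application Load Balancer access log layout (space-separated fields, with `elb_status_code` and `target_status_code` as the 9th and 10th fields), and the log lines in `SAMPLE_LINES` are invented for illustration. A Classic ELB uses a different layout, so the field positions would need adjusting.

```python
import shlex
from typing import Dict, Iterable, Iterator

# Field positions, assuming the ALB access log layout (0-based).
TIME = 1
TARGET = 4
ELB_STATUS = 8
TARGET_STATUS = 9
REQUEST = 12


def failed_before_target(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield entries where the LB returned 502/504 and no target status was recorded."""
    for line in lines:
        try:
            # shlex keeps quoted fields such as the request line in one piece.
            fields = shlex.split(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if len(fields) <= REQUEST:
            continue  # not a complete log entry
        if fields[TARGET_STATUS] == "-" and fields[ELB_STATUS] in {"502", "504"}:
            yield {
                "time": fields[TIME],
                "target": fields[TARGET],
                "elb_status": fields[ELB_STATUS],
                "request": fields[REQUEST],
            }


# Invented sample entries (hypothetical names, addresses, and IDs).
SAMPLE_LINES = [
    'http 2018-01-09T12:25:00.000000Z app/my-lb/50dc6c495c0c9188 '
    '10.0.0.5:41234 10.0.1.7:8080 0.000 0.001 -1 502 - 120 293 '
    '"GET http://service-b.internal:80/api/items HTTP/1.1" "python-requests/2.18" - - '
    'arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-targets/73e2d6bc24d8a067 '
    '"Root=1-58337262-36d228ad5d99923122bbe354"',
    'http 2018-01-09T12:25:01.000000Z app/my-lb/50dc6c495c0c9188 '
    '10.0.0.5:41236 10.0.1.7:8080 0.000 0.012 0.000 200 200 120 512 '
    '"GET http://service-b.internal:80/api/items HTTP/1.1" "python-requests/2.18" - - '
    'arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-targets/73e2d6bc24d8a067 '
    '"Root=1-58337262-36d228ad5d99923122bbe355"',
]

if __name__ == "__main__":
    for hit in failed_before_target(SAMPLE_LINES):
        print(hit["time"], hit["target"], hit["elb_status"], hit["request"])
```

Grouping the hits by target and time can show whether the failures cluster on one instance or around deployments and instance shutdowns.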

You can enable LB logs. If traffic passes through the load balancer correctly, you will see it in the output. Otherwise, post the logs here.
– Kush Vyas, Commented Nov 2, 2017 at 9:35
In the LB logs, the API request is logged, but the target response code shows "-" while the LB response code shows 502 or 504. Does that mean the LB cannot handle the traffic, or that my application cannot? @KushVyas
– rajat12a, Commented Jan 9, 2018 at 12:25
@Root We have exactly the same problem. Do you still have it, or did you find a solution?
– Jan Dörrenhaus, Commented Apr 24, 2018 at 14:37

Jan Dörrenhaus · Accepted Answer · 2018-05-04 11:56:24Z

31

We had the same problem.

In our setup, an AWS Application ELB has a target group of 4 EC2 instances. On each of the EC2 instances, there is an Apache2 which forwards to a Tomcat.

The ELB has a default idle connection timeout of 60 seconds. Apache2 has a default KeepAliveTimeout of 5 seconds. Once a connection has been idle for those 5 seconds, Apache2 closes it on its side. However, if a request comes in at precisely the right time, the ELB accepts it, decides which host to forward it to, and in that moment Apache closes the connection the ELB was about to reuse. This results in said 502 error code.

The solution is: when you have cascading proxies/LBs, either align their KeepAlive timeouts or, preferably, make them a little longer the further down the line you get.

We set the ELB timeout to 60 seconds and the Apache2 timeout to 120 seconds. Problem gone.
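The rule of thumb above can be expressed as a small check over the proxy chain. This is a minimal sketch, not part of our actual setup: the function name `find_timeout_violations` and the chain representation are made up for illustration, and the timeout values have to be read from your own ELB and Apache configuration.

```python
from typing import List, Tuple

# A chain is ordered from the outermost hop (closest to the client)
# to the innermost hop; each entry is (name, idle/keep-alive timeout in seconds).
Chain = List[Tuple[str, float]]


def find_timeout_violations(chain: Chain) -> List[str]:
    """Report every hop that closes idle connections sooner than the hop in front of it."""
    problems = []
    for (up_name, up_timeout), (down_name, down_timeout) in zip(chain, chain[1:]):
        # The downstream hop should keep idle connections open at least as long
        # as the upstream hop; otherwise the upstream hop may reuse a connection
        # that is being closed.
        if down_timeout < up_timeout:
            problems.append(
                f"{down_name} ({down_timeout}s) closes idle connections "
                f"before {up_name} ({up_timeout}s)"
            )
    return problems


if __name__ == "__main__":
    before = [("ELB", 60), ("Apache2", 5)]    # the defaults described above
    after = [("ELB", 60), ("Apache2", 120)]   # the values we ended up with
    print(find_timeout_violations(before))
    print(find_timeout_violations(after))
```

Equal timeouts pass this check because aligning them is one of the two options above. Making the inner hop strictly longer leaves more margin.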


7 Comments

rajat12a Over a year ago

We figured out the issue in our system. It was caused by EC2 instances shutting down immediately instead of waiting for the draining period. We already had the ELB set to 60 seconds and Apache at 120 seconds.

Naga Over a year ago

We currently have the same issue. When this happens, is anything logged on the Apache side?

Jan Dörrenhaus Over a year ago

@Naga We didn't, no. Because the Apache does not notice anything being wrong. The ELB access logs show the request with the 502 status code, and the Apache access logs show nothing.

Naga Over a year ago

@Jan Thank you for the information! It's the same for us, actually. I checked the Apache access log and error log but could not find anything. We will try the same settings as you and see how it goes.

aknosis Over a year ago

This was so difficult to figure out. Thanks for this Q/A. This resolved my problem as soon as I increased the KeepAliveTimeout.


Scott Krager · Accepted Answer · 2021-12-04 05:02:01Z

1

Health checks use HTTP/2. I got my EC2 instances running NGINX to report as healthy by adding http2 to the listen directive for port 80.

listen 80 default_server http2;
