
I’m trying to route WebSocket connections deterministically to the same backend pod (in a Kubernetes Deployment) based on a query parameter (room/league id). This works with a single ingress-nginx pod, but becomes inconsistent once I scale the ingress controller to multiple nginx pods.

Topology

Kubernetes (EKS). Ingress: ingress-nginx (multiple controller/nginx pods). One Deployment exposes two ports behind one Service, with two Ingresses:

  • REST (list rooms)
  • WebSocket (/websocket) that creates/joins a room stored in-memory on the selected pod

Goal

All clients connecting with the same league_id should land on the same backend pod (room state is in-memory).

Current behavior

  • With 1 ingress-nginx pod: zero latency; creator and joiners always land on the same backend pod.
  • With >1 ingress-nginx pods: the second user often lands on a different backend pod and the connection fails; we added client retries to “eventually” hit the right one, which adds latency. Sometimes even the first user experiences latency.

Minimal Ingress (WebSocket) config

Using consistent hashing by query param via upstream-hash-by. Separate Ingress for WebSockets to avoid rewrite issues.

  annotations:
    nginx.ingress.kubernetes.io/affinity: cookie
    nginx.ingress.kubernetes.io/affinity-mode: persistent
    nginx.ingress.kubernetes.io/large-client-header-buffers: 4 32k
    nginx.ingress.kubernetes.io/proxy-buffer-size: 32k
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "3950"
    nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3950"
    nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3950"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
    nginx.ingress.kubernetes.io/session-cookie-name: myservice-session
    nginx.ingress.kubernetes.io/upstream-hash-by: $arg_league_id
    ....

Repro

  1. User A connects: wss://api.example.com/websocket?league_id=123 → room created on Pod X.
  2. User B connects: wss://api.example.com/websocket?league_id=123 → intermittently routed to Pod Y when multiple ingress-nginx pods exist; after retries, eventually reaches Pod X.
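
For reference, this is roughly the client used in the repro, as a minimal sketch (it assumes the Node.js "ws" package; the URL and query parameter are taken from the question):

  // repro-client.ts: connect to the WebSocket endpoint with a fixed league_id
  // (assumes the "ws" npm package: npm install ws)
  import WebSocket from "ws";

  const leagueId = "123";
  const ws = new WebSocket(`wss://api.example.com/websocket?league_id=${leagueId}`);

  ws.on("open", () => {
    // User A and User B both run this with the same league_id and should
    // end up on the same backend pod.
    console.log("connected");
  });

  ws.on("error", (err) => {
    // With multiple ingress-nginx replicas this fires intermittently,
    // because the hash can map to a different backend pod on another replica.
    console.error("connection failed:", err.message);
  });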

What I’ve tried:

  • nginx.ingress.kubernetes.io/load-balance: hash
  • nginx.ingress.kubernetes.io/upstream-hash-by: $arg_league_id
  • Separate Ingress object for WebSockets (no rewrite)
  • Verified the Service exposes the correct ws port
  • Observed that a single ingress replica fixes the issue; multiple replicas reintroduce the inconsistency.
  • Changed nginx versions
  • Changed the Deployment into a StatefulSet (didn’t help; it was a desperate attempt, I had run out of ideas)
  • Removed the affinity annotation (didn’t help either)
  • Removed the session cookie affinity

Questions:

  1. Is upstream-hash-by expected to be consistent across multiple ingress-nginx replicas, or is it only deterministic within a single nginx instance?
  2. How can I guarantee identical upstream selection across all ingress-nginx pods?
  • Is there a way to enforce stable upstream peer ordering so the hash mapping matches on every ingress pod?
  • Do I need to avoid service-level load-balancing (e.g., ensure service-upstream=false) or set any specific annotation to keep hashing at the pod endpoint level?
  • Should I enable the “consistent” hash ring behavior (if supported) or use upstream-hash-by-subset annotations?
  3. If this can’t be made truly consistent at the ingress layer, is the recommended approach to externalize room state (e.g., Redis) and/or add a reverse proxy for the WebSockets that does the smart routing further back?

Environment: EKS 1.31; ingress-nginx controller 1.13.2 (1.9.5 was also unsuccessful); NLB; backend: Node.js.

1 Answer

nginx.ingress.kubernetes.io/upstream-hash-by is deterministic, but it only works consistently inside a single NGINX instance. Once you scale ingress-nginx to multiple controller pods, each pod builds its own upstream server list from the Kubernetes endpoints API, and the ordering of those backends is not guaranteed to be identical across replicas.

That’s why you see it “just work” with one ingress pod, but as soon as you add more, the same hash value may map to different pods depending on how each controller ordered the upstream list.


Why this happens

  • upstream-hash-by hashes the key (your league_id) → picks a backend index.

  • If ingress A has upstream list [pod1, pod2, pod3] and ingress B has [pod2, pod3, pod1], the same hash will point to different pods (see the sketch after this list).

  • Kubernetes doesn’t guarantee stable ordering of endpoints, so across replicas the mapping diverges.
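
A toy sketch of that divergence (illustrative only; it is not the exact hash ingress-nginx uses, but the ordering problem it shows is the same):

  // hash-divergence.ts: why the same key can map to different pods when two
  // proxies hold the same endpoints in a different order (toy modulo hashing).
  import { createHash } from "crypto";

  function pickBackend(key: string, upstreams: string[]): string {
    // Hash the key and pick an index into the upstream list.
    const digest = createHash("md5").update(key).digest();
    const index = digest.readUInt32BE(0) % upstreams.length;
    return upstreams[index];
  }

  const key = "123"; // league_id

  // Same endpoints, different order on each ingress replica.
  const ingressA = ["pod1", "pod2", "pod3"];
  const ingressB = ["pod2", "pod3", "pod1"];

  console.log(pickBackend(key, ingressA)); // "pod3"
  console.log(pickBackend(key, ingressB)); // "pod1": same key, different pod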


What you can do

Short-term workarounds

  1. Try nginx.ingress.kubernetes.io/service-upstream: "true"
    This makes ingress point to the Service ClusterIP instead of embedding every pod in the upstream block. Note that the upstream then contains a single entry, so kube-proxy (not the hash) picks the actual pod; test carefully whether routing stays deterministic enough for your case, and you also lose some pod-level health awareness.

  2. Cookie/session affinity
    Works if you only need the same client to reconnect to the same pod, but won’t group multiple clients with the same league_id.

  3. Subset / consistent hashing features
    In newer ingress-nginx versions, there are annotations like upstream-hash-by-subset. They can reduce churn, but they don’t fully solve cross-replica upstream ordering issues.


Robust solution (what most production systems do):

  • Externalize the room state into Redis, DynamoDB, or similar.
    Then it doesn’t matter which backend pod a user lands on — they all pull/join state from the shared store. This completely removes the need for deterministic sticky routing and scales much better.
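
    As a rough illustration on the Node.js side, here is a minimal sketch assuming the "ioredis" client and a shared Redis instance; the key names and helper functions are illustrative, not from the question:

  // rooms.ts: room state in Redis instead of pod memory, so any backend pod
  // can serve any league_id (assumes the "ioredis" client; names illustrative).
  import Redis from "ioredis";

  const redisUrl = process.env.REDIS_URL ?? "redis://localhost:6379";
  const redis = new Redis(redisUrl);

  // Join a room: register the member in a shared set keyed by league_id.
  export async function joinRoom(leagueId: string, userId: string): Promise<void> {
    await redis.sadd(`room:${leagueId}:members`, userId);
  }

  // Publish a message so members connected to *other* pods receive it too.
  export async function broadcast(leagueId: string, message: string): Promise<void> {
    await redis.publish(`room:${leagueId}`, message);
  }

  // Each pod subscribes for the rooms of its locally connected sockets
  // and relays incoming messages to them.
  export function subscribeToRoom(leagueId: string, onMessage: (msg: string) => void): void {
    const sub = new Redis(redisUrl); // subscriber mode needs its own connection
    sub.subscribe(`room:${leagueId}`);
    sub.on("message", (_channel, msg) => onMessage(msg));
  }

    With something like this in place, the hashing and affinity annotations stop being a correctness requirement; at most they help with locality.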

If you really must keep in-memory room state, your only options are:

  • Run a single ingress controller (sacrifices HA), or

  • Introduce a dedicated routing layer (Envoy/Traefik/etc.) that supports cluster-wide consistent hashing with a shared control plane.
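
To illustrate the property such a routing layer needs, here is a minimal rendezvous (highest-random-weight) hashing sketch: selection depends only on the endpoint set, not on list order. This is just to show the idea; it is not something ingress-nginx provides.

  // rendezvous.ts: order-independent backend selection. Every router that sees
  // the same endpoint set picks the same pod, regardless of list order.
  import { createHash } from "crypto";

  function score(key: string, endpoint: string): bigint {
    const digest = createHash("md5").update(`${key}|${endpoint}`).digest();
    return digest.readBigUInt64BE(0);
  }

  export function pickBackend(key: string, endpoints: string[]): string {
    let best = endpoints[0];
    let bestScore = score(key, best);
    for (const ep of endpoints.slice(1)) {
      const s = score(key, ep);
      // Highest score wins; tie-break on name so the ordering is total.
      if (s > bestScore || (s === bestScore && ep > best)) {
        best = ep;
        bestScore = s;
      }
    }
    return best;
  }

  // Both orderings select the same pod for league_id "123":
  console.log(pickBackend("123", ["pod1", "pod2", "pod3"]));
  console.log(pickBackend("123", ["pod2", "pod3", "pod1"]));

Proxies with consistent-hashing policies such as Envoy's ring hash or Maglev achieve the same order-independence in a production-grade way.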


Answers to your questions

  • Is upstream-hash-by consistent across replicas? → No, only within a single ingress pod.

  • Can you guarantee identical upstream selection? → Not with multiple ingress-nginx replicas; the endpoint list ordering is not stable.

  • Stable upstream peer ordering? → Not guaranteed by Kubernetes.

  • Should you avoid service-level load balancing? → You can try service-upstream: "true", but test carefully.

  • Enable “consistent” hash ring? → Only helps if the upstream lists are identical.

  • If not possible, what’s recommended? → Externalize room state (Redis, etc.).


1 Comment

Hello there. This answer seems to have been created by generative AI, which is against StackOverflow's policy: stackoverflow.com/help/gen-ai-policy
