I’m working on a FastAPI service deployed with Uvicorn using multiple workers to handle voice communication with Twilio, and I’m running into a routing problem.

Current architecture:

  1. A client sends a POST request to the FastAPI endpoint (POST /call).
  2. The API publishes a message to a RabbitMQ queue, which is consumed later.
  3. The consumer, which runs in the same app, picks up the message and processes it to create the call with Twilio.
  4. The consumer initiates a Twilio call, dynamically generating a webhook URL like /webhook/{task_id} to handle the callback.
  5. Twilio sends a callback to that URL, which is then processed and forwarded via WebSocket. [...] All other processing is done within the WS handling.

Problem: Because the app runs with multiple Uvicorn workers, Twilio’s callback (webhook) may hit any worker, not necessarily the one that initiated the call or "preloaded" the resources.

I would like to pre-initialize external resources (e.g., OpenAI clients, TTS/STT models, etc.) in specific workers before the call starts, and then ensure that Twilio's webhook is routed to the same worker that initiated the session. Basically, warm up or pre-instantiate resources per worker so that worker receives the call.


Is there a practical or recommended way to ensure webhook requests are consistently routed to the same Uvicorn/FastAPI worker that initiated the call?

I've considered load-balancer-level solutions, and external caching with Redis shared between the workers.

1 Answer

You have N workers with names 1 .. N, and they’re not interchangeable once a workflow has begun. So send work pieces to specific named workers.

At the very beginning, the client asks a load-balancing service to assign an available worker. Thereafter, the queue channel identifiers and the URL structure explicitly name the single worker that is eligible to handle that item of processing. Other possibilities include DB queries, mapping unique TCP ports to individual workers, and waiting on a Redis key that mentions the worker name.
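A minimal sketch of that assignment step, using an in-memory dict as a stand-in for Redis (the key pattern and URL shape are illustrative, not prescribed):

```python
# Assign each new task to a named worker; later steps look the worker
# up by task_id and address it explicitly.
import itertools

N_WORKERS = 4
_round_robin = itertools.cycle(range(1, N_WORKERS + 1))
assignments = {}  # stand-in for Redis; real code would use r.set()/r.get()

def assign_worker(task_id: str) -> int:
    """Pick the next worker round-robin and record the mapping."""
    worker = next(_round_robin)
    assignments[f"task:{task_id}:worker"] = worker
    return worker

def webhook_url(task_id: str) -> str:
    """Every subsequent step names the assigned worker in the URL."""
    worker = assignments[f"task:{task_id}:worker"]
    return f"https://calls.example.invalid/worker/{worker}/webhook/{task_id}"
```

With a real Redis behind `assignments`, any process (including the load balancer) can resolve which worker owns a given task.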


For the initial interaction you can use uvicorn in the usual way, load balancing across a bunch of anonymous interchangeable workers. But for subsequent steps in the workflow, you have explained that you cannot use uvicorn's load balancing. If you really want workers running under uvicorn, then you might configure N uvicorns, each with a single worker: the first listens on TCP port 8001, the second on 8002, up through port 800N for worker N. Then the URL you send a given step to addresses the worker that is relevant for that step in the workflow.
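A sketch of such a launcher (the `app:app` module path is an assumption; adapt it to your project):

```python
# Spawn N single-worker uvicorn processes, one per port 8001..800N,
# so each worker is individually addressable by port.
import subprocess
import sys

def worker_port(worker_id: int, base: int = 8000) -> int:
    """Worker i listens on base + i, e.g. worker 1 on port 8001."""
    return base + worker_id

def launch_workers(n: int):
    return [
        subprocess.Popen([
            sys.executable, "-m", "uvicorn", "app:app",
            "--port", str(worker_port(wid)), "--workers", "1",
        ])
        for wid in range(1, n + 1)
    ]
```

A reverse proxy (nginx, HAProxy) in front can then route `/worker/{i}/...` paths to port `8000 + i`.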

It's not clear that running under uvicorn is a big win here, even if you've already written some code that expects that environment. A more natural way to send a step to a worker would be via Kafka pub-sub or via Valkey, using N channel names or N Redis keys. Some folks like to schedule tasks with Celery. Or you might use DJB's daemontools to manage workers listening for requests on ports 8001 .. 800N, or polling for their tasks.

Most folks adopt a different design to avoid getting into the situation you're in. They will typically serialize what is needed into a central store like Valkey or an RDBMS, and then passing around a guid is enough for any anonymous worker to look up the guid and deserialize the state needed for the next step. This can be combined with having the load balancer send requests to hash(guid) % N, which usually makes subsequent steps go to the same worker, so we can cheaply verify a cache hit and avoid re-looking up, deserializing, and rebuilding the needed state. The downside is that the designated worker may already be busy with another task that hashed to it, so our P95 latency goes up. And of course when we go from, say, 5 to 4 workers and hash modulo 4, it will take a moment before we start seeing a good cache hit ratio again.
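The hash(guid) % N routing boils down to something like this. Note that Python's built-in `hash()` is salted per process, so a stable digest is needed for the mapping to agree across workers and restarts:

```python
# Map a session guid to one of n_workers deterministically.
import hashlib

def worker_for(guid: str, n_workers: int) -> int:
    # sha256 gives the same answer in every process, unlike hash()
    digest = hashlib.sha256(guid.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_workers
```

Because the mapping is deterministic, repeated steps for the same guid land on the same worker and its warm cache. When N changes, most guids remap, which is the temporary cache-hit dip described above; consistent-hashing schemes exist to shrink that remapping if it matters.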

2 Comments

In some parts it makes sense, but the main problem is that uvicorn workers are independent processes; there's no built-in way to address them individually.
It's common to have uvicorn load balance across anonymous worker processes. But that's not your Use Case, which is why I suggested "mapping unique TCP ports to individual workers".
