I’m working on a FastAPI service deployed with Uvicorn using multiple workers to handle voice communication with Twilio, and I’m running into a routing problem.
Current architecture:
- A client sends a POST request to the FastAPI endpoint (POST /call).
- The API publishes a message to a RabbitMQ queue, to be consumed later.
- The consumer, which runs inside the same app, picks up the message and starts processing it to create the call with Twilio.
- The consumer initiates a Twilio call, dynamically generating a webhook URL like /webhook/{task_id} to handle the callback.
- Twilio sends a callback to that URL, which is then processed and forwarded via WebSocket. [...] All remaining processing happens inside the WebSocket handler.
Problem: Because the app runs with multiple Uvicorn workers, Twilio’s callback (webhook) may hit any worker, not necessarily the one that initiated the call or "preloaded" the resources.
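To illustrate the problem, here is a minimal, self-contained simulation (not my real code). The `Worker` class and its method names are placeholders I made up; each instance stands in for one Uvicorn worker process with its own private memory.

```python
from typing import Dict, Optional


class Worker:
    """Stand-in for one Uvicorn worker process with its own private memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        # In the real app this would hold warmed-up clients/models per task.
        self.sessions: Dict[str, str] = {}

    def start_call(self, task_id: str) -> str:
        # The consumer in this worker preloads resources, then creates the call
        # and hands Twilio a webhook URL for this task.
        self.sessions[task_id] = f"resources warmed up in {self.name}"
        return f"/webhook/{task_id}"

    def handle_webhook(self, task_id: str) -> Optional[str]:
        # Only the worker that ran start_call() has an entry for this task.
        return self.sessions.get(task_id)


workers = [Worker("worker-0"), Worker("worker-1")]
workers[0].start_call("task-123")

hit = workers[0].handle_webhook("task-123")   # callback lands on the initiating worker
miss = workers[1].handle_webhook("task-123")  # callback lands on a different worker
print(hit)
print(miss)
```

When the callback lands on `worker-1`, the lookup returns `None`, which is the situation I am trying to avoid.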
I would like to pre-initialize external resources (e.g., OpenAI, TTS, and STT models) in a specific worker before the call starts, and then ensure that Twilio’s webhook is routed to the same worker that initiated the session. In other words, I want to warm up or pre-instantiate resources in a given worker so that it is ready to receive the call.
Is there a practical or recommended way to ensure webhook requests are consistently routed to the same Uvicorn/FastAPI worker that initiated the call?
I've considered load-balancer-level solutions, as well as external caching with Redis shared between the workers.
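To make the load-balancer idea concrete, this is a rough sketch of what I have in mind. It rests on assumptions that are not part of my current setup: each worker would run as its own single-process Uvicorn instance on `BASE_PORT + index`, the webhook path would change from /webhook/{task_id} to /webhook/w{index}/{task_id}, and a proxy in front would use something like `upstream_port_for` to pick the upstream. `BASE_PORT`, `PUBLIC_BASE_URL`, and both function names are hypothetical.

```python
import re
from typing import Optional

BASE_PORT = 8000                          # assumption: worker i listens on BASE_PORT + i
PUBLIC_BASE_URL = "https://example.com"   # placeholder public hostname

# Matches the hypothetical path layout /webhook/w<index>/<task_id>
_WEBHOOK_PATH = re.compile(r"^/webhook/w(\d+)/[^/]+$")


def build_webhook_url(task_id: str, worker_index: int) -> str:
    """Webhook URL the consumer would give Twilio, tagged with its own worker index."""
    return f"{PUBLIC_BASE_URL}/webhook/w{worker_index}/{task_id}"


def upstream_port_for(path: str) -> Optional[int]:
    """Port of the worker that owns this webhook path, or None if the path is untagged."""
    match = _WEBHOOK_PATH.match(path)
    if match is None:
        return None
    return BASE_PORT + int(match.group(1))


url = build_webhook_url("task-123", worker_index=2)
port = upstream_port_for("/webhook/w2/task-123")
print(url, port)
```

The worker index travels inside the URL, so the routing layer would not need any shared state, but it does mean giving up Uvicorn's built-in multi-worker mode in favor of separately addressed processes.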