I'm using the RabbitMQ.Client NuGet package to publish messages to RabbitMQ from a .NET Core 3.1 application. We are using version 5.1.0 of the library.

We want to improve the resiliency of our application, so we are exploring the possibility of defining a retry policy for sending messages via the IModel.BasicPublish method. We are going to use the Polly NuGet package to define the retry policy.

The whole point of a retry policy is to retry a failed operation when the failure is deemed to be transient. What I'm trying to understand is how to identify a transient error in this context.

Based on my understanding, all the exceptions thrown by RabbitMQ.Client derive from the RabbitMQClientException custom exception. The point is that the library defines several exception types which derive from RabbitMQClientException, see here for the full list.

I didn't find any specific documentation on this, but from reading the code on GitHub it seems that the only custom exception thrown by the library when a message is published is AlreadyClosedException. This happens when the connection used to publish the message is already closed. I don't think that retrying makes sense in this case: the connection is already closed, so there is no way to overcome the error by simply retrying the operation.

So my question is: which exception types should I handle in the Polly retry policy that I want to use to execute the IModel.BasicPublish call? Put another way, which exception types thrown by IModel.BasicPublish represent transient errors?

  • 1
    Have you read this part of the documentation? Commented Sep 3, 2021 at 5:23
  • 2
    In this Microsoft sample application they handle BrokerUnreachableException and SocketException. Commented Sep 3, 2021 at 5:28
  • 1
    So, this problem has two sides: one relates to the underlying connection, the other to the publish operation itself. In the latter case the broker can send basic.nack back to the producer, and you might need to republish the message yourself. In the former case the connection may or may not be re-established automatically, but as you said, the pending operations might be discarded, which means you need a manual retry. Which one do you want to solve? Commented Sep 3, 2021 at 12:03
  • 1
    @PeterCsala after reading this documentation rabbitmq.com/confirms.html#server-sent-nacks I can confirm that I'm interested in handling only the publish errors related to connection troubles. The basic.nack can be issued by the message broker too, as you pointed out, but it seems to be a corner case. Quote from the docs: "basic.nack will only be delivered if an internal error occurs in the Erlang process responsible for a queue." So, at least for the first iteration, I want to focus on the connection errors only. Commented Sep 3, 2021 at 12:27
  • 1
    In this case I would start with the following exceptions: BrokerUnreachableException, ConnectFailureException and OperationInterruptedException. The other exceptions do not seem to be transient ones. In other words, re-publishing the same message will not change the outcome (as with ProtocolVersionMismatchException). I would also capture all RabbitMQClientException instances in order to analyse their frequency and distribution. Commented Sep 3, 2021 at 13:23
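
To make the suggestion in the last comment concrete, here is a minimal sketch of a Polly policy that retries only on those three exception types. The class name `ResilientPublisher`, the retry count and the backoff delays are illustrative assumptions, not recommendations from the RabbitMQ or Polly documentation. One caveat: as far as I can tell from the 5.x source, AlreadyClosedException derives from OperationInterruptedException, so this policy would also retry on it. That only helps if the retried action can obtain a usable channel, for example through automatic connection recovery or by re-creating the connection.

```csharp
using System;
using Polly;
using Polly.Retry;
using RabbitMQ.Client.Exceptions;

// Hypothetical helper class; the name and default values are assumptions.
public static class ResilientPublisher
{
    // Builds a synchronous retry policy with exponential backoff that handles
    // only the connection-related exception types named in the comments.
    public static RetryPolicy CreateRetryPolicy(int retryCount = 3, int baseDelayMs = 200)
    {
        return Policy
            .Handle<BrokerUnreachableException>()
            .Or<ConnectFailureException>()
            .Or<OperationInterruptedException>() // assumed to include AlreadyClosedException
            .WaitAndRetry(
                retryCount,
                // Delay grows as baseDelayMs * 1, 2, 4, ... per attempt.
                attempt => TimeSpan.FromMilliseconds(baseDelayMs * Math.Pow(2, attempt - 1)),
                // Log each retry so the frequency of each exception type can be analysed.
                (exception, delay) => Console.Error.WriteLine(
                    $"Publish failed with {exception.GetType().Name}; retrying in {delay.TotalMilliseconds} ms"));
    }
}
```

Assuming `channel` is an open `IModel` and `body` is a `byte[]`, the publish call would then be wrapped like this: `ResilientPublisher.CreateRetryPolicy().Execute(() => channel.BasicPublish("my-exchange", "my-key", false, null, body));` (the exchange and routing key names are placeholders). Any exception type not listed in the policy propagates immediately without a retry.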
