I'm encountering an issue in a Go application that uses the official MongoDB Go driver. Our use case involves a replica set that is recreated from scratch. The replica set consists of 2 members with no arbiter, and we use a non-SRV connection string. Even after the replica set is back up and healthy and there are no network issues, the Go app fails to connect, reconnect, or rediscover it, and appears to get stuck with the following error:

server selection error: context deadline exceeded, current topology: { Type: ReplicaSetNoPrimary, Servers: [] }

I found this thread that seems related:

When a driver has completely lost connection to a replica set, there are two possible circumstances:

  1. The replica set is still there, but there is some network interruption.
  2. All replica set nodes have moved to a totally new set of hosts and/or IPs in a short period of time and might be rediscoverable with the connection string.

Drivers could simultaneously attempt to connect to the last known MongoDB replica set and re-initialize using the connection string to see which succeeds first. However, that may not always be the best behavior for all use cases, so we have historically assumed case #1 (the more common case) and required users to implement their own recovery logic for case #2.
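The "own recovery logic for case #2" mentioned in the quote could take roughly the following shape: after a number of consecutive failed health checks, discard the client and build a new one from the original connection string. This is only a minimal sketch and not something the driver provides. The `conn` interface, the `reconnector` type, the `dial` callback, and the `fakeConn` used for the simulation are all hypothetical names; in a real application `dial` would wrap `mongo.Connect` and `conn` would be a thin adapter around `*mongo.Client`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a database client. An adapter around
// *mongo.Client (Ping and Disconnect) is assumed in a real application.
type conn interface {
	Ping(ctx context.Context) error
	Close(ctx context.Context) error
}

// reconnector rebuilds its client via dial after `limit` consecutive
// failed health checks.
type reconnector struct {
	mu       sync.Mutex
	dial     func(ctx context.Context) (conn, error)
	current  conn
	failures int
	limit    int
}

// check pings the current client. Once the failure limit is reached it
// closes the stale client and creates a fresh one from scratch.
func (r *reconnector) check(ctx context.Context) error {
	r.mu.Lock()
	defer r.mu.Unlock()

	if r.current != nil {
		err := r.current.Ping(ctx)
		if err == nil {
			r.failures = 0
			return nil
		}
		r.failures++
		if r.failures < r.limit {
			return err
		}
		// Too many failures in a row: assume the old topology is gone.
		_ = r.current.Close(ctx)
		r.current = nil
	}

	c, err := r.dial(ctx)
	if err != nil {
		return err
	}
	r.current, r.failures = c, 0
	return nil
}

// fakeConn simulates a client whose servers can disappear.
type fakeConn struct{ alive bool }

func (f *fakeConn) Ping(context.Context) error {
	if f.alive {
		return nil
	}
	return errors.New("server selection error")
}

func (f *fakeConn) Close(context.Context) error { return nil }

// simulate walks through a "replica set recreated" scenario and reports
// how many times a client was created and the result of the last check.
func simulate() (dials int, lastErr error) {
	var live *fakeConn
	r := &reconnector{limit: 2, dial: func(context.Context) (conn, error) {
		dials++
		live = &fakeConn{alive: true}
		return live, nil
	}}

	ctx := context.Background()
	lastErr = r.check(ctx) // no client yet: dial a first one
	live.alive = false     // the replica set is recreated elsewhere
	lastErr = r.check(ctx) // first failure: error is returned
	lastErr = r.check(ctx) // second failure: client is rebuilt
	return dials, lastErr
}

func main() {
	dials, err := simulate()
	fmt.Println("dials:", dials, "last error:", err)
}
```

Running `check` from a periodic health probe keeps the rebuild off the request path. The failure limit is a trade-off: too low and a short network blip (case #1) triggers a needless rebuild, too high and recovery from case #2 is delayed.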

It seems that the MongoDB Go driver somehow sticks to a replica set and cannot rediscover or reconnect to it if the replica set has changed or been recreated. I haven't found the reason why, nor any option to force a reconnect. So far, the only workaround is to restart the app, which I find a bit odd in the cloud world.

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    const uri = "mongodb://primary:27017,secondary:27017"

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    clientOptions := options.Client().ApplyURI(uri)
    clientOptions.SetReplicaSet("test")
    client, err := mongo.Connect(ctx, clientOptions)
    if err != nil {
        log.Fatal(err)
    }

    defer func() {
        if err = client.Disconnect(ctx); err != nil {
            log.Fatal(err)
        }
        fmt.Println("Connection to MongoDB closed.")
    }()

    err = client.Ping(ctx, nil)
    if err != nil {
        log.Fatal("Could not connect to MongoDB:", err)
    }

    fmt.Println("Successfully connected to MongoDB")

    collection := client.Database("testdb").Collection("items")
    fmt.Println("Collection instance created:", collection.Name())
}

Any idea what could be causing this, or are there recommended patterns to handle it? I suspect that the Go driver's topology sticks to a particular replica set ID unless the app is restarted, but I could not find any information on this.

2 Answers

const uri = "mongodb://primary:27017,secondary:27017"

This means you do not configure a replica set for the driver at all. You should set the replicaSet connection string option.

NOTE: when you configure a replica set, you should not provide all servers explicitly in the connection string. Servers are discovered automatically from the { hello : 1 } command response. Off the top of my head, I'm not sure whether listing them all can cause issues, but I would recommend not doing it and providing only the primary's address.

UPDATE:

The reason your driver can't discover a new server is that the server list in the cluster description is empty; see Servers: [] in the error message.

The logging configuration should show why.

You need:

  • LogComponentServerSelection

  • LogComponentTopology (this may be useful too)

So configure logging for these components and wait until the issue appears again.
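Enabling the two components listed above might look like the following. This is a sketch that assumes a 1.x version of the Go driver recent enough to ship the logging API (`options.Logger`, `SetLoggerOptions`, and the `LogComponentServerSelection` constant); the URI and the replica set name "test" are taken from the question, and the debug level is an assumption about how much detail is needed.

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	// Emit debug-level messages for server selection and topology (SDAM) events.
	loggerOpts := options.Logger().
		SetComponentLevel(options.LogComponentServerSelection, options.LogLevelDebug).
		SetComponentLevel(options.LogComponentTopology, options.LogLevelDebug)

	clientOpts := options.Client().
		ApplyURI("mongodb://primary:27017,secondary:27017/?replicaSet=test").
		SetLoggerOptions(loggerOpts)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	client, err := mongo.Connect(ctx, clientOpts)
	if err != nil {
		log.Fatal(err)
	}
	defer func() { _ = client.Disconnect(context.Background()) }()

	// Any operation triggers server selection, so its log lines appear here.
	if err := client.Ping(ctx, nil); err != nil {
		log.Printf("ping failed: %v", err)
	}
}
```

With no custom sink configured, the driver writes these messages to standard error, so they end up next to the application's own log output.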

Also, it would help if you provided the output of the following command from mongosh (connected with the same connection string, at the time the issue happens):

db.runCommand({ hello : 1 })

It shows the cluster state as seen by another MongoDB client, which confirms whether the servers are actually healthy at the moment the issue occurs.

UPDATE2:

ReplicaSet is recreated from scratch

I missed that you're recreating it from scratch. What is the new replica set's name? How long does the issue last? In any case, please provide the logs requested above and check what result you see with mongosh and the { hello : 1 } command.

3 Comments

Sorry, I left it out of the original code. I have just updated it.
FYI, you made a typo in the variable name. To proceed, you have to provide SDAM (server discovery and monitoring) logs: mongodb.com/docs/drivers/go/current/monitoring-and-logging/…
You need LogComponentServerSelection; LogComponentTopology may be useful too.
package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

// URI with the replica set name specified
const uri = "mongodb://primary:27017,secondary:27017/?replicaSet=rs0"

// Number of attempts and the wait time between attempts
const maxRetries = 5
const retryInterval = 5 * time.Second

func connectMongo() (*mongo.Client, error) {
    clientOptions := options.Client().ApplyURI(uri)

    for attempt := 1; attempt <= maxRetries; attempt++ {
        // Use a fresh timeout per attempt and release it explicitly;
        // a defer inside the loop would pile up until the function returns.
        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)

        client, err := mongo.Connect(ctx, clientOptions)
        if err != nil {
            log.Printf("Attempt %d: Connect error: %v", attempt, err)
        } else {
            // Ping to confirm that the connection actually works
            if err := client.Ping(ctx, nil); err != nil {
                log.Printf("Attempt %d: Ping error: %v", attempt, err)
            } else {
                cancel()
                log.Println("Successfully connected to MongoDB")
                return client, nil
            }
            // Disconnect the client before retrying
            _ = client.Disconnect(ctx)
        }
        cancel()

        log.Printf("Retrying in %v...", retryInterval)
        time.Sleep(retryInterval)
    }

    return nil, fmt.Errorf("could not connect to MongoDB after %d attempts", maxRetries)
}

func main() {
    client, err := connectMongo()
    if err != nil {
        log.Fatal(err)
    }

    defer func() {
        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        defer cancel()
        if err := client.Disconnect(ctx); err != nil {
            log.Fatal(err)
        }
        fmt.Println("Connection to MongoDB closed.")
    }()

    collection := client.Database("testdb").Collection("items")
    fmt.Println("Collection instance created:", collection.Name())
}
