Mongodb and persistent data

Question

Given that MongoDB is a NoSQL database, and that NoSQL databases rely heavily on RAM, I have been wondering what would happen in the following scenario.

Assume that I have MongoDB installed on a server and I am recording payments in a document. For instance:

 {
     UserId: "X-123456",
     //Rest of user data,
     Payments: [
        {
            TransactionId: "X-123456"
            //Rest of payment data
        }
     ]
 }

A user makes a payment and the server receives the successful payment response. A few seconds after the response has been added to the document, the power goes out and the server shuts down. For instance:

 1- Response received at 04.01.01.100
 2- Response added to Mongo Document at 04.01.01.300
 3- Power goes out at 04.01.05.00

What happens to the data in this case? Will it still be available in the User document?

Data loss is always a possibility when a server is not shut down properly. Corruption is even a possibility. But Mongo queues writes in a single thread, so it depends on whether it had caught up. After a few seconds, probably.
– Geoduck, Commented Apr 16, 2018 at 6:12
@GarrGodfrey: That's true, but I want to know exactly how long it takes for Mongo to make the data persistent on the actual disk.
– Arnold Zahrneinder, Commented Apr 16, 2018 at 6:31
" ... reliance of NoSql Dbs on RAM ..." Last time I checked, every top-end SQL database "heavily relies on RAM", and it's kind of inherent to databases in general, since RAM is fast. That said, they all (including MongoDB) have a concept of "checkpointing" where data is persisted to disk. If you are asking about the "internals", then it's not really a programming question and basically off-topic for Stack Overflow. From a programmer's point of view, you should not really care and just accept that it does it. Anything else wanders into the territory of server administration.
– Neil Lunn, Commented Apr 16, 2018 at 7:39
From the pure perspective of "four minutes later", MongoDB has actually persisted to disk within that timeframe. There are things like checkpoints, which occur approximately every 60 seconds, and journaling, which is even more frequent. So it's pretty often, just in case you are confusing "utilizing RAM" with "using exclusively", which is of course not what reliable systems do. System config propeller-heads can read about these in the storage engine documentation: docs.mongodb.com/manual/core/wiredtiger
– Neil Lunn, Commented Apr 16, 2018 at 7:46

Vince Bowdren · Accepted Answer · 2018-04-16 11:04:17Z

Short answer: Yes, the data is still available.

Long answer intro: Yes. On the one hand, MongoDB's journaling system means that the power would have to go out within 50ms of the write for the data to be lost. On the other hand, if you use a stricter writeConcern, you can make sure that data the server has acknowledged is not lost in a power failure.

Here's what goes on with the journaling:

Every change (which might affect none, one, or many documents) going into the MongoDB database is first converted into a sequence of single-document changes, which make up the journal. The journal is stored first in a RAM buffer, and that buffer is written to disk every 50ms. That means that:

If the power goes out within 50ms of your data being written to the journal buffer, the data is lost.
After 50ms, the journal has been written to disk. The server writes all those journal entries into the database proper in batches, about every 60s; so for up to 60s your new data is in the journal, but not yet in the database proper. However, even if the power goes out now, it's fine. The server, when it restarts, will write all those journal entries into the database - i.e. your data has not been lost.

That 50ms window means that there's a small but non-zero risk of losing data.
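To make the timeline concrete, here is a minimal sketch that classifies a crash by how long after the write it happens. It is a simplified worst-case model, not MongoDB's actual logic: the 50ms and 60s figures are the ones quoted in this answer (the real journal flush interval is configurable through `storage.journal.commitIntervalMs`), the `HH.MM.SS.mmm` reading of the question's timestamps is an assumption, and the names `Outcome`, `to_ms` and `classify` are made up for illustration.

```python
from enum import Enum


class Outcome(Enum):
    AT_RISK = "may be lost: journal buffer not yet flushed to disk"
    JOURNALED = "safe: replayed from the on-disk journal at restart"
    CHECKPOINTED = "safe: already written to the data files"


def to_ms(stamp: str) -> int:
    """Convert 'HH.MM.SS.mmm' (assumed reading of the question's notation) to ms."""
    hours, minutes, seconds, millis = (int(part) for part in stamp.split("."))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def classify(write_ms: int, crash_ms: int,
             journal_flush_ms: int = 50, checkpoint_ms: int = 60_000) -> Outcome:
    """Worst case: the write waits a full interval before each flush."""
    elapsed = crash_ms - write_ms
    if elapsed < 0:
        raise ValueError("crash happened before the write")
    if elapsed < journal_flush_ms:
        return Outcome.AT_RISK
    if elapsed < checkpoint_ms:
        return Outcome.JOURNALED
    return Outcome.CHECKPOINTED


written = to_ms("04.01.01.300")   # step 2 in the question
crashed = to_ms("04.01.05.00")    # step 3 in the question
print(crashed - written, classify(written, crashed).name)  # 3700 JOURNALED
```

In the question's scenario the crash comes 3700ms after the write, well past the journal flush but before the next checkpoint, so under this model the payment is recovered from the journal on restart.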

Here's what goes on with the writeConcern:

That risk can be completely eliminated by using a suitable writeConcern such as j:true. That means that, when your update is first received from the client, the server does not send an acknowledgement back to the client until the data has been written to the on-disk journal. That means that, once your client gets a positive acknowledgement from the server, then the data is guaranteed safe.
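As a rough illustration, the sketch below builds the body of an `update` database command that appends a payment and asks for a journaled acknowledgement. The field names come from the question; the collection name `users` and the helper name `journaled_push_command` are assumptions made for the example, and the code only constructs the command document without contacting a server.

```python
import json


def journaled_push_command(user_id, payment, collection="users"):
    """Build an `update` command that appends a payment and waits for the journal.

    `collection` is a hypothetical name; the question does not give one.
    """
    return {
        "update": collection,
        "updates": [
            {
                "q": {"UserId": user_id},               # which user document
                "u": {"$push": {"Payments": payment}},  # append to the array
            }
        ],
        # j: true -> acknowledge only after the on-disk journal has the write
        "writeConcern": {"w": 1, "j": True},
    }


command = journaled_push_command("X-123456", {"TransactionId": "X-123456"})
print(json.dumps(command, indent=2))
```

With a driver such as PyMongo, a document like this could be passed to `db.command(...)`; most drivers also let you set the same write concern on the client or collection instead of per command.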

Clement Amarnath · Accepted Answer · 2018-04-16 07:57:45Z

1- Response received at 04.01.01.100
2- Response added to Mongo Document at 04.01.01.300
3- Power goes out at 04.01.05.00
What happens to the data in this case? Will it still be available in the User document?

When you use an appropriate WriteConcern, MongoDB acknowledges that the data has been stored in the document. In your case, the data will still be available in the User document.

Use w: 1 in the WriteConcern. This option requests acknowledgement that the write operation has propagated to the standalone mongod or to the primary in a replica set. w: 1 was the default write concern for MongoDB when this was written; from MongoDB 5.0 onward the default is w: "majority" in most deployments.
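For a standalone mongod or a primary, what an acknowledgement actually tells you depends on the `j` option as well as on `w`. The sketch below summarises that, as I understand the documented behaviour; the function name `acknowledged_when` is hypothetical and the model deliberately ignores `w: "majority"` and other multi-member settings.

```python
def acknowledged_when(w=1, j=None):
    """Describe what an acknowledgement implies on a standalone or primary.

    Simplified sketch: only w: 0 and w: 1 are covered.
    """
    if w not in (0, 1):
        raise ValueError("this sketch covers only w: 0 and w: 1")
    if j:
        # j: true requests a journaled acknowledgement, even combined with w: 0
        return "written to the on-disk journal"
    if w == 0:
        return "nothing: no acknowledgement requested"
    return "applied in memory"


for options in ({"w": 1}, {"w": 1, "j": True}, {"w": 0}):
    print(options, "->", acknowledged_when(**options))
```

So w: 1 on its own confirms that the server has applied the write, while adding j: true is what ties the acknowledgement to the on-disk journal.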

MongoDB is a CP system in terms of the CAP (Consistency, Availability, Partition tolerance) theorem. By favouring consistency, MongoDB ensures that the data we have saved is returned to us when we query it. More info: Mongodb ACID and CAP theorem

MongoDB is a NoSQL database with high write throughput: it can insert a large number of documents per second. See: A sample statistics on insertion

Please note that the write time of a document in MongoDB depends on:

the size of the document (max size 16MB per document),
the WriteConcern we have specified,
the number of indexes on the collection that need to be updated when the document is inserted, and
the number of replica set members that must acknowledge the write (a replica set in MongoDB is a group of mongod processes that maintain the same data set; replica sets provide redundancy and high availability).
