I would like to know if there is a rule of thumb about when to use a new document and when to use a sub-document. In SQL databases I used to break all relations into separate tables by the rules of normalization and connect them with keys, but I can't find a good approach for what to do in MongoDB (I don't know how other NoSQL databases handle this). Any help will be appreciated. Kind regards.

1 Answer

Though there are no fixed rules, there are some general guidelines that are intuitive enough to follow when modeling data in NoSQL.

  • Nearly all 1-1 cases can be handled with sub-documents. For example: a user has an address. In all likelihood the address is unique to each user (in the context of your system, say a social website), so keeping addresses in another collection would be a waste of space and queries. An address sub-document is the best choice.

  • Another example: hundreds of employees share the same building/address. In this case, embedding the address in each employee document as if it were 1-1 is a poor use of space and will cost you a lot of updates whenever any address changes slightly, because the address is replicated across multiple employee documents as a sub-document. Therefore, an address should have many employees, i.e. a 1-to-many relationship.
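The two cases above can be sketched with plain Python dicts standing in for MongoDB documents. This is only an illustration under assumptions: every field name (`address`, `building_id`, and so on) and every value is hypothetical, and no database driver is involved.

```python
# Plain dicts stand in for MongoDB documents; all names here are hypothetical.

# 1-1: the address belongs to exactly one user, so it is embedded.
user = {
    "_id": "user1",
    "name": "Alice",
    "address": {"street": "1 Main St", "city": "Springfield"},
}

# Shared address: stored once and referenced from each employee document.
building = {"_id": "bldg1", "street": "500 Office Park", "city": "Springfield"}
employees = [
    {"_id": f"emp{i}", "name": f"Employee {i}", "building_id": building["_id"]}
    for i in range(3)
]

# A change to the shared address is one update, not one per employee.
building["street"] = "501 Office Park"

# Resolving the reference: look the building up by the id stored on the employee.
buildings_by_id = {building["_id"]: building}
streets = [buildings_by_id[e["building_id"]]["street"] for e in employees]
```

With the embedded layout, the same change would have to be applied to every employee document that carries a copy of the address.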

You may have noticed that in NoSQL there are multiple ways to represent a 1-to-many relationship.

  1. Keep an array of references. Choose this if you're sure the array won't grow too big and, preferably, the document containing the array is not expected to be updated a lot.
  2. Keep an array of sub-documents. A handy option if the sub-documents don't qualify for a separate collection and you don't run the risk of hitting the 16 MB document size limit. (Thanks greyfairer for the reminder!)
  3. SQL-style foreign key. Use this if 1 and 2 are not good enough, or if you simply prefer this style over them.
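As a rough illustration of the three shapes, here is a sketch using plain Python dicts in place of MongoDB documents. The collections, field names, and the `employees_in` helper are all hypothetical, chosen only to make the layouts concrete.

```python
# Plain dicts stand in for MongoDB documents; all names here are hypothetical.

# Option 1: the parent holds an array of references to child documents.
building_with_refs = {"_id": "bldg1", "employee_ids": ["emp1", "emp2"]}

# Option 2: the parent holds an array of sub-documents.
post_with_comments = {
    "_id": "post1",
    "comments": [
        {"author": "ann", "text": "Nice"},
        {"author": "bob", "text": "+1"},
    ],
}

# Option 3: SQL-style foreign key; each child points back at its parent.
employees = [
    {"_id": "emp1", "building_id": "bldg1"},
    {"_id": "emp2", "building_id": "bldg1"},
    {"_id": "emp3", "building_id": "bldg2"},
]


def employees_in(building_id, docs):
    """Return the ids of the child documents that reference the given parent."""
    return [d["_id"] for d in docs if d["building_id"] == building_id]


in_bldg1 = employees_in("bldg1", employees)
```

In options 1 and 2 the parent document grows with every child, which is why the array size matters; in option 3 the parent never changes, at the cost of a second lookup to find the children.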

"Modeling documents for retrieval", "Document design considerations" and "Modeling Relationships" from Couchbase (another NoSQL database) are really good reads and equally applicable to MongoDB.

3 Comments

In the 1 to many case, you forgot to mention you can use an array of sub-documents also, instead of an array of references. I would prefer to do this if it's a 'composition' relation. Except of course when there's a risk of hitting the 16MB document limit.
First of all, thank you very much for your help :-). I read the links that you posted and they were very helpful. In my app I have an entity named 'Song'. A song can have likes and views. As of today, a song contains 2 arrays (views and likes) of user references. If I understand correctly, since these arrays can be very large and change very rapidly, it is better to create new collections for views and likes, each holding the user reference and the song reference, and put indexes on both of them?
I'd go with views and likes collections, as you said. That's because you never know how many of them a song will get, and keeping user references is a requirement. You should also keep view_count and like_count fields in the song document and increment them whenever a new view or like doc is created. This way you don't need a separate query to get just the count.
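The counter idea from the last comment could look roughly like the sketch below. It uses in-memory Python structures rather than a real database; the `add_like` helper and the one-like-per-user rule are assumptions made for illustration, not something stated in the thread. In MongoDB itself the increment would typically be done with the `$inc` update operator.

```python
# In-memory stand-ins for the `songs` and `likes` collections (hypothetical names).
songs = {
    "song1": {"_id": "song1", "title": "Example", "like_count": 0, "view_count": 0}
}
likes = []


def add_like(song_id, user_id):
    """Create a like doc and bump the song's denormalized like_count.

    Assumption: a user may like a given song only once.
    """
    already_liked = any(
        like["song_id"] == song_id and like["user_id"] == user_id for like in likes
    )
    if already_liked:
        return False
    likes.append({"song_id": song_id, "user_id": user_id})
    songs[song_id]["like_count"] += 1
    return True


add_like("song1", "user1")
add_like("song1", "user2")
add_like("song1", "user1")  # duplicate, ignored under the assumption above
```

Reading `songs["song1"]["like_count"]` then gives the count without scanning the likes, which is the point of keeping the field on the song document.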
