
I have decided to use Cosmos DB to store my users' permission data. Our permissions are very dynamic in nature, so a non-relational database makes sense.

A permission database would be read-heavy and write-light. My main query would return all permissions associated with a particular UserId. I have thought of two potential ways of storing my data:

Option 1: Everything under 1 collection

{
    "Id": "975808fb-fdd4-47b1-9787-254eb024203e",
    "UserId":"9941",
    "UserName":"josh.lefebvre ",
    "Email":"[email protected]",
    "PasswordHash":"XXXX",
    "UserPermissions":[
        {"ProgramName":"Program1", "Permissions":[
            {"Permission":"SavePermission"},
            {"Permission":"FirmPermission", "Market": "Chicago"},
            {"Permission":"AddPermission", "Warehouse": "Dallas-Warehouse"} 
        ]},
        {"ProgramName":"Program2","Permissions":[
            {"Permission":"FirmPermission", "Market": "Chicago"}
        ]}
    ]
}

The main drawback is that I don't know what I would use as my partition key. However, if I had under 100k users, would partitioning really make a huge difference?

Option 2: Using 2 collections

Collection 1: Users

{
    "Id": "975808fb-fdd4-47b1-9787-254eb0242031",
    "UserId":"9941",
    "UserName":"josh.lefebvre ",
    "Email":"[email protected]",
    "PasswordHash":"XXXX"
}

Collection 2: UserPermissions:

{
    "Id": "975808fb-fdd4-47b1-9787-254eb0242032",
    "UserId":"9941",
    "Permission":{
      "ProgramName":"Program2",
      "Name":"SavePermission", 
      "Market": "Mexico"
     }
},
{
    "Id": "975808fb-fdd4-47b1-9787-254eb0242033",
    "UserId":"9941",
    "Permission":{
      "ProgramName":"Program2",
      "Name":"FirmPermission", 
      "Market": "Chicago"
    }
}

The main drawback is that thousands of permissions may be assigned to a single user, so each lookup would have to read a lot of records. However, at least I could partition my data cleanly on UserId.
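To make the difference between the two layouts concrete, here is a minimal sketch that turns one Option 1 document into the per-permission documents of Option 2. The function name flatten_permissions is made up for illustration, and the Ids are freshly generated UUIDs rather than the ones shown above; it runs on plain dictionaries and does not talk to Cosmos DB.

```python
import uuid


def flatten_permissions(user_doc):
    """Turn one Option 1 user document into a list of Option 2 permission documents."""
    docs = []
    for program in user_doc.get("UserPermissions", []):
        for perm in program.get("Permissions", []):
            # Everything except the permission name (Market, Warehouse, ...) is copied as-is.
            extras = {k: v for k, v in perm.items() if k != "Permission"}
            docs.append({
                "Id": str(uuid.uuid4()),  # hypothetical: any unique id would do
                "UserId": user_doc["UserId"],
                "Permission": {
                    "ProgramName": program["ProgramName"],
                    "Name": perm["Permission"],
                    **extras,
                },
            })
    return docs


# The Option 1 sample from the question, reduced to the fields that matter here.
option1_doc = {
    "UserId": "9941",
    "UserPermissions": [
        {"ProgramName": "Program1", "Permissions": [
            {"Permission": "SavePermission"},
            {"Permission": "FirmPermission", "Market": "Chicago"},
            {"Permission": "AddPermission", "Warehouse": "Dallas-Warehouse"},
        ]},
        {"ProgramName": "Program2", "Permissions": [
            {"Permission": "FirmPermission", "Market": "Chicago"},
        ]},
    ],
}

permission_docs = flatten_permissions(option1_doc)
```

One embedded document becomes one small document per permission, which is why Option 2 reads many records for a user with many permissions.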

My question: for the query I am trying to run (get all permissions associated with a particular user by UserId), which option would perform best?

  • Unfortunately there's really no "right" answer to this. This is something you'd need to benchmark, performance-wise, to see what the RU cost would be for such queries. As far as data modeling, again that's very broad and subjective, but do note that if you really do have thousands of permissions, you could run the risk of exceeding the size of a single document (an unbounded array issue). You'll want to consider that. Commented Nov 24, 2019 at 17:47
  • Yes, that is a very likely possibility in my scenario. If I exceed this threshold, I read in Microsoft's official documentation that they recommend splitting the document, e.g. the first 100 permissions in one document, then a second document for the next 100, and so on. I'm likely going to go with option 1, knowing that we do not have a lot of users in the system. Commented Nov 24, 2019 at 23:08
  • Doing your own document-level partitioning (e.g. splitting content across documents, 100 apiece), to me, is an anti-pattern to avoid. That type of data model will cause you to implement lots of custom logic to deal with such a thing. Commented Nov 25, 2019 at 1:38
  • I don't think it would necessarily be an anti-pattern, but you are correct that it would require some custom logic. That is a tradeoff I would be willing to make if it meant better throughput. Microsoft has documentation on when data should be embedded (denormalized): learn.microsoft.com/en-us/azure/cosmos-db/modeling-data Commented Nov 25, 2019 at 2:57
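The custom logic debated in these comments can be sketched roughly as follows. This is only an illustration of splitting a permission list across documents of 100 entries and stitching it back together; the ChunkIndex field and both function names are assumptions introduced here, not something from the question or from Microsoft's documentation.

```python
def split_permissions(user_id, permissions, chunk_size=100):
    """Split one user's permission list into documents of at most chunk_size entries."""
    docs = []
    for start in range(0, len(permissions), chunk_size):
        docs.append({
            "UserId": user_id,
            "ChunkIndex": start // chunk_size,  # hypothetical ordering field
            "Permissions": permissions[start:start + chunk_size],
        })
    return docs


def merge_permissions(chunk_docs):
    """Rebuild the full permission list from its chunk documents, in chunk order."""
    merged = []
    for doc in sorted(chunk_docs, key=lambda d: d["ChunkIndex"]):
        merged.extend(doc["Permissions"])
    return merged


# 250 dummy permissions end up in three documents: 100, 100 and 50 entries.
permissions = [{"Permission": f"Permission{i}"} for i in range(250)]
chunks = split_permissions("9941", permissions)
chunk_sizes = [len(c["Permissions"]) for c in chunks]
```

Every read has to fetch and merge all chunks, and every write has to find the right chunk, which is the extra bookkeeping the anti-pattern comment warns about.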

1 Answer

Option 3:

  • Use only 1 collection
  • Introduce a type property
  • Introduce a synthetic partition key property: partitionKey

You then have two levels at which to split your data: by partition key and by type.

Example

User document:

{
    "Id": "975808fb-fdd4-47b1-9787-254eb0242031",
    "UserId":"9941",
    "UserName":"josh.lefebvre ",
    "Email":"[email protected]",
    "PasswordHash":"XXXX",
    "type": "user",
    "partitionKey":"user_975808fb-fdd4-47b1-9787-254eb0242031"
}

Permissions documents:

{
    "Id": "975808fb-fdd4-47b1-9787-254eb0242032",
    "UserId":"9941",
    "Permission":{
      "ProgramName":"Program2",
      "Name":"SavePermission", 
      "Market": "Mexico"
     },
     "type": "save_permissions",
     "partitionKey":"user_975808fb-fdd4-47b1-9787-254eb0242031"
},
{
    "Id": "975808fb-fdd4-47b1-9787-254eb0242033",
    "UserId":"9941",
    "Permission":{
      "ProgramName":"Program2",
      "Name":"FirmPermission", 
      "Market": "Chicago"
    },
     "type": "firm_permissions",
     "partitionKey":"user_975808fb-fdd4-47b1-9787-254eb0242031"
}

Now you can do:

  • get all permissions for a given user (which only requires specifying the partitionKey of the user whose permissions you want)
  • get the permissions for a specific user (implied by partitionKey) that are of a specific type (that is, a sub-group of permissions)
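The two lookups above can be sketched in plain Python. This is an in-memory stand-in for a single-partition query, not a call to any Cosmos DB SDK, and the function names are made up for illustration. Since the user document lives in the same partition as the permissions, the sketch skips it by its type.

```python
def make_partition_key(user_doc_id):
    """Build the synthetic key in the same user_<Id> shape the example documents use."""
    return f"user_{user_doc_id}"


def permissions_for_user(docs, partition_key, permission_type=None):
    """Filter an in-memory list the way a single-partition query would."""
    matches = []
    for doc in docs:
        if doc["partitionKey"] != partition_key:
            continue  # belongs to another user, so another logical partition
        if doc["type"] == "user":
            continue  # the profile document, not a permission
        if permission_type is not None and doc["type"] != permission_type:
            continue  # caller asked for one sub-group only
        matches.append(doc)
    return matches


pk = make_partition_key("975808fb-fdd4-47b1-9787-254eb0242031")
other_pk = make_partition_key("another-user-id")  # hypothetical second user

docs = [
    {"Id": "975808fb-fdd4-47b1-9787-254eb0242031", "type": "user", "partitionKey": pk},
    {"Id": "975808fb-fdd4-47b1-9787-254eb0242032", "type": "save_permissions",
     "Permission": {"ProgramName": "Program2", "Name": "SavePermission", "Market": "Mexico"},
     "partitionKey": pk},
    {"Id": "975808fb-fdd4-47b1-9787-254eb0242033", "type": "firm_permissions",
     "Permission": {"ProgramName": "Program2", "Name": "FirmPermission", "Market": "Chicago"},
     "partitionKey": pk},
    {"Id": "hypothetical-id", "type": "firm_permissions",
     "Permission": {"ProgramName": "Program1", "Name": "FirmPermission", "Market": "Dallas"},
     "partitionKey": other_pk},
]

all_permissions = permissions_for_user(docs, pk)
firm_only = permissions_for_user(docs, pk, "firm_permissions")
```

Against a real container, the same filter would presumably be expressed as a query along the lines of SELECT * FROM c WHERE c.partitionKey = @pk AND c.type = @type; that query text is an assumption added here, not part of the original answer.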