132

I'm implementing functionality to track which articles a user has read.

  create_table "article", :force => true do |t|
    t.string   "title"
    t.text     "content"
  end

This is my migration so far:

create_table :user_views do |t|
  t.integer :user_id
  t.integer :article_id
end

The user_views table will always be queried on both columns, never only one. My question is what my index should look like. Does the order of the columns matter, and are there any other options I should add? My target DB is Postgres.

add_index(:user_views, [:article_id, :user_id])

Thanks.

UPDATE:
Because only one row with the same values in both columns can exist (the table only records whether user_id has read article_id), should I consider the :unique option? If I'm not mistaken, that means I don't have to do any checking on my own and can simply do an insert every time a user visits an article.

1
  • "The user_views table will always be queried to look for both columns, never only one." -- there will never be a "find all articles that this user has viewed", or "find all users who have viewed this article" query? I find that surprising. Commented Aug 13, 2015 at 13:30

2 Answers

274

The order does matter in indexing.

  1. Put the most selective field first, i.e. the field that narrows down the number of rows fastest.
  2. The index will only be used insofar as you use its columns in sequence starting at the beginning, i.e. if you index on [:user_id, :article_id], you can perform a fast query on user_id or on user_id AND article_id, but NOT on article_id alone.
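To illustrate the second point, here is a minimal sketch in plain Ruby. It is not how Postgres implements indexes; `CompositeIndex` and its methods are hypothetical names that model a composite index as an array sorted by user_id, then article_id. A lookup that starts with the leading column can binary-search the sorted entries, while a lookup on article_id alone gets no help from the sort order and has to examine every entry. A real query planner has more options than this toy model.

```ruby
# Hypothetical in-memory stand-in for a composite index on
# [user_id, article_id]: entries are kept sorted by user_id, then article_id.
class CompositeIndex
  def initialize(rows)
    # Arrays sort lexicographically, so user_id is compared first.
    @entries = rows.sort
  end

  # Lookup by the leading column, or by both columns, using binary search.
  def find_by_prefix(*prefix)
    n = prefix.size
    first = @entries.bsearch_index { |e| (e.first(n) <=> prefix) >= 0 }
    return [] if first.nil?

    @entries.drop(first).take_while { |e| e.first(n) == prefix }
  end

  # Lookup by the trailing column alone: the sort order is of no use,
  # so every entry has to be checked.
  def find_by_article(article_id)
    @entries.select { |(_user_id, a)| a == article_id }
  end
end

index = CompositeIndex.new([[2, 10], [1, 20], [1, 10], [3, 10]])
index.find_by_prefix(1)      # => [[1, 10], [1, 20]]
index.find_by_prefix(1, 20)  # => [[1, 20]]
index.find_by_article(10)    # => [[1, 10], [2, 10], [3, 10]] (full scan)
```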

Your migration add_index line should look something like this:

add_index :user_views, [:user_id, :article_id]

Question regarding 'unique' option

An easy way to do this in Rails is to use validates in your model with scoped uniqueness as follows (documentation):

validates :user, uniqueness: { scope: :article }

4 Comments

The order matters enormously in indexing. Place the where clauses to the left and complete the index with the ordering columns to the right. stackoverflow.com/questions/6098616/dos-and-donts-for-indexes
Note that validates_uniqueness_of (and its cousin, validates uniqueness:) are prone to race conditions.
As mentioned in the comments above and in stackoverflow.com/a/1449466/5157706 and stackoverflow.com/a/22816105/5157706, consider adding a unique index on the database as well.
"Put the most selective field first, i.e. the field that narrows down the number of rows fastest." This is a good heuristic. But I have a case where I have 3 fields that I can search on, but if using the second one then the first is always also provided, and similarly if using the third then the first 2 are always provided. Even though the third one technically narrows the answer the most, because it's only ever provided in the case that 2 other fields are used the index makes sense to be in the order of their prevalence of usage.
37

Just a warning about checking uniqueness at validation time vs. on the index: the latter is done by the database while the former is done by the model. Since there might be several concurrent instances of a model running at the same time, the validation is subject to race conditions, which means it might fail to detect duplicates in some cases (e.g. submitting the same form twice at the exact same time).
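The race can be sketched deterministically in plain Ruby. This is a simulation under stated assumptions, not ActiveRecord code: `PlainTable`, `UniqueTable`, `DuplicateRow` and `simulate_double_submit` are hypothetical names, and the interleaving (both requests run their check before either inserts) is forced by hand rather than produced by real concurrency.

```ruby
require "set"

# Hypothetical in-memory table with no unique index: duplicates are accepted.
class PlainTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Stands in for the SELECT that a model-level uniqueness validation runs.
  def exists?(user_id, article_id)
    @rows.include?([user_id, article_id])
  end

  def insert(user_id, article_id)
    @rows << [user_id, article_id]
  end
end

class DuplicateRow < StandardError; end

# Same table, but with a unique index on [user_id, article_id].
class UniqueTable < PlainTable
  def initialize
    super
    @keys = Set.new
  end

  def insert(user_id, article_id)
    # Set#add? returns nil when the key is already present.
    raise DuplicateRow unless @keys.add?([user_id, article_id])

    super
  end
end

# Two requests interleaved so that both validate before either inserts.
def simulate_double_submit(table)
  checks = 2.times.map { !table.exists?(1, 42) } # both see "no row yet"
  checks.each do |valid|
    next unless valid

    begin
      table.insert(1, 42)
    rescue DuplicateRow
      # The index rejected the duplicate; the view is already recorded.
    end
  end
  table.rows.size
end

simulate_double_submit(PlainTable.new)  # => 2, the validation alone let a duplicate through
simulate_double_submit(UniqueTable.new) # => 1, the unique index rejected the second insert
```

In this model the check passes for both requests, so only the table that enforces uniqueness at insert time ends up with a single row.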

4 Comments

So which one is better? Database side or validates_uniqueness_of?
Both. validates_uniqueness_of can be used to display an error message gracefully in the application, for example when a form gets saved. The database constraint makes sure you don't end up with duplicate records even though you had a validation specified in the model. Plus, you can rescue the ActiveRecord exception and also show a nice message to the user.
@W.M. If you have to pick one, go with the database constraint. This will work even if different, non-RoR applications interact with your data, and ensures consistency for the long term.
Use both. Validations (can) give users a better experience. Constraints save the day.
